praeclarumjj3 / VQ-VAE-on-MNIST

VQ-VAE implementation in PyTorch
19 stars 8 forks

image generation #2

Closed awrm20 closed 3 years ago

awrm20 commented 3 years ago

The quality of the generated images is not good. How can we get exact images when z is passed as an input to the model? Thanks.

praeclarumjj3 commented 3 years ago

Hi @awrm20. Thanks for your question. There are several possible reasons for the poor results when generating from a random noise vector:

  • You can try increasing the dimensionality of the codebook or tuning other hyperparameters to see if performance improves.
  • The MNIST dataset is also small for successful generation from random noise. You can try training on larger datasets such as FFHQ or ImageNet.
  • Try incorporating PixelCNN or PixelRNN into the pipeline, as mentioned in the paper, for better results.

awrm20 commented 3 years ago

For the first option, I tried increasing the dimensions of the codebook; on my custom dataset it didn't produce meaningful results, just noise. Second, my dataset has the same dimensions as MNIST. We can look into the third option for better results.
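For reference, "dimensions of the codebook" can mean two different hyperparameters: the number of entries K and the length D of each entry. The following is a minimal NumPy sketch of the nearest-neighbour quantization step, written only to make that distinction concrete. The function name `quantize`, the variable names, and the sizes K = 8, D = 4 and the 7x7 latent grid are illustrative assumptions and are not taken from this repository.

```python
import numpy as np


def quantize(z_e, codebook):
    """Map each encoder vector to its nearest codebook entry.

    z_e:      array of shape (N, D), encoder outputs
    codebook: array of shape (K, D), embedding vectors
    returns:  (indices of shape (N,), quantized vectors of shape (N, D))
    """
    # Squared Euclidean distance between every z_e row and every codebook row.
    dists = ((z_e[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]


rng = np.random.default_rng(0)
K, D = 8, 4                          # hypothetical codebook size and entry length
codebook = rng.normal(size=(K, D))
z_e = rng.normal(size=(49, D))       # e.g. a 7x7 latent grid, flattened
indices, z_q = quantize(z_e, codebook)
```

Note that the decoder in a VQ-VAE only ever receives rows of the codebook (`z_q` here), so a continuous random vector z is not the kind of input it was trained on.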

praeclarumjj3 commented 3 years ago


Well, it might be a better idea to use a larger dataset. MNIST has only 60k training images (70k including the test set); a dataset of around 250k images should improve performance.

On the other hand, yes, adding PixelCNN or PixelRNN will aid image generation.
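To illustrate where such a prior would sit in the pipeline, here is a rough sketch of sampling a grid of codebook indices one position at a time in raster order. It is not a PixelCNN implementation: `prior_logits` is a hypothetical callable standing in for a trained PixelCNN or PixelRNN, `toy_prior` is an untrained placeholder so the example runs, and K = 8 and the 7x7 grid are assumed sizes.

```python
import numpy as np


def sample_code_grid(prior_logits, height, width, rng):
    """Sample a (height, width) grid of codebook indices in raster order.

    prior_logits(grid, i, j) is a hypothetical stand-in for a trained
    autoregressive prior. It returns unnormalised log-probabilities over
    the K codebook entries for position (i, j), given the positions that
    are already filled in. Unfilled positions hold -1.
    """
    grid = np.full((height, width), -1, dtype=np.int64)
    for i in range(height):
        for j in range(width):
            logits = prior_logits(grid, i, j)
            probs = np.exp(logits - logits.max())  # numerically stable softmax
            probs /= probs.sum()
            grid[i, j] = rng.choice(len(probs), p=probs)
    return grid


K = 8  # hypothetical codebook size


def toy_prior(grid, i, j):
    """Untrained placeholder: favours repeating the left neighbour's code."""
    logits = np.zeros(K)
    if j > 0:
        logits[grid[i, j - 1]] += 2.0
    return logits


rng = np.random.default_rng(0)
codes = sample_code_grid(toy_prior, 7, 7, rng)
```

The sampled `codes` would then be looked up in the codebook and the resulting vectors passed to the decoder, instead of feeding the decoder a random noise vector directly.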