rosinality / vq-vae-2-pytorch

Implementation of "Generating Diverse High-Fidelity Images with VQ-VAE-2" in PyTorch

Training Hyperparameters of PixelSNAIL for VQ-VAE experiments #49

Open fostiropoulos opened 3 years ago

fostiropoulos commented 3 years ago

I am using 4x Nvidia V100 GPUs and cannot get a batch size larger than 32 with the hyperparameters from this paper when training on the top codes. I have also changed the loss to a discretized mixture of logistics, as in the original PixelCNN++ and PixelSNAIL implementations. The authors report a batch size of 1024, which seems impossible to reach. Does this implementation of PixelSNAIL use more layers than the one reported in the VQ-VAE-2 paper?

I am not able to map this implementation onto the one described in the appendix of the VQ-VAE-2 paper, so I cannot configure it to replicate their results. Any help is appreciated.

[attached image: hyperparameter table from the VQ-VAE-2 appendix]
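A note for readers hitting the same memory wall: the standard way to approximate a large effective batch on limited hardware is gradient accumulation. The sketch below is a minimal, hypothetical loop, not this repo's training script; `model` and `loader` are placeholders, and the loss assumes a PixelSNAIL-style prior trained with cross-entropy over discrete top codes. Accumulating 32 micro-batches of 32 emulates the paper's batch of 1024.

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: `model` stands in for this repo's PixelSNAIL
# (its forward returns (logits, cache)); `loader` yields batches of
# discrete top-level code maps of shape [B, 32, 32].
device = "cuda" if torch.cuda.is_available() else "cpu"
accum_steps = 32                        # 32 micro-batches * batch 32 = effective 1024

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
optimizer.zero_grad()
for step, (top, _) in enumerate(loader):
    top = top.to(device)
    out, _ = model(top)                 # logits: [B, n_class, 32, 32]
    loss = F.cross_entropy(out, top)
    (loss / accum_steps).backward()     # scale so gradients match one big batch

    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

This matches a single large-batch gradient only for losses that average over the batch, and it does not reproduce large-batch BatchNorm statistics (not an issue here, since this PixelSNAIL does not use BatchNorm). It also does not reduce wall-clock time per update.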

rosinality commented 3 years ago

Actually, the network used in the paper is much larger than the default model in this implementation.
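For a sense of scale, here is what pushing the default model toward a larger configuration might look like. This assumes the `PixelSNAIL` constructor in this repo's `pixelsnail.py`; the widths and depths are illustrative guesses, not the paper's verified settings (those are in the appendix table above).

```python
from pixelsnail import PixelSNAIL

# Illustrative only, not a verified replication of the paper's top prior.
# Constructor arguments assume this repo's pixelsnail.py.
model = PixelSNAIL(
    shape=[32, 32],     # top-level code map for 256x256 images
    n_class=512,        # codebook size
    channel=512,        # default is 256; the paper's model is wider
    kernel_size=5,
    n_block=4,
    n_res_block=5,      # more residual blocks than the default 4
    res_channel=512,
    attention=True,
    dropout=0.1,
)

# Rough size check before committing GPU hours.
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")
```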

fostiropoulos commented 3 years ago

Yes, that is what I had initially suspected. I can only imagine training such a large model on TPUs. Do you have any insight into how it could have been done?

rosinality commented 3 years ago

They may have used TPUs or a large number of GPUs. In any case, replicating the model training from the paper will be very hard (practically impossible, in fact) with only a few GPUs.
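For context on the hardware requirement: under synchronous data parallelism the per-update batch is `per_gpu_batch * num_gpus * accum_steps`, so the paper's 1024 implies either many devices or heavy accumulation. Below is a minimal `DistributedDataParallel` skeleton with hypothetical `build_model` / `build_dataset` placeholders, not this repo's actual training script.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Minimal skeleton; launch with: torchrun --nproc_per_node=4 train.py
# `build_model` / `build_dataset` are hypothetical placeholders.
dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = DDP(build_model().cuda(), device_ids=[local_rank])
dataset = build_dataset()
sampler = torch.utils.data.distributed.DistributedSampler(dataset)
loader = torch.utils.data.DataLoader(dataset, batch_size=8, sampler=sampler)

# Effective batch per optimizer step: 8 per GPU * world_size * accum_steps.
# At 8/GPU with no accumulation, the paper's 1024 would need 128 GPUs,
# which is why a TPU pod or a large GPU cluster is plausible.
world_size = dist.get_world_size()
```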