rosinality / vq-vae-2-pytorch

Implementation of Generating Diverse High-Fidelity Images with VQ-VAE-2 in PyTorch

PixelSNAIL overfitting issue #66

Open vipul109 opened 3 years ago

vipul109 commented 3 years ago

Hi, first of all, thanks for the implementation.

I have tried to train the PixelSNAIL bottom and top priors on 256-resolution (ImageNet) and 512-resolution (gaming) images, but I found that both models overfit.

Bottom prior: average train accuracy 0.77, validation accuracy 0.67, test accuracy 0.37. The train and validation sets are a 9:1 split of the same dataset of 5k images at 512×512; the test data is another dataset of the same class.

Top prior: average train accuracy 0.97180, validation accuracy 0.88, test accuracy 0.4. The rest of the settings are the same as for the bottom prior.
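To make the two gaps easier to compare, here is a minimal sketch that only does the arithmetic on the numbers reported above. The names `reported` and `generalization_gaps` are made up for illustration and are not part of the repository.

```python
# Accuracies exactly as reported in this comment (train, validation, test).
reported = {
    "bottom": {"train": 0.77, "val": 0.67, "test": 0.37},
    "top": {"train": 0.9718, "val": 0.88, "test": 0.40},
}


def generalization_gaps(acc):
    """Return the train-to-validation and validation-to-test accuracy drops."""
    return {
        "train_minus_val": round(acc["train"] - acc["val"], 4),
        "val_minus_test": round(acc["val"] - acc["test"], 4),
    }


gaps = {name: generalization_gaps(acc) for name, acc in reported.items()}
for name, gap in gaps.items():
    print(name, gap)
```

For both priors the drop from validation to test is much larger than the drop from train to validation (about 0.30 vs 0.10 for the bottom prior, about 0.48 vs 0.09 for the top prior).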

I have tried L2 regularization and dataset augmentation on top of the existing dropout, but without success. Any leads would be helpful. Thanks in advance.

rosinality commented 3 years ago

A large gap between validation and test accuracy is somewhat strange. Anyway, I don't think there are many methods that can reduce overfitting this large.

vipul109 commented 3 years ago

Rosinality, thanks for the reply. Could you point me to public datasets that have worked for you? Did you experiment with the FFHQ dataset?

rosinality commented 3 years ago

I have tried to train on FFHQ.

vipul109 commented 3 years ago

Rosinality, for FFHQ, did you follow the settings mentioned in the paper? What is the best accuracy you achieved?

rosinality commented 3 years ago

No, the model in the paper is too large for my environment. In my case I got 45% training accuracy for the top-level codes.

wwlCape commented 3 years ago

Hi, thank you for your PyTorch version of VQ-VAE-2! If I want to reproduce the results in the paper, what should I do? Just change the overall architecture from bottom + top to bottom + middle + top? Thank you for your reply!

rosinality commented 3 years ago

@wwlCape If you want to try the 1024 model, then you need to use bottom + middle + top models and a larger PixelSNAIL model. But I don't know whether this repository can replicate the results in the paper.
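As a rough sketch of what moving from two to three levels means for the prior models, the snippet below computes the side length of each latent code map from an image resolution. The helper `latent_sizes` and the downsampling factors are assumptions for illustration: the factors are chosen so the three-level case gives 128, 64 and 32, which I believe matches the 1024 setup described in the paper, and they are not read from this repository's code.

```python
def latent_sizes(resolution, downsample_factors):
    """Map each level name to the side length of its latent code map."""
    sizes = {}
    for level, factor in downsample_factors.items():
        if resolution % factor != 0:
            raise ValueError(
                f"resolution {resolution} is not divisible by factor {factor}"
            )
        sizes[level] = resolution // factor
    return sizes


# Assumed total downsampling factors per level (hypothetical values).
two_level = latent_sizes(256, {"bottom": 4, "top": 8})
three_level = latent_sizes(1024, {"bottom": 8, "middle": 16, "top": 32})

print(two_level)
print(three_level)
```

Under these assumptions each level needs its own prior, and the bottom code map of the 1024 model has four times as many positions as the bottom map of the 256 model, which is one reason a larger PixelSNAIL is needed.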

wwlCape commented 3 years ago

OK, thanks for your reply!