openai / glow

Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
https://arxiv.org/abs/1807.03039
MIT License

Model producing blank images with dataset of 128x128 or larger images #98

Open dpmcgonigle opened 4 years ago

dpmcgonigle commented 4 years ago

Hello,

I have tried training several 256x256 datasets with the Glow model, using hyperparameter configurations as close as possible to the CelebA configuration your team used, and all I get are blank images when generating samples at any temperature (standard deviation).
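To be concrete, by "temperature" I mean rescaling the standard deviation of the Gaussian prior before decoding; a minimal sketch, where model.decode and the latent shape are placeholders rather than this repo's exact API:

```python
import numpy as np

def sample_with_temperature(model, latent_shape, temperature=0.7):
    # z ~ N(0, temperature^2 * I): the temperature rescales the prior's std.
    z = temperature * np.random.randn(*latent_shape)
    # Map the latent back to image space (placeholder for the model's decoder).
    return model.decode(z)
```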

For instance, I have LSUN-Tower 256x256 (~708K images) training right now, on epoch 800 (as an epoch is defined by this code base; approaching one full pass through the dataset). For this experiment I am using 6 levels with 32 steps per level, a learning rate of 0.001, and a local_batch_train of 2 images per GPU (using 4 GPUs), which is derived from n_batch 128 using the default anchor of 32; the coupling layer is affine. There is no y-conditioning for these experiments (all images are from the same category), and I'm using n_bits_x of 8.
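For reference, the per-GPU batch size quoted above is consistent with a quadratic anchor scaling of n_batch; this is only my reading of the derivation that reproduces those numbers, not necessarily the exact code in train.py:

```python
# Guess at the batch-size scaling that reproduces the numbers above;
# verify against train.py, this is not necessarily the exact formula.
n_batch     = 128   # n_batch
anchor_size = 32    # default anchor
image_size  = 256   # LSUN-Tower resolution used here

local_batch_train = max(1, int(n_batch * (anchor_size / image_size) ** 2))
print(local_batch_train)  # -> 2 images per GPU, matching the setup above
```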

I have run several other similar experiments with other datasets, with the additive coupling layer, and for longer training times, and all have produced blank image samples. It appears that any dataset larger than 64x64 gives me trouble. Is this something your team has run into with larger images? I'm wondering whether it will eventually "break out" of this issue if I let it run long enough, or whether I need to tweak the hyperparameters.

Thank you very much for your time.

Sincerely, Dan McGonigle

guilherme-pombo commented 4 years ago

Hello, I'm having a very similar problem to @dpmcgonigle, even when using the exact same setup as the original paper. It would be great to get some clarification on the possible causes of this problem. By the way, @dpmcgonigle, have you been able to resolve this in the meantime?

dpmcgonigle commented 4 years ago

@guilherme-pombo, this is something we have not yet resolved. I should have noted that, interestingly, both the training loss and the test loss continue to decrease even while the sampled images are blank. I have also noticed that in some of our experiments the samples are all blank for a number of "epochs" before the model "breaks out" and produces good images, so I'm wondering whether it is just a matter of time before good images appear.

One piece of evidence against this hypothesis, I think, is that the trained model producing all-blank images is not invertible from Z -> X -> Z (a model.decode() followed by model.encode()), although it is invertible in the other direction, X -> Z -> X (model.encode() on a real image followed by model.decode()). I am very curious whether anyone else has insight into this problem.
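For anyone who wants to reproduce those checks, a minimal sketch of the two round trips, assuming a model object that exposes encode()/decode() as mentioned above (names, shapes, and the temperature are placeholders, not this repo's exact API):

```python
import numpy as np

def roundtrip_checks(model, x_real, latent_shape, temperature=0.7):
    # X -> Z -> X: encode a real image, then decode it again.
    z = model.encode(x_real)
    x_rec = model.decode(z)
    print("X -> Z -> X max abs error:", np.abs(x_real - x_rec).max())

    # Z -> X -> Z: decode a sampled latent, then re-encode the generated image.
    z0 = temperature * np.random.randn(*latent_shape)
    x_gen = model.decode(z0)
    z_rec = model.encode(x_gen)
    print("Z -> X -> Z max abs error:", np.abs(z0 - z_rec).max())
```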

dpmcgonigle commented 4 years ago

@guilherme-pombo, just wanted to let you know I've had limited success using:

Hzzone commented 3 years ago

I have successfully reproduced the results shown in the paper on CelebA, using PyTorch. I trained the model with bs=64 for about 10 days. The figure below shows the resulting 256x256 samples. Note that the results would likely be better had I trained on CelebA-HQ, as CelebA is more challenging. In a few months I will release my source code and the pre-trained model for anyone who is interested.

[image: generated 256x256 samples]

LeeeLiu commented 3 years ago

Hello, would you please share your trained checkpoints for 256×256? I am very GPU-hungry. Thanks a lot!

Hzzone commented 3 years ago

I am glad to hear you are interested in the pre-trained model. I have submitted this work elsewhere, so the model will be made public once I release my paper on arXiv. Either way, it should not be long before I know whether the paper is accepted.

Hzzone commented 3 years ago

My paper has been accepted by IJCAI 2021. Please wait for my model!

LeeeLiu commented 3 years ago

Congratulations! Looking forward to your paper link and trained model.

By the way, is the reversibility of your Glow model good? That is, when Glow runs inference z -> X -> z', are z and z' exactly the same? (In the official OpenAI code, the reversibility does not seem to be good.)

Hzzone commented 3 years ago

If you want to reconstruct the exact inputs, you need to gather the output z from the different scales, not just the latent from the final level.
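In other words, a multi-scale flow like Glow splits off part of z at every level, so decoding from only the final-level latent (with the split-off parts resampled) cannot reproduce the original input exactly. A toy sketch of the idea, not the actual Glow implementation:

```python
import numpy as np

N_LEVELS = 2  # toy depth; Glow uses more levels

def encode(x):
    """Toy multi-scale flow: each level applies an invertible map, then
    splits half of the dimensions off into the list of latents."""
    zs, h = [], x
    for _ in range(N_LEVELS):
        h = 2.0 * h + 1.0              # stand-in for an invertible flow step
        z_split, h = np.split(h, 2)    # half goes straight into the latent list
        zs.append(z_split)
    zs.append(h)                       # latent from the final level
    return zs

def decode(zs):
    """Exact inverse: consumes the per-level latents in reverse order."""
    h = zs[-1]
    for level in reversed(range(N_LEVELS)):
        h = np.concatenate([zs[level], h])
        h = (h - 1.0) / 2.0            # inverse of the toy flow step
    return h

x = np.arange(8, dtype=np.float64)
zs = encode(x)
assert np.allclose(decode(zs), x)      # exact only because *all* zs are kept

# Resampling the intermediate latents (keeping only the final one) breaks exactness:
zs_resampled = [np.random.randn(*z.shape) for z in zs[:-1]] + [zs[-1]]
print(np.abs(decode(zs_resampled) - x).max())  # generally large
```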

Hzzone commented 3 years ago

The pre-training code and the pre-trained models for image size 256 are now available at https://github.com/Hzzone/AgeFlow.