stsavian opened this issue 1 year ago
There are many more considerations involved in making these results work on a large-scale dataset like CelebA.
Please don't expect it to work stably on anything beyond the MNIST and perhaps CIFAR10 examples. I'm wondering why CIFAR10 didn't work for you... you should be able to reproduce the results in the README...
Stuff you should probably do to get this code to work beyond toy datasets:
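For example, one ingredient that larger diffusion codebases (guided_diffusion among them) rely on is an exponential moving average (EMA) of the model weights, which is what you sample from instead of the live weights. A minimal sketch, assuming a PyTorch setup; the names here are placeholders, not this repo's API:

```python
import copy
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for the DDPM's UNet
ema_model = copy.deepcopy(model).requires_grad_(False)

@torch.no_grad()
def update_ema(ema_model, model, decay=0.9999):
    # Shadow weights track a slow exponential average of the live weights;
    # samples are drawn from ema_model, not from model.
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

# In the training loop, call update_ema(ema_model, model) after each
# optimizer.step(), and generate samples from ema_model.
```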
@cloneofsimo thanks so much for your advice!
I couldn't reproduce the results for CIFAR10. Strangely, running the code from this repo gives me only noise images (I'm still trying to find the issue). Given that it works for MNIST, and the only difference is the UNet, I'd guess it might be something there...
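One way to narrow this down might be to sanity-check the forward process independently of the UNet: with a correct schedule and inputs normalized to [-1, 1], x_t should interpolate smoothly between the image and pure Gaussian noise. A sketch assuming the standard DDPM linear schedule (these values are illustrative, not necessarily this repo's):

```python
import torch

# Assumed DDPM linear schedule: T = 1000, betas from 1e-4 to 0.02.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphabar = torch.cumprod(1.0 - betas, dim=0)

def q_sample(x0, t, noise):
    # Closed-form forward process:
    # x_t = sqrt(alphabar_t) * x0 + sqrt(1 - alphabar_t) * noise
    ab = alphabar[t].view(-1, 1, 1, 1)
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

x0 = torch.rand(1, 3, 32, 32) * 2 - 1  # fake image in [-1, 1]
for t in (0, 250, 500, 999):
    xt = q_sample(x0, torch.tensor([t]), torch.randn_like(x0))
    print(t, xt.mean().item(), xt.std().item())  # should drift toward N(0, 1)
```

If x_t already looks like pure noise at small t, the schedule or the input normalization is the likely culprit rather than the UNet.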
Dear @cloneofsimo and @SilenceMonk, thanks so much for this code! It is very helpful and precisely the missing piece I needed to understand diffusion models better. I also appreciated that I could run the CIFAR10 training without any code modification. I am playing around with your code to better understand the guided_diffusion repository, which I find too complex and need to simplify.
I have trained on CIFAR10 and obtained the following results after 100 epochs.
As you can see, the prediction quality is still quite far from the ground truth. I plan to extend your code to higher-resolution images; however, I am hesitant for now, as I cannot tell whether the network is actually learning. I would like to extend the code while maintaining convergence.
i) Is this behavior normal? Is there some critical hyperparameter to tune to obtain clearer images?
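For reference (not this repo's defaults, just the commonly cited baseline), the CIFAR10 hyperparameters reported in the original DDPM paper (Ho et al., 2020) are roughly:

```python
# Reference values from the DDPM paper for CIFAR10; treat them as a
# starting point, not as this repo's configuration.
ddpm_cifar10_config = {
    "timesteps": 1000,          # T
    "beta_schedule": "linear",  # beta_1 = 1e-4 up to beta_T = 0.02
    "optimizer": "Adam",
    "lr": 2e-4,
    "batch_size": 128,
    "ema_decay": 0.9999,
    "train_steps": 800_000,
}
```

At batch size 128 over 50k images, 800k steps is on the order of 2,000 epochs, so 100 epochs may simply be far too early to judge sample quality.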
UPDATE: I have trained on CelebA and obtained the following results after 21 epochs (approx. 14 hours on a 3090). The CelebA results already look better than the CIFAR10 ones, but I probably need more training epochs, because the generated images are still far from the ground truth.
Still on the CelebA results, you can see in the following image that some generated images collapse to a constant-color background (below is CelebA after 19 epochs). This issue looks similar to https://github.com/openai/guided-diffusion/issues/81.
Furthermore, training does not progress monotonically: at epoch 22 of CelebA, the network again outputs smooth predictions with no structure.
So overall I am not getting the training stability I was expecting. These results are (unfortunately) consistent with the issues I reported for the guided_diffusion repository: https://github.com/openai/guided-diffusion/issues/42.
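One common guard against this kind of regression is clipping the global gradient norm before each optimizer step, so a single bad batch moves the weights by a bounded amount instead of derailing training. A minimal sketch; `loss_fn` and `model` are placeholders for this repo's DDPM loss and UNet:

```python
import torch

def train_step(model, optimizer, loss_fn, batch):
    optimizer.zero_grad()
    loss = loss_fn(model, batch)  # e.g. the noise-prediction MSE
    loss.backward()
    # Cap the global gradient norm so occasional loss spikes
    # cannot blow up the weights.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    return loss.item()
```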
iii) Do you have any comments that could help overcome this issue?
Thanks again for your help! Stefano