manurare closed this issue 1 year ago
Oh, I don't have enough computational resources to make it converge on CIFAR-10. I never tried iDDPM on ImageNet (32, 64, 128, 224, 256).
For such high-resolution images, I think the better and faster approach is stable diffusion / latent score matching: either VQ-VAE + latent-space diffusion (https://github.com/CompVis/latent-diffusion) or NVAE + latent score matching ("Score-based Generative Modeling in Latent Space", http://arxiv.org/abs/2106.05931).
Again, I can't train on ImageNet, not even a classifier.
I see, thanks. I was able to make it converge at 32x32, but I am not able to do it at 256x256.
Hi,
First of all, thanks for the code! I was wondering if you were able to make it converge on 256x256 images. Specifically, I am using ImageNet (a smaller version of it with 9469 samples). I cannot quite make it generate plausible samples. I am using T=1000, a cosine scheduler, and LR=1e-4 with warm-up. I tried predicting both epsilon and xstart, but both give weird samples, as I attach here. Do you have any tips or suggestions on what could be going wrong or how to improve sample quality? Maybe 1000 timesteps are not enough, but then sampling would be much slower :S
Thanks!
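For reference, the setup described above (T=1000, a cosine schedule, predicting either epsilon or xstart) can be sketched in plain Python roughly as follows. This is only a minimal illustration, not the repository's code: all function names are hypothetical, the offset s=0.008 and the 0.999 cap on beta are the values suggested in the iDDPM paper, and scalars stand in for image tensors.

```python
import math

def cosine_alpha_bar(t, T, s=0.008):
    # Unnormalised cumulative signal level at step t (cosine schedule).
    return math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2

def cosine_betas(T, max_beta=0.999):
    # beta_i = 1 - alpha_bar(i + 1) / alpha_bar(i), capped to avoid a
    # singularity at the last step.
    return [
        min(1 - cosine_alpha_bar(i + 1, T) / cosine_alpha_bar(i, T), max_beta)
        for i in range(T)
    ]

def cumulative_alpha_bar(betas):
    # alpha_bar_t = product of (1 - beta_i) up to and including step t.
    out, prod = [], 1.0
    for b in betas:
        prod *= 1 - b
        out.append(prod)
    return out

def q_sample(x0, eps, ab_t):
    # Forward process: noisy x_t from clean x0 and Gaussian noise eps.
    return math.sqrt(ab_t) * x0 + math.sqrt(1 - ab_t) * eps

def xstart_from_eps(x_t, eps, ab_t):
    # Recover the xstart estimate implied by an epsilon prediction.
    return (x_t - math.sqrt(1 - ab_t) * eps) / math.sqrt(ab_t)

def eps_from_xstart(x_t, x0, ab_t):
    # Recover the epsilon estimate implied by an xstart prediction.
    return (x_t - math.sqrt(ab_t) * x0) / math.sqrt(1 - ab_t)

betas = cosine_betas(1000)           # T=1000 as in the post
alpha_bar = cumulative_alpha_bar(betas)
```

The two prediction targets are algebraically interchangeable given x_t and alpha_bar_t, so converting one into the other is a cheap way to sanity-check the loss and sampler code for either parameterization.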