dome272 opened this issue 2 years ago (status: Open)
This is weird. The (predicted noise -> x_0 -> x_{t-1}) route uses eq. 9, and your implementation uses eq. 10. I've verified that they are mathematically equivalent.
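In standard DDPM notation ($\beta_t = 1-\alpha_t$, $\bar\alpha_t = \prod_{s \le t}\alpha_s$), plugging the predicted $x_0$ back into the posterior mean collapses the two-step route into the direct formula:

$$
x_0 = \frac{1}{\sqrt{\bar\alpha_t}}\left(x_t - \sqrt{1-\bar\alpha_t}\,\epsilon_\theta(x_t, t)\right)
$$

$$
\tilde\mu_t(x_t, x_0) = \frac{\sqrt{\bar\alpha_{t-1}}\,\beta_t}{1-\bar\alpha_t}\,x_0 + \frac{\sqrt{\alpha_t}\,(1-\bar\alpha_{t-1})}{1-\bar\alpha_t}\,x_t
\;=\;
\frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{\beta_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\right)
$$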
I'd suggest checking the first few iterations (largest t) to see if the two routines produce very similar numbers.
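For example, something along these lines (a sketch; the schedule tensors below are stand-ins for whatever the repo actually builds) would show whether the two one-step updates agree numerically:

```python
import torch

# Placeholder noise schedule -- substitute whatever schedule the codebase actually uses.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alphas_hat = torch.cumprod(alphas, dim=0)

def step_direct(x_t, eps, t):
    # Mean of the Algorithm 2 update: x_{t-1} directly from x_t and the predicted noise.
    return (x_t - betas[t] / torch.sqrt(1 - alphas_hat[t]) * eps) / torch.sqrt(alphas[t])

def step_via_x0(x_t, eps, t):
    # "Complicated" route: reconstruct x_0, then take the posterior mean of q(x_{t-1} | x_t, x_0).
    x0 = (x_t - torch.sqrt(1 - alphas_hat[t]) * eps) / torch.sqrt(alphas_hat[t])
    coef_x0 = torch.sqrt(alphas_hat[t - 1]) * betas[t] / (1 - alphas_hat[t])
    coef_xt = torch.sqrt(alphas[t]) * (1 - alphas_hat[t - 1]) / (1 - alphas_hat[t])
    return coef_x0 * x0 + coef_xt * x_t

# Same random stand-ins for x_t and the predicted noise, compared at the largest t.
x_t = torch.randn(4, 3, 32, 32)
eps = torch.randn_like(x_t)
t = T - 1
print(torch.max(torch.abs(step_direct(x_t, eps, t) - step_via_x0(x_t, eps, t))))
```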
Hey @dome272, I am not sure why the code framework here does not work with the equation you referenced from the paper, and I have not had time to look into it in depth. However, someone else developed a really nice Google Colab notebook that implements the DDPM algorithm step by step, similar to this codebase, and they do use the equation from the algorithm you referenced above. I have tested their code myself and it gives good-looking outputs, so it could indicate that some detail is not correct in this implementation of the DDPM paper. Linked below is the Google Colab notebook I referenced; feel free to try/experiment with it yourself.
Hi,
I was wondering why every diffusion model implementation uses this specific sampling procedure. When I look at the DDPM paper, they show the sampling algorithm to be:
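That is, in the paper's notation, the update at each step of Algorithm 2 is:

$$
x_{t-1} = \frac{1}{\sqrt{\alpha_t}}\left(x_t - \frac{1-\alpha_t}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)\right) + \sigma_t z,
\qquad z \sim \mathcal{N}(0, I) \text{ for } t > 1, \text{ else } z = 0
$$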
However, it seems that no implementation follows that; instead they take a rather complicated route of first predicting the noise, then calculating x_0, then the mean and log-variance, and then constructing x_{t-1} from that.
I implemented the above algorithm while using your codebase:
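Roughly like this (a sketch rather than the exact code; `model`, `beta`, `alpha`, `alpha_hat`, and `img_size` are placeholder names, not necessarily this repo's identifiers):

```python
import torch

@torch.no_grad()
def sample_algorithm2(model, n, T, beta, alpha, alpha_hat, img_size=32, device="cuda"):
    # Direct implementation of the Algorithm 2 sampling loop from the DDPM paper.
    # NOTE: argument/tensor names are placeholders for whatever the codebase defines.
    model.eval()
    x = torch.randn(n, 3, img_size, img_size, device=device)  # x_T ~ N(0, I)
    for t in reversed(range(1, T)):
        ts = torch.full((n,), t, device=device, dtype=torch.long)
        eps = model(x, ts)  # predicted noise epsilon_theta(x_t, t)
        z = torch.randn_like(x) if t > 1 else torch.zeros_like(x)
        sigma = torch.sqrt(beta[t])  # sigma_t^2 = beta_t, one of the two choices in the paper
        x = (x - (1 - alpha[t]) / torch.sqrt(1 - alpha_hat[t]) * eps) / torch.sqrt(alpha[t]) + sigma * z
    model.train()
    # Map from [-1, 1] back to displayable pixel values.
    x = (x.clamp(-1, 1) + 1) / 2
    return (x * 255).type(torch.uint8)
```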
But the results are just gray images with a bit of shape and colour (top: the normal sampling, as in your code; bottom: the sampling function above):
Do you have any idea why this kind of sampling does not work?