Thank you!
This is explained in Section 3.3 and Appendix D of the preprint (https://arxiv.org/pdf/2403.14404). You can always add more steps to generate a higher-quality sample. However, this requires additional forward passes, which increase the training time quite a bit (feel free to try this out: simply increase `reduced_ddim_steps`). Remember that we also have to backpropagate through all of these forward passes to obtain the correct gradients. In our investigations, even a two-step DDIM sampling (from t to 1, and from 1 to 0) yielded good results, and the trade-off between the increase in training time (for more sampling timesteps) and the benefit in residual matching was not really worth it.
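For intuition, here is a minimal sketch of what a multi-step, deterministic DDIM estimate of x0 could look like, parameterized by `reduced_ddim_steps`. The names `eps_model` and `alpha_bar`, the evenly spaced sub-schedule, and the function signature are assumptions for illustration only; this is not the actual implementation in `main_toy.py`.

```python
# Sketch only: a multi-step, deterministic (eta = 0) DDIM estimate of x0.
# `eps_model`, `alpha_bar`, and the evenly spaced sub-schedule are assumptions.
import torch

def ddim_sample_x0_multistep(x_t, t, eps_model, alpha_bar, reduced_ddim_steps=2):
    """Run `reduced_ddim_steps` DDIM updates from timestep t down to 0.

    Every eps_model call stays in the autograd graph, so gradients are
    backpropagated through all intermediate steps (the training-time cost
    mentioned above).
    """
    # Coarse sub-schedule from t down to 0. The two-step variant described
    # above goes t -> 1 -> 0; the even spacing here is an assumed simplification.
    timesteps = torch.linspace(float(t), 0.0, reduced_ddim_steps + 1).round().long()
    x = x_t
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        a_cur, a_next = alpha_bar[t_cur], alpha_bar[t_next]
        eps = eps_model(x, t_cur)                                  # predicted noise
        x0_hat = (x - (1.0 - a_cur).sqrt() * eps) / a_cur.sqrt()   # current x0 estimate
        x = a_next.sqrt() * x0_hat + (1.0 - a_next).sqrt() * eps   # DDIM step, eta = 0
    return x  # x0 estimate after the coarse sub-schedule
```

Each extra entry in the sub-schedule adds one more forward (and backward) pass through the noise model, which is where the training-time cost comes from.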
Lastly, we also increase the variance of the residual likelihood for larger timesteps, so that the model is not penalized as heavily on samples that were estimated by rather coarse sampling.
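As a rough illustration of that last point, a Gaussian residual likelihood with a timestep-dependent variance could look like the sketch below. The linear schedule, the `base_sigma` and `growth` parameters, and the function name are all hypothetical; the actual form used in the code may differ.

```python
# Hypothetical sketch: Gaussian residual log-likelihood whose variance grows
# with the timestep t, so residuals from coarse (large-t) estimates are
# penalized less than those from estimates close to t = 0.
import torch

def residual_log_prob(residual, t, T, base_sigma=0.1, growth=1.0):
    # Assumed linear variance schedule in t / T (not the repository's schedule).
    sigma = base_sigma * (1.0 + growth * float(t) / T)
    return torch.distributions.Normal(0.0, sigma).log_prob(residual).sum(dim=-1)
```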
Hope this helps; let me know if anything remains unclear.
Hi, great job and interesting work!
I have a couple of questions regarding the `main_toy.py` script:

1. Why is there only one iteration in the `ddim_sample_x0` function? Since it is supposed to be a sampling process in DDIM, why is there only one iteration?
2. What would happen if more steps were added? Why aren't there more steps?