Closed Li-Tong-621 closed 1 year ago
Hi, thanks for your interest! I believe this should occur only during the early stages of training, and your question likely relates to the training dynamics of diffusion models in general. Perhaps you can monitor the loss in the later stages of training (e.g., the current models are trained for ~2M iterations) and let us know if you still observe this.
Thanks for your reply~ In my observation, in the later stages the loss usually fluctuates within a range, such as between 50 and 200, which may indicate it has converged.
This is generally what I observed as well, and it mostly gave great reconstruction quality (in terms of PSNR/SSIM). So that seems consistent with your conclusion.
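For anyone hitting the same question: one simple way to decide when a loss that bounces around in a band (e.g., 50 to 200) has plateaued is to smooth it with an exponential moving average (EMA) and check whether the smoothed value is still drifting. This is just a generic sketch, not code from this repo; the function name, `alpha`, `window`, and `tol` values are all illustrative choices:

```python
def ema_converged(losses, alpha=0.01, window=1000, tol=0.15):
    """Heuristic convergence check for a noisy training loss.

    Smooths the per-step losses with an EMA (smoothing factor `alpha`),
    then returns True if the EMA changed by less than `tol` (relative)
    over the last `window` steps. Single huge-loss steps barely move
    the EMA, so spikes don't trigger false "not converged" signals.
    """
    ema = losses[0]
    history = [ema]
    for x in losses[1:]:
        ema = (1 - alpha) * ema + alpha * x
        history.append(ema)
    if len(history) <= window:
        return False  # not enough steps to judge yet
    old, new = history[-window - 1], history[-1]
    return abs(new - old) / max(abs(old), 1e-12) < tol
```

On a loss that fluctuates between 50 and 200 but no longer trends downward, the EMA stays roughly flat and the check passes; on a loss that is still steadily decreasing, it does not.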
Thank you again, and congratulations on your paper's acceptance!
Thanks for your interesting work! I find that the loss fluctuates violently, so I would like to know how we can tell that the network has converged and stop training. (Maybe some steps perform so poorly that the loss becomes huge?)