Closed KKN18 closed 2 weeks ago
I guess it's because SVD's training framework is not DDPM-based. In DDPM the forward pass is as you mentioned. SVD adopts the EDM framework (https://arxiv.org/pdf/2206.00364.pdf), whose forward process is different from DDPM's. But I'm not familiar with EDM either, so this is just a guess.
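To make the contrast concrete, here is a rough sketch of the two forward processes as I read them from the EDM paper. It is not taken from the SVD code. The function names are made up for illustration, and the defaults `p_mean=-1.2` and `p_std=1.2` are the EDM paper's training values, which SVD may well override. The main point is that in EDM the signal is not scaled (effectively $\alpha = 1$) and $\ln \sigma$ is drawn from a normal distribution, so $\sigma$ itself is log-normal.

```python
import math
import random


def sample_sigma_lognormal(rng, p_mean=-1.2, p_std=1.2):
    """Draw one noise level with ln(sigma) ~ N(p_mean, p_std^2).

    Defaults are the EDM paper's training values (an assumption here;
    SVD's own config may use different numbers).
    """
    return math.exp(rng.gauss(p_mean, p_std))


def ddpm_style_forward(x, alpha, sigma, rng):
    """DDPM-style noising: alpha * x + sigma * eps, with eps ~ N(0, 1)."""
    return [alpha * xi + sigma * rng.gauss(0.0, 1.0) for xi in x]


def edm_style_forward(x, sigma, rng):
    """EDM-style noising: x + sigma * eps (the signal is left unscaled)."""
    return [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5, -0.25, 1.0]          # toy "sample", flattened to a list
    sigma = sample_sigma_lognormal(rng)
    noisy = edm_style_forward(x, sigma, rng)
    print(f"sigma = {sigma:.4f}")
    print("noisy =", noisy)
```

Sampling $\sigma$ this way concentrates training on mid-range noise levels rather than spreading it uniformly over timesteps, which is the motivation the EDM paper gives for the log-normal choice.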
Hi,
While exploring diffusion models, I noticed the standard forward pass often uses the formula $\alpha \cdot x + \sigma \cdot \epsilon$. However, in your video diffusion model code, I saw a different approach:
You're sampling noise levels from a log-normal distribution, and I'm curious about the reasoning behind this choice. If any papers or references guided this decision, could you share them?
Thanks for your insights!