willisma / SiT

Official PyTorch Implementation of "SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers"
https://scalable-interpolant.github.io/
MIT License

The number of function evaluation (NFE) in ODE mode #21

Open LucasYFL opened 2 months ago

LucasYFL commented 2 months ago

Hi, I realized that there is no way to specify the NFE in the code. The num-sampling-steps argument is only used to build the time grid passed to odeint, and per the torchdiffeq FAQ ( https://github.com/rtqichen/torchdiffeq/blob/cae73789b929d4dbe8ce955dace0089cf981c1a0/FAQ.md ), that grid has no effect on how the ODE is solved.

xmhGit commented 2 months ago

Yes, I agree. Sampling in ODE mode is also extremely fast, which could make it a very powerful method for fast sampling. Did you notice that in your implementation? And is there a way to find out the resulting NFE in ODE mode, even if we cannot specify it?

xmhGit commented 2 months ago

I'm guessing that using a non-adaptive solver would make it possible to specify the NFE, but I haven't tested it.

xmhGit commented 2 months ago

By the way, when I use the SDE solver, the output array is full of NaN values. In contrast, the results from the ODE solver are reasonable.

LucasYFL commented 2 months ago

To know the exact NFE, you can add a counter to the model's forward function. I guess a fixed-grid solver would let you specify the NFE. Alternatively, a simple DDIM sampler could work as well. Still, I would love to see the author's reply.
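
For illustration, the counter idea could look roughly like the sketch below. This is not code from the SiT repository: `CountingField` and `euler_solve` are hypothetical names, and a toy scalar ODE stands in for the model. The wrapper counts every call to the velocity function, and a fixed-grid Euler loop shows that the NFE then equals the number of steps.

```python
class CountingField:
    """Wrap a velocity function f(t, x) and count how often it is called.

    Hypothetical helper for illustration; any callable works, so the same
    idea should apply to a model's forward function.
    """

    def __init__(self, fn):
        self.fn = fn
        self.nfe = 0  # number of function evaluations so far

    def __call__(self, t, x):
        self.nfe += 1
        return self.fn(t, x)


def euler_solve(f, x0, t0, t1, num_steps):
    """Fixed-grid Euler integration: exactly one call to f per step."""
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    for _ in range(num_steps):
        x = x + h * f(t, x)
        t = t + h
    return x


# Toy ODE dx/dt = -x (an assumption, standing in for the learned velocity).
field = CountingField(lambda t, x: -x)
x1 = euler_solve(field, 1.0, 0.0, 1.0, num_steps=50)
print(field.nfe)  # 50: one evaluation per Euler step
```

With an adaptive solver the same wrapper would report whatever number of evaluations the solver chose, which is how the NFE can be measured even when it cannot be set.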

xmhGit commented 2 months ago

I agree. It seems SiT hasn't received as much attention as DiT, but SiT works better than DiT on my own data.

willisma commented 2 months ago

Thanks for the discussion. We used a self-implemented second-order Heun ODE solver for the numbers reported in the paper, fixed at exactly 125 steps (so 250 NFEs, since Heun evaluates the model twice per step). We observed similar performance with a vanilla Euler (DDIM) sampler and a black-box dopri5.
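
The relationship between steps and NFE can be checked with a minimal fixed-step Heun sketch. This is not the authors' solver: `heun_solve` is a hypothetical name and the toy ODE dx/dt = -x is an assumption used only to keep the example self-contained. Each step evaluates the velocity twice (a predictor and a corrector), so 125 steps give 250 evaluations.

```python
def heun_solve(f, x0, t0, t1, num_steps):
    """Fixed-step second-order Heun integration.

    Returns the final state and the number of calls made to f.
    Hypothetical sketch, not the solver used for the paper.
    """
    h = (t1 - t0) / num_steps
    x, t = x0, t0
    nfe = 0
    for _ in range(num_steps):
        k1 = f(t, x)               # slope at the start of the step
        k2 = f(t + h, x + h * k1)  # slope at the Euler-predicted endpoint
        nfe += 2
        x = x + 0.5 * h * (k1 + k2)  # average the two slopes
        t = t + h
    return x, nfe


# Toy ODE dx/dt = -x, integrated from t=0 to t=1 in 125 steps.
x1, nfe = heun_solve(lambda t, x: -x, 1.0, 0.0, 1.0, num_steps=125)
print(nfe)  # 250: two evaluations per step
```

Because the step count is fixed in advance, the NFE here is determined by `num_steps` alone, unlike with an adaptive solver such as dopri5.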