jabir-zheng / TCD

Official Repository of the paper "Trajectory Consistency Distillation"
https://mhh0318.github.io/tcd
318 stars 14 forks

Number of ODE Solver Steps in TCD training #11

Open ZacharyNovack opened 8 months ago

ZacharyNovack commented 8 months ago

Hi, awesome work! I have a question about the distillation algorithm for TCD (Algorithm 1/2 in the paper, particularly w.r.t. Eq. 21). In the original LCM paper (to my understanding), the skipping step $k$ denotes the size of the single ODE-solver step that the teacher model uses to solve from $t_{n+k}$ to $t_n$ (e.g. solving from 950 to 930 in a single step of size $k=20$). However, the paragraph before Eq. 21 notes that $\Phi^{k}$ denotes $k$ "discretization steps" of a one-step ODE solver. Thus, my question is: do you use multiple calls of the ODE solver (with the teacher model) to solve to some timestep between $t_{n+k}$ and $t_m$ (e.g. solving the integral with 2 single-step ODE solves, and thus two calls to the teacher model), or are you still using only a single ODE solver call across that interval (similar to LCM)? If you use multiple calls, how many? Thank you!

mhh0318 commented 8 months ago

Hi, sorry for the late reply. Here we use a single solver call for $\Delta k$. We tried multiple settings for $k$ and found that $k=20$ or $k=50$ works better.
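
For readers following the thread, here is a minimal sketch of the difference between the two readings discussed above: one teacher call covering the whole skip of size $k=20$ (950 to 930), versus the same interval split into two sub-steps. It is not taken from the TCD repository. The DDIM-style update, the toy cosine schedule `alpha_bar`, the placeholder `ToyTeacher`, and the helper `solve` are all assumptions made up for illustration, and the state is a scalar to keep the example dependency-free.

```python
import math

T = 1000  # number of training timesteps (assumed for illustration)


def alpha_bar(t):
    """Toy cosine-style cumulative signal level; hypothetical, not TCD's schedule."""
    return math.cos(0.5 * math.pi * t / (T + 1)) ** 2


class ToyTeacher:
    """Hypothetical stand-in for the teacher noise predictor; counts its calls."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x, t):
        self.calls += 1
        return 0.1 * x  # arbitrary placeholder prediction


def ddim_step(x, t_from, t_to, teacher):
    """One deterministic DDIM-style update; uses exactly one teacher call."""
    a_from, a_to = alpha_bar(t_from), alpha_bar(t_to)
    eps = teacher(x, t_from)
    # Predicted clean sample implied by the noise estimate at t_from.
    x0 = (x - math.sqrt(1.0 - a_from) * eps) / math.sqrt(a_from)
    # Re-noise the prediction to the target timestep t_to.
    return math.sqrt(a_to) * x0 + math.sqrt(1.0 - a_to) * eps


def solve(x, t_start, t_end, teacher, num_calls=1):
    """Solve from t_start down to t_end in num_calls equal sub-steps."""
    step = (t_start - t_end) / num_calls
    t = t_start
    for i in range(num_calls):
        t_next = t_start - step * (i + 1)
        x = ddim_step(x, t, t_next, teacher)
        t = t_next
    return x


# Single call across the whole skip (the setting confirmed in the reply above).
single = ToyTeacher()
x_single = solve(1.0, 950, 930, single, num_calls=1)

# Two sub-steps over the same interval (the alternative asked about).
double = ToyTeacher()
x_double = solve(1.0, 950, 930, double, num_calls=2)

print(single.calls, double.calls)
```

With `num_calls=1` the cost is one teacher forward pass per training sample regardless of $k$; splitting the interval multiplies the teacher cost by the number of sub-steps.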