sony / ctm


Some questions regarding the implementation details of CTM #3

Closed by ArmeriaWang 7 months ago

ArmeriaWang commented 8 months ago

Thank you for your outstanding work! I have two questions regarding the implementation details of CTM and would appreciate your insights:

  1. Calculation of $\texttt{Solver}(x_t, t, u; \phi)$. In models like EDM, sampling runs over a discrete set of timesteps, as given by equation (5) in the EDM paper [1]. If $t$ and $u$ are sampled from a continuous domain, it is not obvious how to plug them into EDM's sampling function. How do you bridge this gap? Or is it more appropriate to draw $t$ and $u$ directly from a discrete timestep grid for the $\texttt{Solver}$?

  2. Incorporation of a GAN discriminator. In some toy CTM experiments, we found that the discriminator sometimes converges too quickly, leaving the generator with vanishing gradients. Could you share any experience or advice on balancing the GAN loss against the other loss components?
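To make both questions concrete, here is a minimal sketch of what I have in mind. It is not the CTM authors' implementation. For question 1, it assumes a Heun solver on the EDM probability-flow ODE $dx/dt = (x - D(x, t))/t$, with an EDM-style $\rho$-spaced sub-grid rebuilt between arbitrary continuous endpoints $t > u \ge 0$. For question 2, it assumes a VQGAN-style adaptive weight (ratio of gradient norms), which is a common heuristic and may differ from what CTM uses. The names `time_grid`, `heun_solve`, `adaptive_gan_weight`, and the `denoiser` callable are all hypothetical.

```python
import math


def time_grid(t, u, n_steps, rho=7.0):
    """Rho-spaced grid from t down to u (assumed: EDM-style spacing
    applied to arbitrary continuous endpoints, with t > u >= 0)."""
    a, b = t ** (1.0 / rho), u ** (1.0 / rho)
    return [(a + i / n_steps * (b - a)) ** rho for i in range(n_steps + 1)]


def heun_solve(x, t, u, denoiser, n_steps=4, rho=7.0):
    """Integrate dx/dt = (x - D(x, t)) / t from time t to time u.

    `denoiser(x, t)` stands in for the pretrained model D(x, t; phi).
    Works on floats or on array types that support arithmetic.
    """
    ts = time_grid(t, u, n_steps, rho)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        d_cur = (x - denoiser(x, t_cur)) / t_cur
        x_euler = x + (t_next - t_cur) * d_cur
        if t_next > 0:
            # Second-order (Heun) correction with the slope at t_next.
            d_next = (x_euler - denoiser(x_euler, t_next)) / t_next
            x = x + (t_next - t_cur) * 0.5 * (d_cur + d_next)
        else:
            # The slope is undefined at t = 0, so fall back to Euler.
            x = x_euler
    return x


def adaptive_gan_weight(grad_rec, grad_gan, eps=1e-8, max_weight=1e4):
    """Assumed heuristic: scale the GAN loss by the ratio of gradient
    norms (non-GAN loss over GAN loss), e.g. taken at the generator's
    last layer, clipped to avoid blow-up when the GAN gradient vanishes."""
    norm_rec = math.sqrt(sum(g * g for g in grad_rec))
    norm_gan = math.sqrt(sum(g * g for g in grad_gan))
    return min(norm_rec / (norm_gan + eps), max_weight)
```

With a denoiser that always returns zero, the ODE reduces to $dx/dt = x/t$, whose exact solution is $x_u = x_t \cdot u / t$, so `heun_solve(2.0, 80.0, 0.5, lambda x, t: 0.0)` is an easy sanity check. Because the sub-grid is rebuilt for each $(t, u)$ pair, nothing here requires $t$ and $u$ to lie on a fixed global grid; whether that matches the intended $\texttt{Solver}$ is exactly what I am asking.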

BTW, looking forward to your code release!

[1] Karras et al. Elucidating the Design Space of Diffusion-Based Generative Models. 2022

ChiehHsinJesseLai commented 7 months ago

Hello,

Please check our released code!