cientgu / VQ-Diffusion

MIT License

Question about the q_posterior function #13

Closed mikittt closed 2 years ago

mikittt commented 2 years ago

Thank you for releasing the code of this excellent work!

Regarding the function below, I couldn't figure out how the q_pred function is used at L215 and L237.

https://github.com/cientgu/VQ-Diffusion/blob/37bbcccdd4aef1794dac645128d864a9f69ed985/image_synthesis/modeling/transformers/diffusion_transformer.py#L206

I understand q_pred as a function that takes an initial state and a timestep and returns the noised distribution at that time. However, the q_pred call at L215 receives log_x_t instead of log_x_start, even though the comment says it returns q(xt|x0). I would also be grateful if you could tell me which equation the q_pred call at L237 corresponds to.
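For readers with the same confusion: in discrete/multinomial diffusion, q_pred(log_x_start, t) computes log q(x_t | x_0) entirely in log space. A minimal NumPy sketch of that idea, assuming a plain uniform-transition chain (the actual repo additionally handles a [MASK] absorbing state, which this omits; all names here are illustrative, not the repo's):

```python
import numpy as np

def log_add_exp(a, b):
    # Numerically stable log(exp(a) + exp(b)).
    m = np.maximum(a, b)
    return m + np.log(np.exp(a - m) + np.exp(b - m))

def q_pred(log_x_start, log_cumprod_alpha_t, num_classes):
    # log q(x_t | x_0) for a uniform-transition chain:
    #   q(x_t | x_0) = abar_t * x_0 + (1 - abar_t) / K,
    # evaluated without ever leaving log space.
    log_one_minus = np.log1p(-np.exp(log_cumprod_alpha_t))
    return log_add_exp(
        log_x_start + log_cumprod_alpha_t,
        log_one_minus - np.log(num_classes),
    )

# One-hot x0 over K = 4 classes, with abar_t = 0.5.
K = 4
log_x0 = np.log(np.clip(np.eye(K)[0], 1e-30, 1.0))
probs = np.exp(q_pred(log_x0, np.log(0.5), K))
# The original class keeps abar_t of its mass; the rest spreads uniformly.
```

Here probs comes out as 0.5 + 0.5/4 for the original class and 0.5/4 for each other class, which is exactly the "distribution with noise at that time" reading of q_pred.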

cientgu commented 2 years ago

Sorry for the late reply, really busy these days... Line 215 aims to get the transition matrix. Lines 233 to 237 are a computation trick to compute q_posterior: briefly speaking, it first does a normalization, then leverages the q_pred function, and finally re-normalizes to get the result.
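To unpack "normalize, apply q_pred, re-normalize": the posterior marginalized over the predicted x0 is p(x_{t-1}|x_t) = Σ_{x0} q(x_{t-1}|x_t, x0) p(x0|x_t). Because the cumulative transition is linear in its input distribution, the per-class weights w(x0) = p(x0|x_t)/q(x_t|x0) can be normalized, pushed through q_pred at step t-1, and rescaled afterwards. A toy NumPy check that the trick matches the brute-force posterior (uniform-transition chain only; all names are hypothetical, not the repo's):

```python
import numpy as np

def q_bar(ab, K):
    # Cumulative transition matrix: q(x_t = i | x_0 = j) = ab*[i==j] + (1-ab)/K.
    return ab * np.eye(K) + (1.0 - ab) / K

def posterior_direct(p_x0, x_t, alpha_t, ab_tm1, ab_t, K):
    # Brute force: p(x_{t-1}|x_t) = sum_{x0} q(x_{t-1}|x_t, x0) p(x0|x_t),
    # with q(x_{t-1}|x_t, x0) = q(x_t|x_{t-1}) q(x_{t-1}|x0) / q(x_t|x0).
    Q_t, Qb_tm1, Qb_t = q_bar(alpha_t, K), q_bar(ab_tm1, K), q_bar(ab_t, K)
    out = np.zeros(K)
    for x0 in range(K):
        out += p_x0[x0] * Q_t[x_t, :] * Qb_tm1[:, x0] / Qb_t[x_t, x0]
    return out

def posterior_trick(p_x0, x_t, alpha_t, ab_tm1, ab_t, K):
    # The normalize / q_pred / re-normalize trick:
    # weights w(x0) = p(x0|x_t) / q(x_t|x0) are normalized so the (linear)
    # cumulative transition at t-1 can be reused on a proper distribution;
    # the normalizer Z is multiplied back at the end (added, in log space).
    w = p_x0 / q_bar(ab_t, K)[x_t, :]
    Z = w.sum()
    mixed = q_bar(ab_tm1, K) @ (w / Z)   # sum_{x0} q(x_{t-1}|x0) w(x0)/Z
    return q_bar(alpha_t, K)[x_t, :] * mixed * Z

K, x_t = 4, 2
alpha_t, ab_tm1 = 0.9, 0.8
ab_t = alpha_t * ab_tm1                  # consistency: abar_t = alpha_t * abar_{t-1}
p_x0 = np.array([0.4, 0.3, 0.2, 0.1])    # stand-in for the network's p(x0|x_t)
direct = posterior_direct(p_x0, x_t, alpha_t, ab_tm1, ab_t, K)
trick = posterior_trick(p_x0, x_t, alpha_t, ab_tm1, ab_t, K)
```

The two functions agree exactly; in the repo the same steps are carried out in log space, where the division becomes a subtraction and the normalizer becomes a logsumexp.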

mikittt commented 2 years ago

Thank you for the clarification!

cantabile-kwok commented 1 year ago

> Sorry for the late reply, really busy these days... Line 215 aims to get the transition matrix. Lines 233 to 237 are a computation trick to compute q_posterior: briefly speaking, it first does a normalization, then leverages the q_pred function, and finally re-normalizes to get the result.

Sorry for re-opening this, but what did you mean by "Line215 aims to get the transition matrix."? I still don't get it 😥

fido20160817 commented 1 year ago

So, what is the specific theory behind the computation trick? The processing is not easy for me to understand; could you give some clues about what lies behind it? Further, shouldn't it be log_qt = self.q_pred(log_x_start, t) at Line 215?
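For what it's worth, the identity the trick appears to rely on can be written out as follows (my own reading, not the authors'). Starting from Bayes' rule and marginalizing over the predicted x0:

```latex
\[
p(x_{t-1}\mid x_t)
  \;=\; \sum_{\tilde x_0} q(x_{t-1}\mid x_t,\tilde x_0)\, p(\tilde x_0\mid x_t)
  \;=\; \sum_{\tilde x_0} \frac{q(x_t\mid x_{t-1})\, q(x_{t-1}\mid \tilde x_0)}
                               {q(x_t\mid \tilde x_0)}\, p(\tilde x_0\mid x_t).
\]
Let \( w(\tilde x_0) = p(\tilde x_0\mid x_t)/q(x_t\mid \tilde x_0) \) and
\( Z = \sum_{\tilde x_0} w(\tilde x_0) \). Then
\[
p(x_{t-1}\mid x_t)
  \;=\; q(x_t\mid x_{t-1}) \cdot Z \cdot
        \sum_{\tilde x_0} q(x_{t-1}\mid \tilde x_0)\,\frac{w(\tilde x_0)}{Z},
\]
where the inner sum is exactly what \texttt{q\_pred} computes at step \(t-1\)
when fed the normalized distribution \(w/Z\).
```

In log space the division becomes a subtraction, the normalization a logsumexp, and the rescaling an addition of that logsumexp back, which matches the "normalize, apply q_pred, re-normalize" description.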