Closed junhaobearxiong closed 5 months ago
Hi Bear,
Thanks for the interest in the work :) For the value in Table 1 I computed the ELBO with all the constants included, which puts the value on the correct scale. The NLL term doesn't in general relate to the ELBO value itself, even though it is on the correct scale. As a point of interest, in the absorbing state case the ELBO is a weighted sum of NLL values; see the equation near the top of page 25 of my recent paper https://arxiv.org/pdf/2402.04997.pdf (it does say + const there, but the const is actually 0, that's my bad). This doesn't hold in general though.
For the ELBO calculation on images in this repo, I can try to get the code for that and get back to you if you would like to use it for your own experiments.
Andrew
Hi Andrew,
Thank you so much for the response! If it's not too much trouble, it'd be great if you wouldn't mind uploading the code for computing the ELBO. It would be very useful as a metric for experimenting with the framework.
Best, Bear
I have added the code to compute the ELBO and added a note in the readme on how to run it. Please do give it a go!
Thank you so much for providing the script Andrew!
Hi Andrew!
Thank you so much for this great work! We are trying to better understand the continuous time discrete diffusion framework and the evaluations reported in the paper. In Table 1, an ELBO of -3.59 is reported on the test set in bits per dimension for CIFAR10. However, as mentioned in #4, the `neg_elbo` on CIFAR is on the scale of 1e7 due to many dropped constants, and even after scaling by the image dimensions 32 x 32 x 3 is not at the right scale. The `nll` term does look to be on the right scale. We are wondering whether the reported ELBO in the paper is actually just the `nll` term averaged over the test set (hence over all time steps), or there is an implementation of the actual ELBO somewhere?
Thank you so much!
Best, Bear
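For anyone following the scale discussion, here is a minimal sketch of the usual nats to bits-per-dimension conversion, assuming the quantity in hand is a per-image negative ELBO (or NLL) measured in nats. The function name `nats_to_bits_per_dim` and the input value are hypothetical and are not taken from the repo; the repo's own ELBO script may normalise differently.

```python
import math

# CIFAR10 images have 32 x 32 pixels with 3 channels each.
CIFAR10_DIMS = 32 * 32 * 3


def nats_to_bits_per_dim(value_nats: float, num_dims: int) -> float:
    """Convert a per-example quantity in nats to bits per dimension.

    Dividing by ln(2) changes nats to bits; dividing by num_dims
    normalises by the number of dimensions in one example.
    """
    return value_nats / (num_dims * math.log(2))


# Hypothetical input (not a measured result): a value on the order of 1e7,
# as described for `neg_elbo` with constants dropped.
raw_value = 1e7
print(nats_to_bits_per_dim(raw_value, CIFAR10_DIMS))
```

Dividing by the image dimensions alone still leaves a value in the thousands for an input of this size, which matches the observation that the constant-free `neg_elbo` is not on the bits-per-dimension scale.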