Greetings,
I am also currently working on diffusion for text generation. In your paper, you include the PPL of DiffusionLM in your results for comparison. I assume you derived this from the ELBO given by the model's loss, is that right? Could you please share more details of the computation? For example, which loss you used and whether you estimated the token-level or the sequence-level PPL. It would be great if you could share the code for this part as well.
Thank you very much. Your help is appreciated as we would like to cite this method.