openai / improved-diffusion

Release for Improved Denoising Diffusion Probabilistic Models
MIT License

How to calculate NLL (which is mentioned in the paper)? #33

Open Josh00-Lu opened 2 years ago

Josh00-Lu commented 2 years ago

Are there any explanations?

sndnyang commented 2 years ago

I found one related work https://github.com/baofff/Extended-Analytic-DPM the api of nll is https://github.com/baofff/Extended-Analytic-DPM/blob/main/interface/evaluators/dtdpm_evaluator.py#L78 Then it calls the function in https://github.com/baofff/Extended-Analytic-DPM/blob/main/core/diffusion/likelihood.py

But I'm not sure if its evaluation is the same as DDPM/iDDPM

sndnyang commented 2 years ago

Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709

sndnyang commented 2 years ago

> Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709

It starts from guided_diffusion/scripts/image_nll.py

This is easy to follow and the result makes sense (close to the values reported in the paper)
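
As far as I can tell, that script mainly calls the diffusion object's calc_bpd_loop on each batch and averages the per-image totals. A minimal sketch of that usage, assuming calc_bpd_loop returns a dict with a "total_bpd" tensor (check gaussian_diffusion.py for the exact interface; the names below are illustrative):

```python
import torch

# Minimal sketch (not the exact script): assumes a GaussianDiffusion object with a
# calc_bpd_loop(model, x_start, clip_denoised=...) method returning a dict with a
# "total_bpd" tensor of shape [batch], and a dataloader yielding images in [-1, 1].

def average_nll_bpd(diffusion, model, dataloader, device="cuda"):
    """Average NLL over a dataset, in bits per dimension."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for batch, _cond in dataloader:
            batch = batch.to(device)
            # Full T-step variational bound per image:
            # prior term + all per-step KL terms + decoder NLL at t=0.
            out = diffusion.calc_bpd_loop(model, batch, clip_denoised=True)
            total += out["total_bpd"].sum().item()
            count += batch.shape[0]
    return total / count
```

The resulting average is in bits per dimension, which is the unit used in the NLL tables.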

Josh00-Lu commented 2 years ago

> Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709
>
> It starts from guided_diffusion/scripts/image_nll.py
>
> This is easy to follow and the result makes sense (close to the values reported in the paper)

Thanks a lot! It's a pity that the authors didn't explain in detail how to implement the "NLL metric" in their paper. It's easy to get confused when directly following others' PyTorch code.

As far as I know, to calculate the NLL we need to compute $\sum_i \log P_\theta(x_{\mathrm{real}}^{(i)})$, but I don't know how to get $P_\theta(x)$ (i.e., the PDF of our model's distribution). I'm new to this field.

Josh00-Lu commented 2 years ago

It would be better if there were a "maths explanation"! Anyway, thanks a lot!

sndnyang commented 1 year ago

> It would be better if there were a "maths explanation"! Anyway, thanks a lot!

A math explanation is in the DDPM paper http://arxiv.org/abs/2006.11239, Section 3.3, "Data scaling, reverse process decoder, and $L_0$".
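
To spell that section out a bit (my reading, so take it with a grain of salt): the exact $\log p_\theta(x_0)$ is intractable, so what gets reported as "NLL" is the variational upper bound, which decomposes into a prior term, a sum of per-step KL terms, and a discretized decoder term:

$$
\mathbb{E}\big[-\log p_\theta(x_0)\big] \;\le\; \mathbb{E}_q\!\Big[\underbrace{D_{\mathrm{KL}}\big(q(x_T \mid x_0)\,\|\,p(x_T)\big)}_{L_T} \;+\; \sum_{t>1} \underbrace{D_{\mathrm{KL}}\big(q(x_{t-1} \mid x_t, x_0)\,\|\,p_\theta(x_{t-1} \mid x_t)\big)}_{L_{t-1}} \;\underbrace{-\,\log p_\theta(x_0 \mid x_1)}_{L_0}\Big]
$$

_vb_terms_bpd computes a single $L_{t-1}$ (or $L_0$ at $t = 0$), and image_nll.py sums all of these plus $L_T$, then converts the result to bits per dimension.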

YuanYuan98 commented 1 year ago

> Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709
>
> It starts from guided_diffusion/scripts/image_nll.py
>
> This is easy to follow and the result makes sense (close to the values reported in the paper)

Thanks for the useful pointer. The function _vb_terms_bpd calculates the NLL only for one step. If the NLL of one sample is required, should I sum up the results of _vb_terms_bpd for all diffusion steps (t=0-T)?

sndnyang commented 1 year ago

> Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709
>
> It starts from guided_diffusion/scripts/image_nll.py. This is easy to follow and the result makes sense (close to the values reported in the paper).
>
> Thanks for the useful pointer. The function _vb_terms_bpd calculates the NLL only for one step. If the NLL of one sample is required, should I sum up the results of _vb_terms_bpd for all diffusion steps (t=0-T)?

Maybe you can check my wrapper code based on OpenAI's implementation https://github.com/sndnyang/iDDPM
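
In case it is useful, here is a rough sketch of that summation, assuming the improved-diffusion GaussianDiffusion helpers (_vb_terms_bpd, _prior_bpd, q_sample) keep the signatures used in gaussian_diffusion.py; calc_bpd_loop in the repo is the authoritative version of this loop:

```python
import torch

# Rough sketch of summing the per-step terms yourself; calc_bpd_loop in the repo
# does essentially this. Assumes the improved-diffusion GaussianDiffusion helpers
# _vb_terms_bpd, _prior_bpd and q_sample with the signatures used in that file.

def nll_bpd_per_image(diffusion, model, x_start, clip_denoised=True):
    device = x_start.device
    batch_size = x_start.shape[0]
    total = torch.zeros(batch_size, device=device)
    for step in reversed(range(diffusion.num_timesteps)):
        t = torch.full((batch_size,), step, device=device, dtype=torch.long)
        noise = torch.randn_like(x_start)
        x_t = diffusion.q_sample(x_start, t, noise=noise)  # sample from q(x_t | x_0)
        with torch.no_grad():
            out = diffusion._vb_terms_bpd(
                model, x_start=x_start, x_t=x_t, t=t, clip_denoised=clip_denoised
            )
        # L_{t-1} for t > 0, and the decoder NLL L_0 at t == 0, already in bits/dim
        total += out["output"]
    total += diffusion._prior_bpd(x_start)  # L_T term
    return total  # one NLL estimate (bits per dimension) per image
```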

luccachiang commented 1 year ago

> Another one is https://github.com/openai/guided-diffusion/blob/main/guided_diffusion/gaussian_diffusion.py#L709
>
> It starts from guided_diffusion/scripts/image_nll.py. This is easy to follow and the result makes sense (close to the values reported in the paper).
>
> Thanks for the useful pointer. The function _vb_terms_bpd calculates the NLL only for one step. If the NLL of one sample is required, should I sum up the results of _vb_terms_bpd for all diffusion steps (t=0-T)?
>
> Maybe you can check my wrapper code based on OpenAI's implementation https://github.com/sndnyang/iDDPM

Thanks for your comprehensive guide! May I ask what the relationship between NLL and bpd is? Do you have any explanation? Thanks in advance.
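
For what it's worth, bits per dimension (bpd) is just the NLL written with base-2 logs and normalized by the data dimensionality $D$ (e.g. $D = 3 \times 32 \times 32$ for CIFAR-10):

$$
\mathrm{bpd}(x) \;=\; \frac{-\log_2 p_\theta(x)}{D} \;=\; \frac{-\ln p_\theta(x)}{D\,\ln 2}
$$

so, as far as I can tell, the NLL numbers in these papers' tables are already reported in bpd.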