question about eval - Githubissues

w86763777 / pytorch-ddpm

Unofficial PyTorch implementation of Denoising Diffusion Probabilistic Models

Do What The F*ck You Want To Public License


question about eval #17

Closed CrisZhouh closed 1 year ago

CrisZhouh commented 1 year ago

I want to evaluate FID and IS on your checkpoint, but sampling 50k images would take a whole day, so I reduced the number to 500. The scores turned out worse than your reported results (FID is about 70). Does the number of samples influence the evaluation score, or might I have made a mistake when running the evaluation?

w86763777 commented 1 year ago

Obtaining a reliable FID score requires a significant number of samples to represent the image distribution. I suggest referring to issue #15, as it contains a discussion on evaluating FID scores.
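
To illustrate the point about sample count, here is a minimal sketch of how the Fréchet distance behaves when the sample set is small. It is not the repository's evaluation code: the 64-dimensional random Gaussian features are a stand-in for Inception features, and the names `frechet_distance`, `reference`, and `scores` are hypothetical. Both sets are drawn from the same distribution, so the true distance is zero and any positive value comes from estimation error alone.

```python
import numpy as np


def frechet_distance(x, y):
    """Frechet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # tr(sqrt(cov_x @ cov_y)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)


rng = np.random.default_rng(0)
dim = 64  # assumption: small stand-in for the real feature dimension
reference = rng.standard_normal((20000, dim))

scores = {}
for n in (100, 500, 5000):
    samples = rng.standard_normal((n, dim))  # same distribution as reference
    scores[n] = frechet_distance(samples, reference)
    print(n, scores[n])
```

The printed distance should shrink as the number of samples grows, even though nothing about the underlying distribution changes. That is why a score computed from 500 images is not comparable to one computed from 50k.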

CrisZhouh commented 1 year ago

Thanks for your reply. I tried sampling 50k images, but got a NaN FID score (for both the model and the EMA model). Even if that suggests the FID is very small, I still want an exact score for my experiments. Could you please give me some suggestions? Thanks a lot.

w86763777 commented 1 year ago

Could you please provide the following information:

Python version
PyTorch version
CUDA version of PyTorch
The steps to produce a NaN FID score

Additionally, I would like to clarify that the computation time for evaluating DDPM on CIFAR10 using an RTX 2080Ti is about 12 hours, rather than the several hours mentioned in issue #15.

CrisZhouh commented 1 year ago

Thanks for your reply! My settings are as follows:

Python version: 3.8.10
PyTorch version: 1.10.0.dev20210816+cu113
CUDA version of PyTorch: 11.3
Steps: I evaluated on a 3090 Ti; the speed is displayed as 82.80 s/it, with 500000 images to sample and a batch size of 128. I ran the evaluation only once, with no training step. The evaluation covered both the model and the EMA model, and the total computation time was about 18 hours.

w86763777 commented 1 year ago

May I see the command you used to evaluate the pre-trained model?

CrisZhouh commented 1 year ago

The command is 'python main2.py main2.txt'. I made a few small modifications to the code; logdir points to your pretrained model, and fid_cache was downloaded from the right place. I checked the sampled images and found nothing strange. I can attach a link to the 'main2.txt' used with 'main2.py' if you need it. Thanks for your time!

w86763777 commented 1 year ago

I noticed that you have set fid_use_torch to True in main2.py. However, the torch backend for the FID calculation may produce "nan" values due to an unstable matrix square root implementation. To avoid this issue without compromising computation speed, you can set fid_use_torch to False.
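
For readers wondering where the matrix square root comes in: FID needs the trace of the square root of a product of two covariance matrices. The thread does not say which algorithm the torch backend uses. As an assumption for illustration only, the sketch below shows a Newton-Schulz iteration, one common GPU-friendly way to approximate a matrix square root, written in NumPy. The function name `sqrt_newton_schulz` and the test matrix are hypothetical.

```python
import numpy as np


def sqrt_newton_schulz(a, num_iters=30):
    """Approximate the square root of a symmetric positive definite matrix."""
    norm = np.linalg.norm(a)  # Frobenius norm, used to scale the input
    eye = np.eye(a.shape[0])
    y = a / norm  # converges towards sqrt(a / norm)
    z = eye.copy()  # converges towards the inverse of that square root
    for _ in range(num_iters):
        t = 0.5 * (3.0 * eye - z @ y)
        y = y @ t
        z = t @ z
    return y * np.sqrt(norm)


rng = np.random.default_rng(0)
m = rng.standard_normal((4, 4))
a = m @ m.T + 4.0 * np.eye(4)  # well-conditioned symmetric positive definite matrix

root = sqrt_newton_schulz(a)
print(np.max(np.abs(root @ root - a)))  # reconstruction error
```

On a well-conditioned matrix like this one the iteration converges quickly. Iterations of this kind can lose accuracy or overflow on ill-conditioned inputs, particularly in low precision, which is one plausible way a non-finite value could end up in the final score. Checking the result with `np.isfinite` before reporting it is a cheap safeguard.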

CrisZhouh commented 1 year ago

That makes sense. I will try it tomorrow!

CrisZhouh commented 1 year ago

It actually works! Thanks a lot!