yang-song / score_sde_pytorch

PyTorch implementation for Score-Based Generative Modeling through Stochastic Differential Equations (ICLR 2021, Oral)
https://arxiv.org/abs/2011.13456
Apache License 2.0

Reproducing NCSN++ cont. (deep, VE) #18

Open paulhuangkm opened 2 years ago

paulhuangkm commented 2 years ago

Hi,

I am currently trying to reproduce the results of NCSN++ cont. (deep, VE) on CIFAR-10 using this code. However, both of my runs ended with an FID of around 2.60, whereas it should be around 2.20. May I ask what could be the problem here? Also, I noticed the seed in the config files, but I couldn't find anywhere it is actually consumed. Is the seed used anywhere in this repo?

Regards, Paul

yang-song commented 2 years ago

Thanks for reporting. I remember testing it before releasing this code repo and getting an FID of 2.20. Could you try the non-deep version and see if you obtain an FID similar to the one reported in the paper? The seed in the config files is probably not used anywhere; it's a leftover from the JAX repo.
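(For context: if one did want the config seed to take effect, a minimal sketch of wiring it in is below, assuming the standard Python/NumPy/PyTorch RNGs. The `set_seed` helper and its insertion point in `run_lib.py` are hypothetical, not part of the repo.)

```python
# Minimal sketch of applying config.seed, assuming standard Python/NumPy/
# PyTorch RNGs. The set_seed helper and the run_lib.train insertion point
# are hypothetical; the repo does not currently consume the seed (see above).
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    """Seed every RNG the training loop may touch."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)  # seeds CPU and all CUDA generators
    # Optional: trade throughput for deterministic cuDNN kernels.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


# e.g. at the start of run_lib.train(config, workdir):
# set_seed(config.seed)
```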

paulhuangkm commented 2 years ago

I'm currently running other experiments and will try the non-deep version in 1-2 weeks. Also, I ran sampling with the checkpoint provided in this repo and got FID 2.20, so I'm thinking the seed might have caused the difference in the training process.

paulhuangkm commented 2 years ago

Hi @yang-song,

I've tried the non-deep version (configs/ve/cifar10_ncsnpp_continuous.py) and the result is:

I0620 20:25:37.822132 140001873700672 run_lib.py:402] ckpt-24 --- inception_score: 9.829116e+00, FID: 2.684183e+00, KID: 6.818869e-04

I trained for the same number of steps and used the same batch size, but the result is a bit worse than the one reported in the paper.

yang-song commented 2 years ago

Thanks for reporting. I will need some time to rerun those programs and compare this implementation with our internal code at Google. In the meantime, one potential reason is that different random seeds cause the optimal checkpoint (the one that minimizes the FID score) to differ between runs.
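(An illustrative sketch of that checkpoint sweep: evaluate FID across several checkpoints and keep the minimizer. `compute_fid` is a hypothetical stand-in for the repo's evaluation pipeline, and the checkpoint range in the usage note is an example only.)

```python
# Sketch of selecting the FID-optimal checkpoint from an evaluation sweep.
# compute_fid is a hypothetical stand-in for the repo's eval pipeline.
from typing import Callable, Iterable, Tuple


def best_checkpoint(
    ckpts: Iterable[int], compute_fid: Callable[[int], float]
) -> Tuple[int, float]:
    """Return the checkpoint number with the lowest FID, and that FID."""
    fids = {ckpt: compute_fid(ckpt) for ckpt in ckpts}
    best = min(fids, key=fids.get)
    return best, fids[best]


# Usage: sweep the last few checkpoints instead of a single fixed one,
# since the minimizing checkpoint can shift between runs and seeds.
# best, fid = best_checkpoint(range(20, 27), compute_fid)
```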