google-research / pix2seq

Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)

Apache License 2.0

RIN results on CIFAR #35

Closed. nicolas-dufour closed this issue 1 year ago.

nicolas-dufour commented 1 year ago

Hi,

In the RIN paper, you mention reaching an FID of 1.81. I'm trying to reproduce this result without success: the lowest FID I have reached is 17.7.

Looking at the config files, two training schedules are used, linear and sigmoid. Which one was used to get this result? Is the inference setup the same as for the rest of the results, with 250 DDIM steps and a cosine schedule with tau=1? Were those results obtained in the class-conditional setup or the unconditional one?

Also, if you have it, could you share the Inception Score (IS) for this model?

Thank you very much for the information!

ajabri commented 1 year ago

Hi,

The 1.81 FID result was obtained with a training schedule of sigmoid@-3,3,0.9 (cosine should work fine on CIFAR-10 as well) and 1000 steps of DDPM (as with other results) with a cosine schedule, tau=1. Sampling is class-conditional (with no guidance) and the Inception Score is 10.3.
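For readers porting this setup to another framework, here is a minimal sketch of the two noise schedules named above. It assumes that `sigmoid@-3,3,0.9` means start=-3, end=3, tau=0.9, and it uses the general sigmoid and cosine schedule form described in the authors' noise-scheduling work as I understand it. The function names and the `clip_min` default are illustrative, not taken from the repository, so check the released code for the authoritative version.

```python
import math


def _sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_schedule(t, start=-3.0, end=3.0, tau=0.9, clip_min=1e-9):
    """Signal level gamma(t) for t in [0, 1], decreasing from 1 towards 0.

    Assumption: "sigmoid@-3,3,0.9" maps to (start, end, tau) = (-3, 3, 0.9).
    """
    v_start = _sigmoid(start / tau)
    v_end = _sigmoid(end / tau)
    out = _sigmoid((t * (end - start) + start) / tau)
    # Rescale so that gamma(0) = 1 and gamma(1) = 0, then clip for stability.
    out = (v_end - out) / (v_end - v_start)
    return min(max(out, clip_min), 1.0)


def cosine_schedule(t, start=0.0, end=1.0, tau=1.0, clip_min=1e-9):
    """Cosine signal level gamma(t) for t in [0, 1], with exponent 2 * tau."""
    v_start = math.cos(start * math.pi / 2) ** (2 * tau)
    v_end = math.cos(end * math.pi / 2) ** (2 * tau)
    out = math.cos((t * (end - start) + start) * math.pi / 2) ** (2 * tau)
    # Same rescaling and clipping as in the sigmoid schedule.
    out = (v_end - out) / (v_end - v_start)
    return min(max(out, clip_min), 1.0)


if __name__ == "__main__":
    # Print both schedules on a coarse grid to compare their shapes.
    for i in range(0, 11, 2):
        t = i / 10
        print(f"t={t:.1f}  sigmoid={sigmoid_schedule(t):.4f}  cosine={cosine_schedule(t):.4f}")
```

In this form gamma(t) is the fraction of signal kept at time t, so a noised sample would be built as sqrt(gamma) * x + sqrt(1 - gamma) * noise. A mismatch between the training and sampling schedules is one thing worth ruling out when a port does not match the reported numbers.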

nicolas-dufour commented 1 year ago

Hi @ajabri, I've been trying to reproduce the results in PyTorch, but I only manage to reproduce the Inception Score (10.7 on my side). For FID, the best I get is 13.9. Do you know if something is missing from the released implementation? Thanks!

chentingpc commented 1 year ago

The full config was released at https://github.com/google-research/pix2seq/blob/main/configs/config_diffusion_cifar10.py. One important thing for CIFAR-10 was dropout; otherwise the model overfits quite seriously. Also, an IS of 10.7 with an FID of 13.9 seems suspicious to me: I have never encountered such a poor FID alongside such a good IS.

nicolas-dufour commented 1 year ago

@chentingpc Hmm, thanks for the insights, but I have used exactly the config you reference above. There may be a problem with my evaluation setup, then. I'm using torch-fidelity for FID computation. Could you provide a set of generated images so I can check whether my FID computation is to blame for the difference? Thank you, Nicolas Dufour
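One way to narrow down an evaluation problem like this is to recompute the last step of FID independently. The sketch below computes only the Fréchet distance between two Gaussians from feature means and covariances. It does not call torch-fidelity, and it assumes the Inception features have already been extracted by some other tool. The function names are illustrative.

```python
import numpy as np


def feature_stats(features):
    """Mean and covariance of an (n_samples, n_features) feature matrix."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)
    return mu, sigma


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    d^2 = |mu1 - mu2|^2 + Tr(sigma1) + Tr(sigma2) - 2 Tr((sigma1 sigma2)^(1/2))
    """
    diff = mu1 - mu2
    # For positive semi-definite inputs, the eigenvalues of sigma1 @ sigma2 are
    # real and non-negative, so the trace of the matrix square root equals the
    # sum of their square roots. Small negative or imaginary parts caused by
    # round-off are discarded.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    # Synthetic stand-in features; real use would pass Inception activations.
    rng = np.random.default_rng(0)
    real = rng.normal(size=(2000, 16))
    fake = rng.normal(loc=0.1, size=(2000, 16))
    print(frechet_distance(*feature_stats(real), *feature_stats(fake)))
```

If this agrees with the tool's output on the same features, the discrepancy is more likely to come from earlier stages, such as which reference split the statistics are computed on, how many samples are generated, or how images are converted and resized before feature extraction.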

chentingpc commented 1 year ago

Here are some random samples: [image: individualImage]

nicolas-dufour commented 1 year ago

Managed to reproduce, thanks!