orybkin / sigma-vae-tensorflow

A σ-VAE implementation in TensorFlow

Code for computing the FID scores #6

Open ksachdeva opened 3 years ago

ksachdeva commented 3 years ago

Hi,

Thanks for your very interesting paper.

I looked at both of your implementations (torch & TF) and could not find where you have code related to computing the FID scores.

Would appreciate it if you could help point at it.

Regards & thanks Kapil

orybkin commented 3 years ago

Hi Kapil,

Thanks! I will try to clean up and add that code. In the meantime, you should be able to compute the FID score easily with any open-source implementation. I used this one: https://github.com/mseitzer/pytorch-fid

Oleh
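For readers who want to see what such an implementation computes, here is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature vectors. It is not the code used for the paper. pytorch-fid extracts the features with an Inception network; here the features are assumed to be already available as NumPy arrays, and the function names `frechet_distance` and `fid_from_features` are hypothetical.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # Tr((sigma1 sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2. For positive semi-definite inputs these
    # eigenvalues are real and non-negative up to numerical noise, so we take
    # the real part and clip small negative values to zero.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)


def fid_from_features(feats_a, feats_b):
    """FID-style score from two arrays of shape (n_samples, n_features)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    sigma_a = np.cov(feats_a, rowvar=False)
    sigma_b = np.cov(feats_b, rowvar=False)
    return frechet_distance(mu_a, sigma_a, mu_b, sigma_b)
```

In practice the pytorch-fid package handles feature extraction as well; its README documents the invocation `python -m pytorch_fid path/to/dataset1 path/to/dataset2`.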

ksachdeva commented 3 years ago

Thanks @orybkin. I much appreciate the quick response.

Regards Kapil

yahshibu commented 2 years ago

@orybkin Hello. Thank you for your interesting paper and helpful code!

I have a related question. What images did you use for computing the FID scores shown in your paper? Two groups of images are necessary to compute a score. I assume that one of the two is the test set of a dataset (SVHN, CelebA, etc.). What is the other? The reconstructed images, or sampled images? If it is a group of sampled images, how many were used? I currently can't reproduce similar scores.

I would appreciate it if you could help me.

Best regards.

orybkin commented 2 years ago

Hi Takashi,

I used the test set and the sampled images. I used the same number of sampled images as real images, but I subsampled the test set because evaluation was taking a long time on larger datasets. I think I used on the order of 100 images (i.e. 100 sampled and 100 real), but unfortunately I can't remember off the top of my head exactly how many. I will try to look this up.
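The subsampling described above could look roughly like the following sketch. The exact protocol is not given in this thread, so the helper name `paired_subsample`, the fixed seed, and the uniform sampling without replacement are all assumptions.

```python
import numpy as np


def paired_subsample(n_real, n_generated, n_keep, seed=0):
    """Pick the same number of indices from the real and generated sets.

    Hypothetical helper: samples uniformly without replacement and uses a
    fixed seed so that the evaluated subset is reproducible.
    """
    rng = np.random.default_rng(seed)
    # Never ask for more images than either set contains.
    n_keep = min(n_keep, n_real, n_generated)
    real_idx = rng.choice(n_real, size=n_keep, replace=False)
    gen_idx = rng.choice(n_generated, size=n_keep, replace=False)
    return real_idx, gen_idx
```

The returned index arrays can then be used to select which real and sampled images are written to the two folders passed to the FID tool.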

yahshibu commented 2 years ago

Hi Oleh,

Thank you for your help! I'm focusing on the SVHN dataset now. When I use all 26302 images in the test set and 26302 sampled images, the score is around 50. When I use 100 images for each group, the score rises to about 130. When I use 1000 images for each, it's around 70. In other words, the fewer images are used, the larger the score gets.

I would appreciate it if you could recall the detailed protocol, but I'm glad to know now that you used sampled images. Thank you.
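The trend reported above (larger scores with fewer images) is consistent with the finite-sample bias of the Fréchet distance estimate, which means scores are only comparable when computed with the same number of images. The toy sketch below illustrates the effect on synthetic data: both groups are drawn from the same Gaussian, so the true distance is zero, yet the estimate grows as the sample size shrinks. The 64-dimensional random features and the function names are assumptions for illustration; they are not Inception features and say nothing about the absolute values on SVHN.

```python
import numpy as np


def fid_from_features(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    sigma_a = np.cov(feats_a, rowvar=False)
    sigma_b = np.cov(feats_b, rowvar=False)
    diff = mu_a - mu_b
    # Trace of the matrix square root via the eigenvalues of the product.
    eigvals = np.linalg.eigvals(sigma_a @ sigma_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma_a) + np.trace(sigma_b) - 2.0 * tr_sqrt)


def same_distribution_fid(n, dim=64, seed=0):
    """Score between two independent draws of size n from one Gaussian."""
    rng = np.random.default_rng(seed)
    feats_a = rng.normal(size=(n, dim))
    feats_b = rng.normal(size=(n, dim))
    return fid_from_features(feats_a, feats_b)


# The true distance is zero for every n; only the estimation error differs.
scores = {n: same_distribution_fid(n) for n in (100, 1000, 10000)}
for n, score in scores.items():
    print(f"n={n:>5}: estimated distance {score:.3f}")
```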