Issue of FID and IS score

priyankaupadhyay090 commented 2 years ago

@wtliao

I have calculated the scores:

The FID is very high: FID: 73.33472569962976
The IS is quite low: Inception mean: 4.732609, Inception std: 0.1345223

For now, I am generating the images from the epoch 550 checkpoint.

Is the best FID score calculated from the last epoch (550)? Or should I calculate IS and FID for checkpoints every 10 or 50 epochs (or let me know which epoch I should use) and then choose the checkpoint with the best FID and IS?
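If you do sweep checkpoints, the selection step itself is small. Below is a minimal sketch, assuming you have already measured FID and IS for each saved epoch. The function name `select_best_checkpoint` and the `scores` layout are hypothetical, and the thread does not say how the checkpoint reported in the paper was chosen.

```python
from typing import Dict, Tuple


def select_best_checkpoint(scores: Dict[int, Tuple[float, float]]) -> int:
    """Return the epoch with the lowest FID; ties go to the higher IS.

    `scores` maps epoch -> (fid, inception_score_mean).
    """
    if not scores:
        raise ValueError("no checkpoint scores given")
    # Lower FID is better and higher IS is better, so IS is negated for min().
    return min(scores, key=lambda epoch: (scores[epoch][0], -scores[epoch][1]))


# Only the epoch-550 entry comes from this thread. Add one entry per
# evaluated checkpoint (for example every 50 epochs) once it is measured.
scores = {550: (73.33472569962976, 4.732609)}
best_epoch = select_best_checkpoint(scores)
```

Ranking by FID first is a design choice in this sketch; swap the key if IS matters more for your comparison.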

1hexf1 commented 2 years ago

@wtliao I had the same problem when testing a trained model on the CUB dataset: IS: 4.81, FID: 24.89. There is still a gap from the numbers reported in the paper.
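For reference when comparing these numbers: the "Inception mean" and "Inception std" printed by common IS scripts are the mean and standard deviation of the score over several splits of the generated set. Below is a minimal pure-Python sketch of that computation, assuming you already have the classifier softmax outputs for each generated image. The function name and the default of 10 splits are assumptions here, not taken from this repo's evaluation code.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def inception_score(
    probs: Sequence[Sequence[float]], splits: int = 10
) -> Tuple[float, float]:
    """Mean and std of exp(E_x[KL(p(y|x) || p(y))]) over `splits` chunks.

    probs[i] is the softmax vector p(y | x_i) for generated image i.
    """
    n = len(probs)
    if n < splits:
        raise ValueError("need at least one image per split")
    scores: List[float] = []
    for k in range(splits):
        part = probs[k * n // splits:(k + 1) * n // splits]
        num_classes = len(part[0])
        # Marginal class distribution p(y) within this split.
        marginal = [sum(p[c] for p in part) / len(part) for c in range(num_classes)]
        # Average KL divergence; terms with p(y|x) = 0 contribute 0.
        kl = mean(
            sum(pc * math.log(pc / mc) for pc, mc in zip(p, marginal) if pc > 0)
            for p in part
        )
        scores.append(math.exp(kl))
    return mean(scores), pstdev(scores)
```

Note that IS on CUB is often computed with an Inception model fine-tuned on CUB, so scores from an ImageNet-trained Inception may not be directly comparable with published numbers.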

priyankaupadhyay090 commented 2 years ago

Hey, I have the same issue. I used the fid.py file from https://github.com/bioinf-jku/TTUR and ran

fid.py /path/to/images /path/to/other_images

where
/path/to/images (real images path) --> data/birds/test_image (ground-truth test images)
/path/to/other_images (generated images path) --> models/netG_550

Can I ask which path you gave for "real_image"?

Can you also share the command you used to run fid.py?
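For context on what fid.py does with the two folders: it extracts Inception pool3 features for each set, fits a Gaussian (mean and covariance) to each, and reports the Fréchet distance between the two Gaussians. Below is a minimal NumPy sketch of that last step, assuming the features are already extracted into two `(N, D)` arrays. The function name is hypothetical, and the trace of the matrix square root is computed from eigenvalues here instead of the `scipy.linalg.sqrtm` call that TTUR's fid.py uses.

```python
import numpy as np


def frechet_distance(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two (N, D) feature arrays."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))
    # Tr((cov_a cov_b)^(1/2)) is the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative in
    # exact arithmetic; clip removes tiny negative round-off values.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)
```

With identical inputs the result is approximately 0, and shifting one set by a constant vector adds the squared length of that vector, which is a quick sanity check before trusting a large value such as 73.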

@wtliao

For /path/to/images (the real bird images path), should this be the test_images path or the all_real_images path?

all_real_image_path = data/birds/CUB_200_2011/images (11788 images)
test_images_path = data/birds/test_images (2933 images)

The provided preprocessed bird dataset does not have explicit train and test image folders (we just have pickle files for them). I created my own split according to the given classes: train (8855 images from 150 classes) and test (2933 images from 50 classes).
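One way to build such a folder directly from the pickle files, so the split matches the one used for training, is sketched below. It assumes each split's pickle holds a list of image names relative to the image root and without extension (for example "class_dir/image_name"). That layout, the function name `export_split`, and the pickle path in the usage comment are assumptions and should be checked against your copy of the data.

```python
import pickle
import shutil
from pathlib import Path
from typing import List, Tuple


def export_split(filenames_pickle, image_root, out_dir, ext=".jpg") -> Tuple[int, List[str]]:
    """Copy the images listed in a split pickle into one flat folder.

    Returns (number_copied, names_not_found).
    """
    with open(filenames_pickle, "rb") as f:
        names = pickle.load(f)  # assumed: list of "class_dir/image_name" strings
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    copied, missing = 0, []
    for name in names:
        src = Path(image_root) / f"{name}{ext}"
        if not src.is_file():
            missing.append(name)  # report instead of failing half-way
            continue
        shutil.copy(src, out_dir / src.name)
        copied += 1
    return copied, missing


# Hypothetical usage; the pickle location is an assumption:
# export_split("data/birds/test/filenames.pickle",
#              "data/birds/CUB_200_2011/images",
#              "data/birds/test_images")
```

If the returned count differs from 2933, the hand-made split and the pickle-defined split disagree, which would be worth resolving before comparing FID values.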

Then I used the test_images (real images) path and the generated_images path to get the FID score.

It would be great to know which real_images path you used to compute FID.

wtliao commented 2 years ago

Hi @priyankaupadhyay090 ,

According to https://github.com/bioinf-jku/TTUR, fid.py is called as fid.py /path/to/images /path/to/other_images. It does not matter which path holds the generated images and which holds the real ones.
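The order does not matter because the Fréchet distance between the two fitted Gaussians is symmetric. Writing $(\mu_1, \Sigma_1)$ and $(\mu_2, \Sigma_2)$ for the Inception feature mean and covariance of the two folders (notation introduced here, not taken from the thread):

```latex
\mathrm{FID} = \lVert \mu_1 - \mu_2 \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_1 + \Sigma_2 - 2\,(\Sigma_1 \Sigma_2)^{1/2} \right)
```

Swapping the two folders leaves the norm and $\operatorname{Tr}(\Sigma_1 + \Sigma_2)$ unchanged, and $\Sigma_1 \Sigma_2$ and $\Sigma_2 \Sigma_1$ have the same eigenvalues, so the trace of the square root is unchanged as well.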

If I understand your question correctly, you are asking whether to use all_real_image_path or test_images_path. Use test_images_path if your images are generated from the test set; otherwise use all_real_image_path. Note that we evaluate a model's performance on the test set.

ourpubliccodes commented 1 year ago

The test set of the CUB dataset has only 2933 images, and the official FID implementation recommends at least 10k generated images for the FID values to be valid. How many images did you generate to evaluate FID? And how did you handle the number of corresponding ground-truth (real) images?
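A quick way to see where a folder stands against these thresholds is sketched below. The 2048 figure is the width of the Inception pool3 layer used by TTUR's fid.py (a sample covariance needs more samples than dimensions to be full rank), and 10,000 is the minimum sample size recommended in the TTUR README. The helper names and the returned labels are hypothetical.

```python
from pathlib import Path

FEATURE_DIM = 2048        # Inception pool3 width used by TTUR's fid.py
RECOMMENDED_MIN = 10_000  # minimum sample size recommended in the TTUR README


def count_images(folder, exts=(".jpg", ".jpeg", ".png")) -> int:
    """Count image files under `folder`, including subfolders."""
    return sum(1 for p in Path(folder).rglob("*") if p.suffix.lower() in exts)


def check_sample_size(n: int) -> str:
    """Classify a sample count against the two thresholds above."""
    if n <= FEATURE_DIM:
        # The covariance of n samples has rank at most n - 1 < 2048.
        return "rank-deficient"
    if n < RECOMMENDED_MIN:
        return "below-recommended"
    return "ok"
```

For the counts quoted in this thread, `check_sample_size(2933)` returns "below-recommended" and `check_sample_size(11788)` returns "ok".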

wtliao / text2image, issue #14