Open tazwar22 opened 2 years ago
Thanks for raising this bug report! I'll investigate the issue.
@tazwar22 I tried to reproduce the behaviour and found that both the PIQ and mseitzer implementations consistently produce large values for similar high-dimensional distributions.
# !pip install pytorch-fid
import piq
import torch
import numpy as np
# Code from github.com/mseitzer/pytorch-fid
from pytorch_fid.fid_score import calculate_frechet_distance
dist1_np = np.random.normal(150, 8.0, size=(100000, 500))
dist2_np = np.random.normal(150, 8.0, size=(100000, 500))
dist1_np_mu = np.mean(dist1_np, axis=0)
dist1_np_sigma = np.cov(dist1_np, rowvar=False)
dist2_np_mu = np.mean(dist2_np, axis=0)
dist2_np_sigma = np.cov(dist2_np, rowvar=False)
mseitzer_output = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist2_np_mu, dist2_np_sigma)
print(f'{mseitzer_output:0.4f}')
dist1_pt = torch.tensor(dist1_np)
dist2_pt = torch.tensor(dist2_np)
piq_output = piq.FID()(dist1_pt, dist2_pt)
print(piq_output)
>>> 81.0782
>>> tensor(81.0783, dtype=torch.float64)
You mentioned getting -6.53e-13 for the Normal vs Normal case, which does happen when passing the exact same values:
mseitzer_output_same = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma)
# print(f'{mseitzer_output_same:0.4f}')
print(mseitzer_output_same)
piq_output_same = piq.FID()(dist1_pt, dist1_pt)
print(piq_output_same)
>>> -1.3096723705530167e-10
>>> tensor(0.0002, dtype=torch.float64)
You can see that in this case the PIQ value is significantly larger in magnitude, but still very close to zero.
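To illustrate why two independent samples from the same distribution give a clearly non-zero score while identical statistics give roughly zero, here is a minimal NumPy-only sketch. It is not the PIQ or pytorch-fid code: `frechet_distance` and `sample_stats` are hypothetical helpers written for this comment, the dimension and sample sizes are arbitrary, and the matrix square root trace is computed from eigenvalues rather than with `scipy.linalg.sqrtm`.

```python
import numpy as np


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper for illustration only. Uses the identity
    tr((S1 S2)^(1/2)) = sum of square roots of the eigenvalues of S1 S2,
    which are real and non-negative for positive semi-definite S1, S2.
    """
    diff = mu1 - mu2
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    # Clip tiny negative values caused by floating-point error.
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


def sample_stats(rng, n, dim, mean=150.0, std=8.0):
    """Draw n samples from an isotropic Normal and return (mean, covariance)."""
    x = rng.normal(mean, std, size=(n, dim))
    return x.mean(axis=0), np.cov(x, rowvar=False)


rng = np.random.default_rng(0)
dim = 50  # assumed feature dimension, chosen small so the sketch runs quickly

results = {}
for n in (500, 5000):
    mu1, s1 = sample_stats(rng, n, dim)
    mu2, s2 = sample_stats(rng, n, dim)  # independent draw, same distribution
    results[n] = frechet_distance(mu1, s1, mu2, s2)
    print(f"n={n}: distance between independent samples = {results[n]:.4f}")

# Identical statistics on both sides: only round-off error remains.
same = frechet_distance(mu1, s1, mu1, s1)
print(f"identical statistics: {same:.3e}")
```

In this sketch the distance between independent samples should shrink as the number of samples grows, which suggests the non-zero values come from estimating the mean and covariance from finite data rather than from a bug in either implementation.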
@zakajd please add a comment to the code of the FID metric with a description of the possibly counterintuitive behaviour.
Describe the bug
Running the FID computation on two distributions which are exactly the same leads to non-zero values. For example, if I use the 10,000 examples of the CIFAR-10 test set as one distribution and the same set again as the other distribution, I end up with a non-trivial value of ~7.x. Another example would be attainable by repeating the above, but with Normal(150,8) distributions (no particular reason for the parameters). The FID value in this case is once again non-trivial (==0.98). I have tested the cases with another implementation of FID in PyTorch (mseitzer) and have obtained values in the e-12 range (which makes more sense).

To Reproduce
Steps to reproduce the behavior: (Normal Distribution example)
import numpy as np
import piq
dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
piq.FID()(dist1, dist2)
Expected behavior
The return values for such cases should be approximately zero. (See the note about the other PyTorch implementation above.) With the mseitzer implementation, I get a value of -6.53e-13 for the Normal vs Normal case described above.

Additional context
I noticed another issue, #277, with the exact same behavior, but since it was closed, I wanted to highlight the discrepancy I noticed in this case.