photosynthesis-team / piq

Measures and metrics for image2image tasks. PyTorch.
Apache License 2.0

Non-zero FID value for two exactly same distributions #283

Open tazwar22 opened 2 years ago

tazwar22 commented 2 years ago

Describe the bug Running the FID computation on two distributions that are exactly the same yields non-zero values. For example, if I use the 10,000 examples of the CIFAR-10 test set as one distribution and the same set again as the other distribution, I end up with a non-trivial value of ~7.x. Another example can be obtained by repeating the above with Normal(150,8) distributions (no particular reason for the parameters). The FID value in this case is once again non-trivial (0.98). I have tested both cases with another PyTorch implementation of FID (mseitzer) and obtained values in the e-12 range, which makes more sense.

To Reproduce Steps to reproduce the behavior: (Normal Distribution example)

  1. dist1 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  2. dist2 = np.random.normal(20, 8.0, size=(10000, 32, 32, 3))
  3. piq.FID()(dist1, dist2)

Expected behavior The return values for such cases should be approximately zero (see the note about the other PyTorch implementation above). With the mseitzer implementation, I get a value of -6.53e-13 for the Normal vs Normal case described above.

Additional context I noticed another issue #277 with the exact same behavior, but since it was closed, I wanted to highlight the discrepancy I noticed in this case.

zakajd commented 2 years ago

Thanks for raising this bug report! I'll investigate the issue.

zakajd commented 2 years ago

@tazwar22 I tried to reproduce the behaviour and found that both the PIQ and mseitzer implementations consistently return large values for two independent samples drawn from the same high-dimensional distribution.

# !pip install pytorch-fid

import piq
import torch
import numpy as np

# Code from github.com/mseitzer/pytorch-fid
from pytorch_fid.fid_score import calculate_frechet_distance

dist1_np = np.random.normal(150, 8.0, size=(100000, 500))
dist2_np = np.random.normal(150, 8.0, size=(100000, 500))

dist1_np_mu = np.mean(dist1_np, axis=0)
dist1_np_sigma = np.cov(dist1_np, rowvar=False)

dist2_np_mu = np.mean(dist2_np, axis=0)
dist2_np_sigma = np.cov(dist2_np, rowvar=False)

mseitzer_output = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist2_np_mu, dist2_np_sigma)
print(f'{mseitzer_output:0.4f}')

dist1_pt = torch.tensor(dist1_np)
dist2_pt = torch.tensor(dist2_np)
piq_output = piq.FID()(dist1_pt, dist2_pt)
print(piq_output)

>>> 81.0782
>>> tensor(81.0783, dtype=torch.float64)

You mentioned getting -6.53e-13 for the Normal vs Normal case. That does happen when passing the exact same values to both arguments:

mseitzer_output_same = calculate_frechet_distance(dist1_np_mu, dist1_np_sigma, dist1_np_mu, dist1_np_sigma)
# print(f'{mseitzer_output_same:0.4f}')
print(mseitzer_output_same)

piq_output_same = piq.FID()(dist1_pt, dist1_pt)
print(piq_output_same)

>>> -1.3096723705530167e-10
>>> tensor(0.0002, dtype=torch.float64)

Note that in this case the PIQ value is several orders of magnitude larger than the mseitzer one, but it is still very close to zero.
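To illustrate the two cases above without either library, here is a minimal NumPy-only sketch of the Fréchet distance between Gaussians fitted to two feature sets. It is not the PIQ or pytorch-fid code: the function name `frechet_distance`, the dimensionality `dim = 50`, and the sample sizes are assumptions chosen for illustration, and the trace of the matrix square root is computed from eigenvalues rather than with an explicit matrix square root. It suggests that two independent samples from the same distribution give a positive value that shrinks as the sample size grows, while passing the same sample twice gives a value near zero.

```python
import numpy as np


def frechet_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Frechet distance between Gaussians fitted to x and y (rows are samples).

    Illustrative helper, not part of PIQ or pytorch-fid.
    """
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    sigma_x = np.cov(x, rowvar=False)
    sigma_y = np.cov(y, rowvar=False)

    # tr((sigma_x sigma_y)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma_x @ sigma_y, which are real and non-negative for
    # positive-definite covariances. Clip tiny negative round-off values.
    eigvals = np.linalg.eigvals(sigma_x @ sigma_y)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_x - mu_y
    return float(diff @ diff + np.trace(sigma_x) + np.trace(sigma_y) - 2.0 * tr_covmean)


rng = np.random.default_rng(0)
dim = 50  # assumed feature dimensionality, smaller than in the thread

for n in (200, 2_000, 20_000):
    a = rng.normal(150, 8.0, size=(n, dim))
    b = rng.normal(150, 8.0, size=(n, dim))  # independent draw, same distribution
    print(f"n={n:>6}  independent: {frechet_distance(a, b):.4f}  "
          f"identical: {frechet_distance(a, a):.2e}")
```

The gap between the two columns comes from estimating the means and covariances from finite samples, so comparing a sample with an independent re-draw is a different experiment from comparing a sample with itself.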

snk4tr commented 2 years ago

@zakajd please add a comment to the code of the FID metric with a description of the possibly counterintuitive behaviour.