What should be the input data range

ragor114 / PyTorch-Frechet-Video-Distance

This repository contains an easy-to-use implementation of the Frechet Video Distance metric for PyTorch. The implementation is largely based on the StyleGAN-V repository but was modified to work with two tensors representing sets of videos.


What should be the input data range #1

Open 18445864529 opened 6 months ago

18445864529 commented 6 months ago

0~255 or 0~1 or -1~1? Thanks!

ragor114 commented 6 months ago

Hi @18445864529 , thanks for your interest in my implementation. I believe the data range should be 0 ~ 1.

18445864529 commented 6 months ago

Thank you for the prompt reply. For my data, 0 ~ 1 gave 0.0149, -1 ~ 1 gave 0.125, and 0 ~ 255 gave a value in the billions. None of these seems correct. I also tried a = torch.rand(50, 3, 20, 256, 256); b = 0.9 * a; compute_fvd(a, b), and the result was 0.0018, which also does not seem right. What is the probable reason?
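For context when judging these magnitudes: FVD is a Fréchet distance between Gaussians fitted to I3D features of the two video sets, so the value depends on how the feature means and covariances differ, not directly on the pixel values. The sketch below is not this repository's code. It assumes diagonal covariances to stay dependency-free (the real metric uses full covariance matrices and a matrix square root), and the function name `diagonal_frechet_distance` is hypothetical.

```python
from math import sqrt
from statistics import fmean, pvariance


def diagonal_frechet_distance(feats_a, feats_b):
    """Frechet distance between two Gaussians with diagonal covariances.

    feats_a, feats_b: lists of equal-length feature vectors, one per video.
    Simplifying assumption: feature dimensions are treated as independent,
    so the trace term reduces to a sum of squared std-dev differences.
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [f[d] for f in feats_a]
        col_b = [f[d] for f in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        sd_a, sd_b = sqrt(pvariance(col_a)), sqrt(pvariance(col_b))
        # Mean term plus (diagonal) covariance term for this dimension.
        total += (mu_a - mu_b) ** 2 + (sd_a - sd_b) ** 2
    return total


# Toy features: the second set is the first shifted by 1 in dimension 0.
set_a = [[0.0, 1.0], [2.0, 3.0]]
set_b = [[1.0, 1.0], [3.0, 3.0]]
print(diagonal_frechet_distance(set_a, set_a))
print(diagonal_frechet_distance(set_a, set_b))
```

Identical feature sets give zero, and a pure shift of the mean contributes its squared size. Whether a given FVD value is plausible therefore depends on what the I3D network does with the inputs, which is why the expected input range matters.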

ragor114 commented 6 months ago

You are right, this does not seem correct and does not match the values I got with the implementation. I'm sorry, but I currently do not know how to fix the problem. If it persists, I would recommend contacting the authors of the StyleGAN-V repository, as my implementation is 90% taken from their source code and uses their pretrained model. If you discover a solution, please let me know!

samueleruffino99 commented 4 months ago

Yes, I am also getting really low results for that!

samueleruffino99 commented 4 months ago

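Simply load the model like this:

    import torch
    # open_url is assumed to be the download helper from the repository linked
    # below (StyleGAN's dnnlib.util provides one); it is not defined here.

    def load_i3d(device='cuda'):
        """Load the pretrained TorchScript I3D detector from the URL below."""
        detector_url = 'https://www.dropbox.com/s/ge9e5ujwgetktms/i3d_torchscript.pt?dl=1'
        # Keyword arguments for calling the detector (unused inside this function as posted).
        detector_kwargs = dict(rescale=False, resize=False, return_features=True)  # Return raw features before the softmax layer.
        with open_url(detector_url, verbose=False) as f:
            detector = torch.jit.load(f).eval().to(device)
        return detector

See: https://github.com/universome/fvd-comparison/tree/master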

ragor114 commented 3 months ago

@samueleruffino99 I tried to implement your fix; however, I get almost exactly the same values with my own model-loading code, with the function you proposed, and with the function used in the original repository. You can find my implementation and some experiments in the fix-model-load branch. Maybe I did something wrong?

If you would open a Pull Request with a working fix, I would be happy to merge it!

ryushinn commented 1 week ago

Hi @ragor114 . In the original StyleGAN-V repo, doesn't this line indicate that the input range should be [0, 255] as uint8?

https://github.com/universome/stylegan-v/blob/3fecd69c602e1cda204357201461c0fb0a634909/src/metrics/metric_utils.py#L236
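To make it easier to try the candidate ranges discussed in this thread, here is a minimal sketch of a linear range conversion. It is not part of this repository, the helper name `rescale_range` is hypothetical, and which range the I3D detector actually expects is still unresolved above. The helper only uses arithmetic operators, so it should work on plain floats as well as on tensors.

```python
def rescale_range(x, src_min, src_max, dst_min=0.0, dst_max=255.0):
    """Linearly map values from [src_min, src_max] to [dst_min, dst_max].

    Works on anything that supports +, -, and * with floats
    (Python numbers, and also array or tensor types).
    """
    scale = (dst_max - dst_min) / (src_max - src_min)
    return (x - src_min) * scale + dst_min


# Scalar examples for the three ranges mentioned in the thread.
print(rescale_range(0.5, 0.0, 1.0))     # [0, 1]    -> [0, 255]
print(rescale_range(-1.0, -1.0, 1.0))   # [-1, 1]   -> [0, 255]
print(rescale_range(255.0, 0.0, 255.0, 0.0, 1.0))  # [0, 255] -> [0, 1]
```

With a float video tensor `videos` in [-1, 1], something like `rescale_range(videos, -1.0, 1.0).round().clamp(0, 255).to(torch.uint8)` would produce uint8 values in [0, 255], if that is indeed what the linked line requires.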