mseitzer / pytorch-fid

Compute FID scores with PyTorch.
Apache License 2.0

If the size of the input image is different from the set value, what should be done? #70

Closed: activate-an closed this 2 years ago

activate-an commented 3 years ago

Since I want to change the size of the network input image (for example, the input is a rectangular image), do I need to retrain the parameters? If so, how can this be done? If not, should I just change 299x299 to the required size, e.g. 256x176? I hope someone can answer this, thank you very much.

mseitzer commented 3 years ago

In principle, you could resize your images to the desired size, as long as you do not change the aspect ratio with this resizing. You would get a FID score, but this score would not be comparable to FID scores computed with 299x299.

If you want to compare against other methods, you need to use 1:1 aspect ratio. You could take the central image crop for this.

activate-an commented 3 years ago

@mseitzer
Since both the training images and the original images are rectangular, i.e. the aspect ratio is not 1:1: if the network structure needs to be changed for training, for example by changing the size of the convolution kernels, how should its parameters be trained?

On the other hand, if there is no need to change the network structure, that is, if I resize the image from a rectangle to a 1:1 aspect ratio, this will distort other information in the image, such as positional information. Will that affect the final result? If you have any thoughts on this, please let me know.

mseitzer commented 3 years ago

You don't need to retrain the network. You should not change the aspect ratio, as the networks were trained on images with valid aspect ratios.

You can however input your rectangular images directly into the network (without resizing to a different aspect ratio), and you would get a FID score out. But I repeat from above: this score would not be comparable to FID scores computed with 299x299. You can only use it to compare your own models with each other on your target dataset.

To do so, you have to disable automatic resizing by changing https://github.com/mseitzer/pytorch-fid/blob/d042ab8a9f8e4b388c21bc7b38d9599c5fbcfe7b/src/pytorch_fid/fid_score.py#L252 to

model = InceptionV3([block_idx], resize_input=False).to(device)

If all your images have the same size, then this will work. If they have different sizes, you also need to change the batch size to 1 with the argument --batch-size 1. This will make the evaluation super slow of course.
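
For reference, an invocation with the batch size forced to 1 could look like this (the two directory paths are placeholders for your own image folders):

```
python -m pytorch_fid path/to/real_images path/to/generated_images --batch-size 1
```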

Overall, it is easier to take a square central crop of your image.
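
A minimal sketch of that approach with Pillow (the directory names are placeholders; resizing the crops to 299x299 is optional, but it keeps all output files the same size so they can be batched):

```python
import os
from PIL import Image

src_dir = "images/raw"      # placeholder: original rectangular images
dst_dir = "images/cropped"  # placeholder: output folder for square crops
os.makedirs(dst_dir, exist_ok=True)

for name in os.listdir(src_dir):
    img = Image.open(os.path.join(src_dir, name)).convert("RGB")
    w, h = img.size
    side = min(w, h)  # largest square that fits inside the image
    left, top = (w - side) // 2, (h - side) // 2
    # Central square crop, then resize to the Inception input size.
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((299, 299), Image.BICUBIC)
    img.save(os.path.join(dst_dir, name))
```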

activate-an commented 3 years ago

@mseitzer Do you mean that I can directly input my rectangular images into the network for testing, just by changing the model to disable automatic resizing? The disadvantage is that I cannot compare the FID scores obtained this way with FID scores computed at 299x299, and I can only compare models on my own dataset?

Thanks a lot for your answers.

mseitzer commented 3 years ago

Yes, this is what I was trying to say. And yes, you most likely cannot compare the values to FID scores computed with 299x299. FID is known to be sensitive to variations in how the input is represented, so it would give you different results there.

ygean commented 2 weeks ago

This design is inconvenient when you set resize_input to False in InceptionV3 and the image sizes in your dataset are not consistent: every developer who uses this package then has to write a script to resize the images in the folder to a fixed size first, which is not an out-of-the-box design.

Later, I will submit a PR with the following idea:

In most cases users' images will have different sizes, but to make the dataloader usable for batch loading, the default image transform should be a center crop, as in the sketch below.

This way, nobody is forced to use batch size = 1 and slow down the validation.
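
A sketch of that idea with torchvision transforms (illustrative only, not the package's current API):

```python
from torchvision import transforms

# Hypothetical default transform for the proposed PR: bring every image to a
# fixed square size so the DataLoader can stack batches larger than 1.
default_transform = transforms.Compose([
    transforms.Resize(299),      # shorter side -> 299, aspect ratio preserved
    transforms.CenterCrop(299),  # central square, so every tensor is 3x299x299
    transforms.ToTensor(),
])
```

With a transform like this, images of arbitrary sizes collate into a single batch, and --batch-size 1 is no longer necessary.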