Closed wiseaidev closed 3 years ago
I took a look at your Colab notebook and your results.
However, the function you are using for resizing, `torchvision.transforms.functional.resize`, is just a wrapper around the PIL library and cannot be used with PyTorch tensors.
This function is not the PyTorch resizing operation that we study and that is commonly used (as in legacy-pytorch-fid).
If you replace your `F` with `torch.nn.functional`, the above method will not work.
Hey @GaParmar,
I've just taken a look at the implementation of bicubic interpolation in both OpenCV and PIL. OpenCV uses the bicubic convolution algorithm, which depends on a constant `a` that is usually set to either -0.5 or -0.75; in OpenCV, it is set to -0.75. For PIL, however, I couldn't work out which algorithm is being used. All I found is a constant called `BICUBIC`, which is set to 3. So I'm assuming that either it uses a different interpolation algorithm or, somewhere in the code, the constant `a` is set to -0.5.
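To make the role of the constant `a` concrete, here is a minimal sketch of the textbook cubic convolution kernel, written in plain Python. It is not taken from the OpenCV or PIL sources; the function names `cubic_kernel` and `tap_weights` are made up for illustration, and the only thing it shows is how the four interpolation weights change between `a = -0.5` and `a = -0.75`.

```python
def cubic_kernel(x, a):
    """Textbook cubic convolution kernel with free parameter a."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x**3 - (a + 3.0) * x**2 + 1.0
    if x < 2.0:
        return a * x**3 - 5.0 * a * x**2 + 8.0 * a * x - 4.0 * a
    return 0.0


def tap_weights(t, a):
    """Weights of the four neighbouring samples for a point at
    fractional offset t (0 <= t < 1) between the two middle samples."""
    distances = (1.0 + t, t, 1.0 - t, 2.0 - t)
    return [cubic_kernel(d, a) for d in distances]


# Halfway between two pixels, compare the two common choices of a.
w_050 = tap_weights(0.5, a=-0.5)
w_075 = tap_weights(0.5, a=-0.75)
print("a = -0.5 :", w_050)
print("a = -0.75:", w_075)
```

In both cases the weights sum to 1, but with `a = -0.75` the two outer weights are more negative and the two inner ones larger, which gives a slightly sharper result with more overshoot. That alone is enough for two libraries that both say "bicubic" to return different pixel values.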
Regardless of that, we can apply a preprocessing step (like blurring the image) before resizing the image, as follows:

    import cv2
    import numpy as np

    def resize_opencv(*, img, output_size):
        # Work in float64 so blurring and resizing do not round intermediate values.
        img = np.asarray(img, dtype=np.float64)
        # Preprocessing step: box-blur with an 8x8 kernel before resizing.
        img = cv2.blur(img, ksize=(8, 8))
        img = cv2.resize(img, dsize=(output_size,) * 2, interpolation=cv2.INTER_CUBIC)
        inspect_img(img=img)  # helper defined elsewhere, not shown here
        return img

    image_opencv = resize_opencv(img=image, output_size=output_size)

This produces a result similar to PIL's.
I think this is just a rule of thumb and not a standard way of resizing, and that's what I wanted to add. I hope you find this useful in one way or another. Peace out!
Thanks for these observations. The resizing ratio is an important factor when choosing which blur kernel to apply. See Fig. 8 in our paper for a comparison.
I will close this issue. Feel free to re-open it if you have any additional questions.
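As a rough illustration of why the blur should depend on the resizing ratio, here is a small 1-D sketch in plain Python. It is not the method from the paper, nor what any of the libraries do internally: the helper names are hypothetical, and using a box filter whose width equals the downsampling factor is only an assumed rule of thumb.

```python
def box_blur_1d(signal, width):
    """Average each sample with the next width - 1 samples (edges clamped)."""
    n = len(signal)
    out = []
    for i in range(n):
        window = [signal[min(i + k, n - 1)] for k in range(width)]
        out.append(sum(window) / width)
    return out


def downsample(signal, factor, prefilter):
    """Keep every factor-th sample, optionally blurring first.

    Assumption for this sketch: the blur width is set equal to the
    downsampling factor, so a larger ratio means a wider kernel.
    """
    if prefilter:
        signal = box_blur_1d(signal, width=factor)
    return signal[::factor]


# Fine black/white stripes: 16 samples alternating 0 and 1 (mean 0.5).
stripes = [0.0, 1.0] * 8

naive = downsample(stripes, 4, prefilter=False)
filtered = downsample(stripes, 4, prefilter=True)
print("no blur  :", naive)
print("with blur:", filtered)
```

Without the blur, the 4x downsampled stripes come out entirely black, because only the zero-valued samples are picked. With a blur matched to the ratio, every output sample is the mid-gray 0.5, which is the true average of the pattern. A fixed kernel such as 8x8 would be too wide for a small ratio and too narrow for a large one.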
Recently, I came across a post on LinkedIn that describes how carefully we should choose the right `resize` function, stressing that different libraries and frameworks give different results. So, I decided to test it myself. Click here to find the post that I took the inspiration from. The following is the code snippet that I've edited (using this colab notebook) to show the correct way of using the resize methods in different frameworks.
which gives us the following results:
Therefore, TensorFlow, PyTorch, and PIL give similar results if the `resize` method is used properly, as in the above code snippet. You can read my comments on LinkedIn to find out how I came to this solution.
The only remaining library is OpenCV, which I'll test in the future.
Have a great day/night!