Open 8uurg opened 10 months ago
Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].
Could you please share the PIL version, as well as the output of `ldd _imaging.so`
(from https://stackoverflow.com/a/24397115)
Could you also please share the output of
import torch
import torchvision
print(f"{torch.ops.image._jpeg_version() = }")
print(f"{torch.ops.image._is_compiled_against_turbo() = }")
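
For readers unsure where `_imaging.so` lives, here is a minimal sketch that locates Pillow's compiled `_imaging` module and filters the `ldd` output down to the JPEG libraries. The helper names `jpeg_libs_from_ldd` and `ldd_pillow_imaging` are made up for illustration, and the sketch assumes a Linux system where `ldd` is on the `PATH`.

```python
import re
import subprocess


def jpeg_libs_from_ldd(ldd_output):
    """Map each libjpeg/libturbojpeg entry in `ldd` output to its resolved path."""
    libs = {}
    for line in ldd_output.splitlines():
        # Typical line: "libjpeg.so.9 => /some/path/libjpeg.so.9 (0x00007f...)"
        match = re.match(r"\s*(lib(?:turbo)?jpeg\S*)\s+=>\s+(\S+)", line)
        if match:
            libs[match.group(1)] = match.group(2)
    return libs


def ldd_pillow_imaging():
    """Run `ldd` on Pillow's compiled _imaging module (hypothetical helper, Linux only)."""
    # Imported lazily so the parser above can be used without Pillow installed.
    from PIL import _imaging

    result = subprocess.run(
        ["ldd", _imaging.__file__], capture_output=True, text=True, check=True
    )
    return jpeg_libs_from_ldd(result.stdout)
```

Calling `ldd_pillow_imaging()` in the environment under investigation should return a small dictionary such as one keyed by `libjpeg.so.9`, which is usually enough to tell which JPEG library Pillow is linked against.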
> Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].
> Could you please share the PIL version, as well as the output of `ldd _imaging.so`
> (from https://stackoverflow.com/a/24397115)

PIL.__version__ = '9.3.0'
ldd
>
> Could you also please share the output of
>
> ```
> import torch
> import torchvision
>
> print(f"{torch.ops.image._jpeg_version() = }")
> print(f"{torch.ops.image._is_compiled_against_turbo() = }")
> ```
torch.ops.image._jpeg_version() = 80
torch.ops.image._is_compiled_against_turbo() = True
Thanks for the output
libjpeg.so.9 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libjpeg.so.9 (0x00007f1b34cad000)
I think that's it: PIL is relying on `libjpeg` while torchvision is relying on `libjpeg-turbo`. They're both JPEG-compliant and, from past experiments, models aren't sensitive to these decoding differences. I think that if you installed PIL 10, you'd get turbo for PIL as well, and have results that are closer to the torchvision ones.
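
To put a number on these decoding differences, a rough sketch along the following lines may help. It decodes the same file with PIL and with `torchvision.io.read_image` and reports the maximum and mean absolute pixel difference. The function names are hypothetical, and this metric is an assumption: it is not necessarily the error value computed in the original report.

```python
import numpy as np


def pixel_diff_stats(a, b):
    """Return (max, mean) absolute difference between two same-shaped uint8 images."""
    # Widen to int16 so the subtraction cannot wrap around at 0 or 255.
    a = np.asarray(a, dtype=np.int16)
    b = np.asarray(b, dtype=np.int16)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    diff = np.abs(a - b)
    return int(diff.max()), float(diff.mean())


def compare_decoders(path):
    """Decode `path` with PIL and torchvision and compare the results (hypothetical helper)."""
    # Imported lazily so pixel_diff_stats stays usable without these packages.
    from PIL import Image
    from torchvision.io import ImageReadMode, read_image

    pil_img = np.array(Image.open(path).convert("RGB"))  # HWC, uint8
    # read_image returns CHW, so move channels last to match PIL's layout.
    tv_img = read_image(path, mode=ImageReadMode.RGB).permute(1, 2, 0).numpy()
    return pixel_diff_stats(pil_img, tv_img)
```

If both libraries are linked against the same JPEG implementation, `compare_decoders` would be expected to return differences at or near zero, though that depends on the builds involved.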
Thanks for the information!
I was investigating this because there was a small change in the validation performance of a model after changing how the images were loaded. The difference was not too big (a single sample changed, I think), but I was expecting the result to be identical.

For those following in my footsteps: when installed through conda in my environment, Pillow 10 doesn't seem to link libjpeg-turbo. When I install Pillow via pip, I can confirm that it is indeed a difference between `libjpeg` and `libjpeg-turbo`.
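
A quick way to check which JPEG library a given Pillow install was built with, without running `ldd`, is Pillow's `features` module. The sketch below assumes a Pillow version recent enough to know the `libjpeg_turbo` feature name (the 9.x and 10.x releases discussed here should). The wrapper function name is hypothetical.

```python
def pillow_jpeg_backend():
    """Return 'libjpeg-turbo', 'libjpeg', or None if Pillow or its JPEG codec is missing."""
    try:
        from PIL import features
    except ImportError:
        return None
    # "jpg" is the codec name Pillow uses for JPEG support.
    if not features.check("jpg"):
        return None
    # check_feature reports whether the build was compiled against libjpeg-turbo.
    if features.check_feature("libjpeg_turbo"):
        return "libjpeg-turbo"
    return "libjpeg"


print(f"{pillow_jpeg_backend() = }")
```

Running this in both the conda and the pip environment should show whether the two Pillow builds differ in their JPEG backend. `python -m PIL` prints a fuller report of the same build information.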
🐛 Describe the bug
Some images from the imagenetv2 dataset (downloadable here) contain nonzero differences when loaded using `torchvision.io.read_image` rather than PIL, with some images containing large differences in pixel values.

When loading the file used in the example, `imagenetv2-matched-frequency-format-val/455/aaaf43c110a10aabce09700a6a3cfb2622b4847a.jpeg`, the printed error value is 6.3985.

Versions