pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Discrepancy in output of torchvision.io.read_image vs PIL.Image #8088

Open 8uurg opened 10 months ago

8uurg commented 10 months ago

🐛 Describe the bug

Some images from the imagenetv2 dataset (downloadable here) decode to different pixel values when loaded with torchvision.io.read_image than when loaded with PIL.Image. For some images the differences in pixel values are large.

import torch
import torchvision.io
import numpy as np
from PIL import Image

def loadimage_pil(path):
    return torch.tensor(np.array(Image.open(path).convert("RGB"))).permute(2, 0, 1)

def loadimage_torchio(path):
    return torchvision.io.read_image(path, torchvision.io.ImageReadMode.RGB)

# assuming archive is unpacked in the same folder as script - change accordingly.
filepath = "./imagenetv2-matched-frequency-format-val/455/aaaf43c110a10aabce09700a6a3cfb2622b4847a.jpeg"
print(f"loading '{filepath}'")
img_pil = loadimage_pil(filepath)
img_tio = loadimage_torchio(filepath)
difference = img_pil.to(float) - img_tio.to(float)

error = torch.sqrt(torch.mean(torch.square(difference)))
print(error)
# > tensor(6.3985, dtype=torch.float64)

For the file used in the example, imagenetv2-matched-frequency-format-val/455/aaaf43c110a10aabce09700a6a3cfb2622b4847a.jpeg, the printed error (the root-mean-square difference between the two decoded images) is 6.3985.
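A single RMSE value hides whether the discrepancy is a few large outliers or many small rounding differences. The sketch below is one possible way to break it down; `summarize_difference` is a hypothetical helper (not part of torchvision or PIL), and the arrays are tiny synthetic values rather than real image data.

```python
import numpy as np


def summarize_difference(a, b):
    """Summarize per-pixel differences between two uint8 images of equal shape.

    Hypothetical helper for illustration only.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Cast to a signed type first: subtracting uint8 arrays would wrap around.
    diff = a.astype(np.int16) - b.astype(np.int16)
    abs_diff = np.abs(diff)
    return {
        "rmse": float(np.sqrt(np.mean(diff.astype(np.float64) ** 2))),
        "max_abs": int(abs_diff.max()),
        "fraction_changed": float(np.mean(abs_diff > 0)),
    }


# Tiny synthetic example (not real image data).
a = np.array([[0, 10], [200, 255]], dtype=np.uint8)
b = np.array([[0, 12], [200, 250]], dtype=np.uint8)
stats = summarize_difference(a, b)
print(stats)
```

With the tensors from the snippet above, this could be called as `summarize_difference(img_pil.numpy(), img_tio.numpy())`.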

Versions

Collecting environment information...
PyTorch version: 2.1.0
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Fedora Linux 36 (Thirty Six) (x86_64)
GCC version: (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2)
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.8 | packaged by conda-forge | (main, Nov 22 2022, 08:26:04) [GCC 10.4.0] (64-bit runtime)
Python platform: Linux-6.0.8-200.fc36.x86_64-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 11.7.99
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: 
<snip>

Nvidia driver version: 520.56.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy==1.2.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.4
[pip3] torch==2.1.0
[pip3] torch-tb-profiler==0.4.3
[pip3] torchaudio==2.1.0
[pip3] torchinfo==1.8.0
[pip3] torchvision==0.16.0
[pip3] triton==2.1.0
[conda] blas                      1.0                         mkl    conda-forge
[conda] libblas                   3.9.0            16_linux64_mkl    conda-forge
[conda] libcblas                  3.9.0            16_linux64_mkl    conda-forge
[conda] libjpeg-turbo             2.0.0                h9bf148f_0    pytorch
[conda] liblapack                 3.9.0            16_linux64_mkl    conda-forge
[conda] liblapacke                3.9.0            16_linux64_mkl    conda-forge
[conda] mkl                       2022.1.0           h84fe81f_915    conda-forge
[conda] numpy                     1.24.4                   pypi_0    pypi
[conda] pytorch                   2.1.0           py3.10_cuda11.8_cudnn8.7.0_0    pytorch
[conda] pytorch-cuda              11.8                 h7e8668a_5    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch-tb-profiler         0.4.3                    pypi_0    pypi
[conda] torchaudio                2.1.0               py310_cu118    pytorch
[conda] torchinfo                 1.8.0              pyhd8ed1ab_0    conda-forge
[conda] torchtriton               2.1.0                     py310    pytorch
[conda] torchvision               0.16.0              py310_cu118    pytorch

NicolasHug commented 10 months ago

Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].

Could you please share the PIL version, as well as the output of ldd _imaging.so (from https://stackoverflow.com/a/24397115)?

Could you also please share the output of

import torch
import torchvision

print(f"{torch.ops.image._jpeg_version() = }")
print(f"{torch.ops.image._is_compiled_against_turbo() = }")

8uurg commented 10 months ago

> Hi @8uurg, I guess these come from diverging versions of libjpeg[-turbo].
>
> Could you please share the PIL version, as well as the output of ldd _imaging.so (from https://stackoverflow.com/a/24397115)

PIL.__version__ = '9.3.0'

ldd /lib/python3.10/site-packages/PIL/_imaging.cpython-310-x86_64-linux-gnu.so
    linux-vdso.so.1 (0x00007fff6d3eb000)
    libjpeg.so.9 => /lib/python3.10/site-packages/PIL/../../../libjpeg.so.9 (0x00007f1b34cad000)
    libz.so.1 => /lib/python3.10/site-packages/PIL/../../../libz.so.1 (0x00007f1b34c93000)
    libtiff.so.5 => /lib/python3.10/site-packages/PIL/../../../libtiff.so.5 (0x00007f1b34c06000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f1b34800000)
    libwebp.so.7 => /lib/python3.10/site-packages/PIL/../../.././libwebp.so.7 (0x00007f1b34b71000)
    libzstd.so.1 => /lib/python3.10/site-packages/PIL/../../.././libzstd.so.1 (0x00007f1b34a9f000)
    liblzma.so.5 => /lib/python3.10/site-packages/PIL/../../.././liblzma.so.5 (0x00007f1b34a76000)
    libLerc.so => /lib/python3.10/site-packages/PIL/../../.././libLerc.so (0x00007f1b34764000)
    libdeflate.so.0 => /lib/python3.10/site-packages/PIL/../../.././libdeflate.so.0 (0x00007f1b34a66000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f1b34686000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1b34d6d000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f1b34a5f000)
    libstdc++.so.6 => /lib/python3.10/site-packages/PIL/../../../././libstdc++.so.6 (0x00007f1b344d2000)
    libgcc_s.so.1 => /lib/python3.10/site-packages/PIL/../../../././libgcc_s.so.1 (0x00007f1b34a46000)

> 
> Could you also please share the output of
> 
> ```
> import torch
> import torchvision
> 
> print(f"{torch.ops.image._jpeg_version() = }")
> print(f"{torch.ops.image._is_compiled_against_turbo() = }")
> ```

torch.ops.image._jpeg_version() = 80
torch.ops.image._is_compiled_against_turbo() = True

NicolasHug commented 10 months ago

Thanks for the output

    libjpeg.so.9 => <path-to-virtualenv>/lib/python3.10/site-packages/PIL/../../../libjpeg.so.9 (0x00007f1b34cad000)

I think that's it: PIL is relying on libjpeg while torchvision is relying on libjpeg-turbo. They're both JPEG-compliant and, from past experiments, models aren't sensitive to these decoding differences. I think that if you installed Pillow 10, you'd get turbo for PIL as well, and have results that are closer to the torchvision ones.
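For anyone wanting to check this without running ldd, here is a rough sketch that collects what each library reports about its JPEG decoder. It assumes Pillow's `PIL.features` helpers (`version_codec` and `check_feature`, whose availability depends on the Pillow version) and reuses the private `torch.ops.image` calls quoted earlier in this thread, which may change between torchvision releases. `describe_jpeg_backends` is a hypothetical name, and any library that is missing or does not expose the call is reported as `None`.

```python
def describe_jpeg_backends():
    """Best-effort report of the JPEG decoder each library was built against.

    Hypothetical helper for illustration; entries stay None when a library
    is not installed or does not expose the queried information.
    """
    info = {"pillow": None, "torchvision": None}

    try:
        import PIL
        from PIL import features

        info["pillow"] = {
            "version": PIL.__version__,
            # Version string of the JPEG library Pillow was compiled against.
            "jpeg_library_version": features.version_codec("jpg"),
            # Truthy when that library is libjpeg-turbo.
            "libjpeg_turbo": features.check_feature("libjpeg_turbo"),
        }
    except (ImportError, AttributeError, ValueError):
        pass

    try:
        import torch
        import torchvision

        info["torchvision"] = {
            "version": torchvision.__version__,
            # Private ops taken from the comment above; not a stable API.
            "jpeg_version": torch.ops.image._jpeg_version(),
            "compiled_against_turbo": torch.ops.image._is_compiled_against_turbo(),
        }
    except (ImportError, AttributeError, RuntimeError, OSError):
        pass

    return info


backends = describe_jpeg_backends()
print(backends)
```

If the two entries disagree about libjpeg-turbo, small pixel-level differences between the two loaders are to be expected.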

8uurg commented 10 months ago

Thanks for the information!

I was investigating this because there was a small change in the validation performance of a model after changing how the images were loaded. The difference was not too big (a single sample changed, I think), but I was expecting the result to be identical. For those following in my footsteps: when installed through conda in my environment, Pillow 10 doesn't seem to link libjpeg-turbo.

When I install Pillow via pip, I can confirm that the discrepancy is indeed due to the difference between libjpeg and libjpeg-turbo.
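To see which validation samples are affected by a change of loader, one option is to compare the predicted labels from the two pipelines directly. This is only a sketch: `changed_predictions` is a hypothetical helper, and the label values below are made up for illustration rather than taken from my runs.

```python
import numpy as np


def changed_predictions(preds_a, preds_b):
    """Return the indices at which two 1-D prediction arrays disagree.

    Hypothetical helper for illustration only.
    """
    preds_a = np.asarray(preds_a)
    preds_b = np.asarray(preds_b)
    if preds_a.shape != preds_b.shape:
        raise ValueError(f"shape mismatch: {preds_a.shape} vs {preds_b.shape}")
    return np.flatnonzero(preds_a != preds_b)


# Made-up predicted class ids from the two loading pipelines.
preds_pil = [455, 12, 7, 301]
preds_tio = [455, 12, 9, 301]
changed = changed_predictions(preds_pil, preds_tio)
print(changed.tolist())
```

The returned indices can then be mapped back to file paths to inspect the images whose decoded pixels differ enough to flip a prediction.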