pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

`image` extension can't be loaded in fresh conda env with torchvision 0.13.1 #6453

Open fmassa opened 2 years ago

fmassa commented 2 years ago

🐛 Describe the bug

Hi,

Quick FYI that I'm getting this warning after installing torch and torchvision in a fresh conda env:

/private/home/fmassa/.conda/envs/fairvit2/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")
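The warning means the dynamic loader could not find libtorch_cuda_cu.so when torchvision tried to load its image extension. A minimal sketch for checking which libtorch* files the installed torch package actually ships, without importing torch (the import itself may fail in a broken env). The helper names `list_shared_libs` and `torch_lib_dir` are hypothetical, and the code assumes torch keeps its shared libraries under `torch/lib`, as Linux builds normally do:

```python
import importlib.util
from pathlib import Path


def list_shared_libs(lib_dir, prefix="libtorch"):
    """Return the sorted names of shared libraries in lib_dir that start with prefix."""
    lib_dir = Path(lib_dir)
    if not lib_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in lib_dir.iterdir()
        if p.name.startswith(prefix) and ".so" in p.name
    )


def torch_lib_dir():
    """Locate <site-packages>/torch/lib without importing torch (assumed layout)."""
    spec = importlib.util.find_spec("torch")
    if spec is None or not spec.submodule_search_locations:
        return None  # torch is not installed in this environment
    return Path(list(spec.submodule_search_locations)[0]) / "lib"


if __name__ == "__main__":
    lib_dir = torch_lib_dir()
    if lib_dir is None:
        print("torch not found in this environment")
    else:
        libs = list_shared_libs(lib_dir)
        print(f"{lib_dir}:")
        for name in libs:
            print(f"  {name}")
        # The file named in the warning above
        print("libtorch_cuda_cu.so present:", "libtorch_cuda_cu.so" in libs)
```

If libtorch_cuda_cu.so is missing from that listing, the installed torch build does not provide the library that the torchvision binary was linked against.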

This can be reproduced by doing

conda env create -f conda.yaml

with the following conda.yaml

name: bug_repro
channels:
  - defaults
  - pytorch
  - conda-forge
dependencies:
  - python=3.9
  - pytorch=1.12.1
  - torchvision=0.13.1
  - cudatoolkit=11.3
  - nvidia-apex=0.1
  - timm
  - pip

Versions

Python 3.9, PyTorch 1.12.1, TorchVision 0.13.1, CUDA 11.3

NicolasHug commented 2 years ago

Thanks @fmassa

I can reproduce the error. I also tried the same conda.yaml file, but using

  - pytorch=1.12.0
  - torchvision=0.13.0

and I'm getting a different error:

(bug_repro2) ➜  ~ python -c "import torchvision; from torchvision.io import read_image"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/fsx/users/nicolashug/conda/envs/bug_repro2/lib/python3.9/site-packages/torchvision/__init__.py", line 4, in <module>
    import torch
  File "/fsx/users/nicolashug/conda/envs/bug_repro2/lib/python3.9/site-packages/torch/__init__.py", line 202, in <module>
    from torch._C import *  # noqa: F403
ImportError: libcupti.so.11.2: cannot open shared object file: No such file or directory
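Both failures in this thread have the same shape: the loader names a shared library it cannot open. A small sketch for pulling that library name out of the message, which can help when comparing environments. The function name `missing_library` is hypothetical, and the pattern only assumes the "cannot open shared object file" wording seen in the messages above:

```python
import re

# Matches e.g. "libcupti.so.11.2: cannot open shared object file".
# Whitespace is matched loosely so a message wrapped across lines still parses.
_MISSING_LIB = re.compile(
    r"(?P<lib>\S+\.so(?:\.\d+)*):\s+cannot\s+open\s+shared\s+object\s+file"
)


def missing_library(message):
    """Return the shared library named in a loader error message, or None."""
    match = _MISSING_LIB.search(message)
    return match.group("lib") if match else None


if __name__ == "__main__":
    try:
        import torch  # noqa: F401
    except ImportError as exc:
        # On a broken env this prints the library the loader could not find.
        print("missing:", missing_library(str(exc)))
```

The returned name can then be searched for in the environment's lib directories to see whether the file is absent or merely not on the loader's search path.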

However, the problem seems to be related to interactions with nvidia-apex. Using the following file (with nvidia-apex=0.1 removed), I'm not observing any error:

name: bug_repro_without_apex
channels:
  - defaults
  - pytorch
  - conda-forge
dependencies:
  - python=3.9
  - pytorch=1.12.1
  - torchvision=0.13.1
  - cudatoolkit=11.3
  - timm
  - pip

NicolasHug commented 2 years ago

FYI @atalman, it looks like installing torchvision together with apex leads to problems discovering libtorch_cuda_cu. Not sure if this is a problem on our side or with apex, though?
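
Since the environment file mixes the defaults, pytorch, and conda-forge channels, one way to look into this is to check which channel and version each relevant package was actually resolved from. A rough sketch, not a confirmed diagnosis: the function name `packages_by_channel` is hypothetical, and it assumes the output of `conda list --json` is a JSON list of objects carrying `name`, `version`, and `channel` keys:

```python
import json
from collections import defaultdict


def packages_by_channel(conda_list_json, names=None):
    """Group 'name=version' strings by the channel each package came from.

    conda_list_json: text output of `conda list --json` (assumed format).
    names: optional collection of package names to keep; None keeps all.
    """
    by_channel = defaultdict(list)
    for pkg in json.loads(conda_list_json):
        if names is not None and pkg.get("name") not in names:
            continue
        channel = pkg.get("channel", "unknown")
        by_channel[channel].append(f"{pkg.get('name')}={pkg.get('version')}")
    return {channel: sorted(pkgs) for channel, pkgs in by_channel.items()}


if __name__ == "__main__":
    import sys

    # Usage: conda list -n bug_repro --json | python this_script.py
    wanted = {"pytorch", "torchvision", "nvidia-apex", "cudatoolkit"}
    for channel, pkgs in packages_by_channel(sys.stdin.read(), wanted).items():
        print(channel, pkgs)
```

Running it against both the env with apex and the env without it would show whether adding nvidia-apex changes which pytorch build the solver picks.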