facebookresearch / pytorch3d

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
https://pytorch3d.org/

Any Support for CUDA version (12.1)? #1518

Open Larescool opened 1 year ago

Larescool commented 1 year ago

When setting up pytorch3d-0.7.3, I ran into this:

The detected CUDA version (12.1) mismatches the version that was used to compile PyTorch (11.8). Please make sure to use the same CUDA versions.

Is there a solution for the newest CUDA version (12.1)?

bottler commented 1 year ago

Can you explain more about your setup? There is no release of pytorch with cuda 12.1. Did you build pytorch from source? If so, then building pytorch3d from source in the same environment should work.

YodaEmbedding commented 1 year ago

On Arch Linux, the latest packages are:

...Although, I personally use a non-system install of torch inside a virtual environment, which seems to work with my OS without any further changes. pip install torch seems to download the wheel torch-2.0.0-cp311-cp311-manylinux1_x86_64.whl, and everything seems to work without rebuilding specifically for CUDA 12.1. No error when running PyTorch code. Perhaps this is because torch installs the dependency nvidia-cuda-runtime-cu11. This suggests that the system CUDA runtime isn't being used here.
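To see the two versions side by side, a minimal sketch along these lines may help. It assumes the compiler of interest is the `nvcc` found on `PATH` (the actual build may instead pick the one under `CUDA_HOME`), and the helper names are made up for illustration:

```python
import re
import shutil
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Return 'major.minor' from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def system_cuda_version():
    """Ask the nvcc found on PATH for its version; None if nvcc is missing."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def torch_cuda_version():
    """CUDA version the installed PyTorch was built with; None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    return torch.version.cuda


if __name__ == "__main__":
    print("nvcc on PATH    :", system_cuda_version())
    print("torch built with:", torch_cuda_version())
```

If the two lines disagree, extensions compiled with the system toolkit are likely to hit the mismatch check even though plain PyTorch code runs fine.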

YodaEmbedding commented 1 year ago

Installing PyTorch 3D via git still leads to this error:

$ pip install "git+https://github.com/facebookresearch/pytorch3d.git"

Building wheels for collected packages: pytorch3d
  Building wheel for pytorch3d (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [290 lines of output]
      /tmp/pip-req-build-ow6c0gf_/setup.py:84: UserWarning: The environment variable `CUB_HOME` was not found. NVIDIA CUB is required for compilation and can be downloaded from `https://github.com/NVIDIA/cub/releases`. You can unpack it to a location of your choice and set the environment variable `CUB_HOME` to the folder containing the `CMakeListst.txt` file.
        warnings.warn(
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build/lib.linux-x86_64-cpython-311
      creating build/lib.linux-x86_64-cpython-311/pytorch3d
      [...]
      copying pytorch3d/datasets/shapenet/shapenet_synset_dict_v1.json -> build/lib.linux-x86_64-cpython-311/pytorch3d/datasets/shapenet
      copying pytorch3d/datasets/r2n2/r2n2_synset_dict.json -> build/lib.linux-x86_64-cpython-311/pytorch3d/datasets/r2n2
      running build_ext
      Traceback (most recent call last):
        [...]
        File "/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 499, in build_extensions
          _check_cuda_version(compiler_name, compiler_version)
        File "/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py", line 386, in _check_cuda_version
          raise RuntimeError(CUDA_MISMATCH_MESSAGE.format(cuda_str_version, torch.version.cuda))
      RuntimeError:
      The detected CUDA version (12.1) mismatches the version that was used to compile
      PyTorch (11.7). Please make sure to use the same CUDA versions.

From what I can tell, this is because of PyTorch being happy with the non-system CUDA 11.7 runtime but unhappy with CUDA 12.1 being used for compilation. Thus, it's not really PyTorch 3D's fault. Possible workarounds:

YodaEmbedding commented 1 year ago

Using the system site-packages version of PyTorch 2.0.0 with CUDA 12.1, I get other errors when building PyTorch 3D from git:

$ pip install "git+https://github.com/facebookresearch/pytorch3d.git"
      [...]
      running build_ext
      /home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/utils/cpp_extension.py:398: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      building 'pytorch3d._C' extension
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc
      creating /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query
      [...]
      Emitting ninja build file /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/67] /opt/cuda/bin/nvcc  -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -I/tmp/pip-req-build-06jphomw/pytorch3d/csrc -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/TH -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/include -I/usr/include/python3.11 -c -c /tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.cu -o /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61
      FAILED: /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o
      /opt/cuda/bin/nvcc  -DWITH_CUDA -DTHRUST_IGNORE_CUB_VERSION_CHECK -I/tmp/pip-req-build-06jphomw/pytorch3d/csrc -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/TH -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/lib/python3.11/site-packages/torch/include/THC -I/opt/cuda/include -I/home/mulhaq/.cache/pypoetry/virtualenvs/compressai-trainer-KZXCvdxM-py3.11/include -I/usr/include/python3.11 -c -c /tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.cu -o /tmp/pip-req-build-06jphomw/build/temp.linux-x86_64-cpython-311/tmp/pip-req-build-06jphomw/pytorch3d/csrc/ball_query/ball_query.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61
      /usr/include/stdlib.h(141): error: identifier "_Float32" is undefined
        extern _Float32 strtof32 (const char *__restrict __nptr,
               ^

      /usr/include/stdlib.h(147): error: identifier "_Float64" is undefined
        extern _Float64 strtof64 (const char *__restrict __nptr,
               ^

      /usr/include/stdlib.h(153): error: identifier "_Float128" is undefined
        extern _Float128 strtof128 (const char *__restrict __nptr,
               ^

That's a bunch of missing types, which suggests the warning may be relevant.

UserWarning: There are no g++ version bounds defined for CUDA version 12.1
$ g++ --version | head -n 1
g++ (GCC) 13.1.1 20230429

But the max g++ version for CUDA 12.0 is g++ 12.1. (Not sure about CUDA 12.1.) So presumably, a downgrade of g++ may help...

Luckily, I have an older Python 3.10 virtual environment with PyTorch 3D installed, so I might just use that instead of going further down the rabbit hole...
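The g++ comparison above can be scripted roughly like this. It is only a sketch: the limit of major version 12 is taken from the comment above rather than from an official table, and both function names are hypothetical.

```python
import re


def parse_gxx_version(first_line: str):
    """Extract (major, minor, patch) from the first line of `g++ --version`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        raise ValueError(f"no version number found in: {first_line!r}")
    return tuple(int(part) for part in match.groups())


def exceeds_limit(version, max_major: int) -> bool:
    """True when the compiler's major version is newer than the allowed one."""
    return version[0] > max_major


if __name__ == "__main__":
    # Line copied from the output quoted in this comment.
    found = parse_gxx_version("g++ (GCC) 13.1.1 20230429")
    # ASSUMPTION: 12 is the newest supported major version for this CUDA.
    print(found, "too new" if exceeds_limit(found, 12) else "ok")
```

Only the major version is compared here, which is coarser than the real bounds but enough to flag a g++ 13 install.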

Larescool commented 1 year ago

I switched to a different version of CUDA to work around this problem.

3tty0n commented 1 year ago

I hit the same error when building PyTorch myself, without conda, on Arch Linux.

a downgrade of g++ may help...

As suggested by YodaEmbedding, I also think setting older versions of CC and CXX may fix this problem:

$ export CC=/usr/bin/gcc-12
$ export CXX=/usr/bin/g++-12
$ python setup.py build

I succeeded in building it myself on Arch Linux with the above hack.
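For anyone scripting this, the same idea might look as follows in Python. This is a sketch, not part of any pytorch3d tooling: `compiler_env` is a made-up helper, and it assumes the older compilers are installed as `gcc-<major>` and `g++-<major>` on `PATH`.

```python
import os
import shutil


def compiler_env(major: int, which=shutil.which, base_env=None):
    """Return a copy of the environment with CC/CXX set to gcc-<major>/g++-<major>.

    Raises FileNotFoundError if either versioned binary is not on PATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    for var, name in (("CC", f"gcc-{major}"), ("CXX", f"g++-{major}")):
        path = which(name)
        if path is None:
            raise FileNotFoundError(f"{name} not found on PATH")
        env[var] = path
    return env
```

The result can be passed as `env=` to `subprocess.run([sys.executable, "setup.py", "build"], check=True, env=compiler_env(12))`, which keeps the override local to the build instead of changing the whole shell session.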

YodaEmbedding commented 1 year ago

Here's what I did to get things working on Arch Linux:

# Install PyTorch 1.13.1:
pip install --force-reinstall torch==1.13.1 torchvision==0.14.1

# Install CUDA 11.7:
paru -S cuda-11.7

# Install gcc10:
gpg --recv-keys 6C35B99309B5FA62  # expired keys from <2019 for gcc10
paru -S gcc10  # --chroot (optional, but may fix some issues)

# Download CUB:
(cd /tmp/ &&
  wget https://github.com/NVIDIA/cub/archive/refs/tags/2.1.0.tar.gz -O cub-2.1.0.tar.gz &&
  tar xf cub-2.1.0.tar.gz
)

export CUDA_HOME=/opt/cuda-11.7
export CC=/usr/bin/gcc-10
export CXX=/usr/bin/g++-10
export CUB_HOME=/tmp/cub-2.1.0

pip install "git+https://github.com/facebookresearch/pytorch3d.git"

Note that building (compiling+testing) gcc10 took around 10 hours on my i5 6500.

Also, I wrote the cuda-11.7 PKGBUILD based on the cuda-11.1 PKGBUILD from AUR, so there may (or may not) be issues with it.
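Since the recipe above depends on four environment variables pointing at things that really exist, a quick pre-flight check before the long build may save some time. The sketch below is an assumption about what is worth checking, not something pytorch3d provides, and `check_build_env` is a hypothetical name.

```python
import os


def check_build_env(env=None):
    """Return a list of human-readable problems with the build environment."""
    env = os.environ if env is None else env
    problems = []
    # Toolkit and CUB locations must be existing directories.
    for var in ("CUDA_HOME", "CUB_HOME"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{var} points to a missing directory: {value}")
    # Compiler overrides must be existing files.
    for var in ("CC", "CXX"):
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
        elif not os.path.isfile(value):
            problems.append(f"{var} points to a missing file: {value}")
    return problems


if __name__ == "__main__":
    for problem in check_build_env():
        print("warning:", problem)
```

An empty list only means the paths exist; it says nothing about whether the versions are compatible with each other.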

DKatz96 commented 1 year ago

I had the same issue and found another solution.

When installing from a local clone, before running pip install -e ., go into the setup.py file in the pytorch3d dir and replace c++14 with c++17 in line 52 (extra_compile_args = {"cxx": ["-std=c++17"]}) and line 77 (nvcc_args.append("-std=c++17")).

Running pip will now compile everything using c++17. I tried a few functions and could not find any unwanted behavior, though ymmv.
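The manual edit described here could also be scripted. The sketch below assumes the flag appears literally as `-std=c++14` in setup.py and does not rely on the line numbers, which may differ between revisions; both function names are invented for illustration.

```python
from pathlib import Path


def switch_cxx_standard(setup_text: str, old: str = "c++14", new: str = "c++17"):
    """Return (patched_text, count) with every `-std=<old>` changed to `-std=<new>`."""
    count = setup_text.count(f"-std={old}")
    return setup_text.replace(f"-std={old}", f"-std={new}"), count


def patch_setup_py(path: str = "setup.py") -> int:
    """Rewrite setup.py in place; returns how many flags were changed."""
    file = Path(path)
    patched, count = switch_cxx_standard(file.read_text())
    if count:
        file.write_text(patched)
    return count
```

Running `patch_setup_py()` from the root of the clone before `pip install -e .` should report two changes if the file matches the description above; a count of zero suggests the clone already uses a newer standard.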

danielajisafe commented 11 months ago

Thanks @DKatz96. I confirm that c++17 is now set by default in a local clone, so the installation instructions given by pytorch3d work as written. The following solved the problem in my case.

git clone https://github.com/facebookresearch/pytorch3d.git
cd pytorch3d && pip install -e .

Install from a local clone

relh commented 8 months ago

This worked for me on Ubuntu with CUDA 12.1 for everything:

pip install "git+https://github.com/facebookresearch/pytorch3d.git"

MiroPsota commented 8 months ago

You can try my repository for building packages and PyPI simple index and see if it works for you: https://github.com/facebookresearch/pytorch3d/discussions/1752