SHI-Labs / NATTEN

Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
https://shi-labs.com/natten/

CMake is not using the correct nvcc binary #172

Closed jgvinholi closed 2 weeks ago

jgvinholi commented 2 weeks ago

Hi, I'm trying to compile from source to try the branch that supports torch.compile (https://github.com/alihassanijr/NATTEN-Torch/tree/fix-torch-compile-pt24). However, the build fails because cmake uses the nvcc located at /usr/bin/nvcc, even when I set the CUDA_HOME environment variable correctly to the conda-installed version in miniconda3/envs/env_name/bin/ . Maybe setup.py should pass a flag to cmake that explicitly says where nvcc is located, based on CUDA_HOME or another env variable. Here is the output of pip install git+https://github.com/alihassanijr/NATTEN-Torch.git:

Collecting git+https://github.com/alihassanijr/NATTEN-Torch.git
  Cloning https://github.com/alihassanijr/NATTEN-Torch.git to /tmp/pip-req-build-57zo0nm8
  Running command git clone --filter=blob:none --quiet https://github.com/alihassanijr/NATTEN-Torch.git /tmp/pip-req-build-57zo0nm8
  Resolved https://github.com/alihassanijr/NATTEN-Torch.git to commit 0f0b82ed7e1a48707a738ddbe8ec5fe93b596c0e
  Running command git submodule update --init --recursive -q
  Preparing metadata (setup.py) ... done
Requirement already satisfied: packaging in ./miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_vendor (from natten==0.17.2.dev0) (24.1)
Requirement already satisfied: torch>=2.0.0 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from natten==0.17.2.dev0) (2.6.0.dev20241024)
Requirement already satisfied: filelock in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from torch>=2.0.0->natten==0.17.2.dev0) (3.16.1)
Requirement already satisfied: typing-extensions>=4.8.0 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_vendor (from torch>=2.0.0->natten==0.17.2.dev0) (4.12.2)
Requirement already satisfied: sympy==1.13.1 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from torch>=2.0.0->natten==0.17.2.dev0) (1.13.1)
Requirement already satisfied: networkx in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from torch>=2.0.0->natten==0.17.2.dev0) (3.4.1)
Requirement already satisfied: jinja2 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from torch>=2.0.0->natten==0.17.2.dev0) (3.1.4)
Requirement already satisfied: fsspec in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from torch>=2.0.0->natten==0.17.2.dev0) (2024.9.0)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from sympy==1.13.1->torch>=2.0.0->natten==0.17.2.dev0) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./miniconda3/envs/chameleon/lib/python3.11/site-packages (from jinja2->torch>=2.0.0->natten==0.17.2.dev0) (3.0.1)
Building wheels for collected packages: natten
  Building wheel for natten (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [126 lines of output]
      Building NATTEN with CUDA 124
      Building NATTEN for SM: 8.6
      Number of workers: 9
      running bdist_wheel
      running build
      running build_py
      creating build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/flops.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/functional.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/types.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/nested.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/natten1d.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/na1d.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/ops.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/natten3d.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/natten2d.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/context.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/na2d.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/__init__.py -> build/lib.linux-x86_64-cpython-311/natten
      copying src/natten/na3d.py -> build/lib.linux-x86_64-cpython-311/natten
      creating build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/checks.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/testing.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/misc.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/tensor.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/__init__.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      copying src/natten/utils/log.py -> build/lib.linux-x86_64-cpython-311/natten/utils
      creating build/lib.linux-x86_64-cpython-311/natten/autotuner
      copying src/natten/autotuner/fna_forward.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner
      copying src/natten/autotuner/misc.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner
      copying src/natten/autotuner/__init__.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner
      copying src/natten/autotuner/fna_backward.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner
      creating build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_forward_64x128.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_backward_128x128.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_forward_32x128.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_forward_64x64.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_backward_128x64.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/__init__.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      copying src/natten/autotuner/configs/fna_backward_64x64.py -> build/lib.linux-x86_64-cpython-311/natten/autotuner/configs
      running build_ext
      Current arch list: [86] (max: 86)
      -- The CXX compiler identification is GNU 11.4.0
      -- The CUDA compiler identification is NVIDIA 10.1.243
      -- Detecting CXX compiler ABI info
      -- Detecting CXX compiler ABI info - done
      -- Check for working CXX compiler: /home/vinholi/miniconda3/envs/chameleon/bin/x86_64-conda-linux-gnu-c++ - skipped
      -- Detecting CXX compile features
      -- Detecting CXX compile features - done
      -- Detecting CUDA compiler ABI info
      -- Detecting CUDA compiler ABI info - done
      -- Check for working CUDA compiler: /usr/bin/nvcc - skipped
      -- Detecting CUDA compile features
      -- Detecting CUDA compile features - done
      CMake Warning (dev) at CMakeLists.txt:11 (find_package):
        Policy CMP0146 is not set: The FindCUDA module is removed.  Run "cmake
        --help-policy CMP0146" for policy details.  Use the cmake_policy command to
        set the policy and suppress this warning.

      This warning is for project developers.  Use -Wno-dev to suppress it.

      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
      -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
      -- Looking for pthread_create in pthreads
      -- Looking for pthread_create in pthreads - not found
      -- Looking for pthread_create in pthread
      -- Looking for pthread_create in pthread - found
      -- Found Threads: TRUE
      CMake Error at /home/vinholi/miniconda3/envs/chameleon/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
        Could NOT find CUDA: Found unsuitable version "10.1", but required is at
        least "11.0" (found /usr)
      Call Stack (most recent call first):
        /home/vinholi/miniconda3/envs/chameleon/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:598 (_FPHSA_FAILURE_MESSAGE)
        /home/vinholi/miniconda3/envs/chameleon/share/cmake-3.29/Modules/FindCUDA.cmake:1291 (find_package_handle_standard_args)
        CMakeLists.txt:11 (find_package)

      -- Configuring incomplete, errors occurred!
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/tmp/pip-req-build-57zo0nm8/setup.py", line 243, in <module>
          setup(
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/__init__.py", line 117, in setup
          return distutils.core.setup(**attrs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 183, in setup
          return run_commands(dist)
                 ^^^^^^^^^^^^^^^^^^
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/core.py", line 199, in run_commands
          dist.run_commands()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
          self.run_command(cmd)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/command/bdist_wheel.py", line 398, in run
          self.run_command("build")
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/command/build_ext.py", line 98, in run
          _build_ext.run(self)
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 476, in build_extensions
          self._build_extensions_serial()
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/site-packages/setuptools/_distutils/command/build_ext.py", line 502, in _build_extensions_serial
          self.build_extension(ext)
        File "/tmp/pip-req-build-57zo0nm8/setup.py", line 219, in build_extension
          subprocess.check_call(
        File "/home/vinholi/miniconda3/envs/chameleon/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['cmake', '/tmp/pip-req-build-57zo0nm8/csrc', '-DPYTHON_PATH=/home/vinholi/miniconda3/envs/chameleon/bin/python', '-DOUTPUT_FILE_NAME=natten/libnatten.cpython-311-x86_64-linux-gnu', '-DNATTEN_CUDA_ARCH_LIST=86-real', '-DNATTEN_IS_WINDOWS=0', '-DNATTEN_IS_MAC=0', '-DIS_LIBTORCH_BUILT_WITH_CXX11_ABI=0', '-DNATTEN_WITH_AVX=1', '-DNATTEN_WITH_CUDA=1', '-DNATTEN_WITH_CUTLASS=1']' returned non-zero exit status 1.
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for natten
  Running setup.py clean for natten
Failed to build natten
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (natten)

alihassanijr commented 2 weeks ago

I would strongly advise against using that branch. Torch compile support is not yet stable, and that branch is not yet ready to merge.

As for your issues with building NATTEN from source, I'm afraid the build fails because the only CUDA toolkit (CTK) detected in your environment is 10.1, and NATTEN requires 11.0 or above.

Compilation is not only a matter of where nvcc is; you also need the "correct" version of CTK loaded into your environment variables, so that cmake can pick up the include and lib dirs it needs to build and link.
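To see which toolkit a build is likely to pick up, it can help to compare the nvcc found on PATH with the one under CUDA_HOME. The following is a minimal diagnostic sketch, not part of NATTEN: the helper names are hypothetical, it assumes CUDA_HOME points at the toolkit root (the directory that contains bin/nvcc), and it takes the 11.0 minimum from the CMake error in the log above.

```python
import os
import re
import shutil
import subprocess

# Minimum toolkit version, taken from the CMake error message in the log.
MIN_CUDA = (11, 0)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_release(nvcc_path):
    """Run one nvcc binary and return its release tuple, or None on failure."""
    try:
        result = subprocess.run(
            [nvcc_path, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    candidates = {"PATH": shutil.which("nvcc")}
    cuda_home = os.environ.get("CUDA_HOME")
    if cuda_home:
        # Assumption: CUDA_HOME is the toolkit root, so nvcc lives in its bin/.
        candidates["CUDA_HOME"] = os.path.join(cuda_home, "bin", "nvcc")
    for source, path in candidates.items():
        release = nvcc_release(path) if path else None
        ok = release is not None and release >= MIN_CUDA
        print(f"{source}: {path} -> release {release}, new enough: {ok}")
```

If the two entries disagree, or the PATH entry resolves to /usr/bin/nvcc, the environment is probably not exposing the conda toolkit to cmake.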

NATTEN simply uses FindCUDA.cmake, which is the standard way of finding CTK and the CUDA driver, and its behavior is out of our control.

jgvinholi commented 2 weeks ago

Yes, the CUDA toolkit version used by /usr/bin/nvcc is 10.1, which is unsupported. What I'm suggesting is to tell cmake to use the nvcc located under the CUDA_HOME env var, since many people like me use the CUDA toolkit/nvcc obtained directly from conda. This could be a cmake flag set from setup.py. Of course, I could install an updated CUDA toolkit on my system, but changing the system toolkit might not be ideal, as other packages might have different requirements. Thank you for your time.
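The kind of flag described here could be sketched roughly as follows. This is an illustration only and not NATTEN's actual setup.py: the helper name cuda_cmake_args is hypothetical, and it assumes CUDA_HOME is the toolkit root. CUDA_TOOLKIT_ROOT_DIR is the variable the FindCUDA module reads, and CMAKE_CUDA_COMPILER is the variable CMake uses for the CUDA language compiler. Whether these two hints are enough for a full build, including headers and linking, is not established by this thread.

```python
import os


def cuda_cmake_args(environ):
    """Build extra cmake -D flags from CUDA_HOME, or return [] if it is unset.

    Hypothetical helper: assumes CUDA_HOME is the toolkit root that
    contains bin/nvcc.
    """
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        return []
    nvcc = os.path.join(cuda_home, "bin", "nvcc")
    return [
        # Hint for the FindCUDA module used via find_package(CUDA).
        f"-DCUDA_TOOLKIT_ROOT_DIR={cuda_home}",
        # Hint for CMake's native CUDA language support.
        f"-DCMAKE_CUDA_COMPILER={nvcc}",
    ]


if __name__ == "__main__":
    # The flags would be appended to the existing cmake command line.
    print(cuda_cmake_args(os.environ))
```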

alihassanijr commented 2 weeks ago

Unfortunately it's not that simple. You don't just need a compiler; you also need the CUDA include dirs to be able to compile, and you need to link against the CUDA runtime and PyTorch after compilation. CMake handles a lot of this through packages like FindCUDA (most such projects do this, it's not unique to NATTEN; PyTorch uses FindCUDA), and the behavior of FindCUDA is unfortunately out of our control.

I recommend either setting your environment variables in a way that cmake accepts them, or just downloading the latest release which doesn't require you to build anything.
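One way to read "setting your environment variables in a way that cmake accepts them" is to put the desired toolkit's bin directory first on PATH and set CUDACXX, which CMake consults when choosing a CUDA compiler. The sketch below builds such an environment for a child process. The helper name env_with_cuda and the toolkit path are hypothetical, and it is an assumption that a given conda toolkit layout then satisfies FindCUDA's search for include and lib dirs.

```python
import os


def env_with_cuda(cuda_root, base_env):
    """Return a copy of base_env that prefers the toolkit under cuda_root.

    Hypothetical helper: cuda_root is assumed to contain bin/nvcc.
    """
    env = dict(base_env)
    bin_dir = os.path.join(cuda_root, "bin")
    old_path = env.get("PATH", "")
    # Prepend so this nvcc is found before /usr/bin/nvcc.
    env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    # CMake reads CUDACXX to select the CUDA compiler.
    env["CUDACXX"] = os.path.join(bin_dir, "nvcc")
    env["CUDA_HOME"] = cuda_root
    return env


if __name__ == "__main__":
    # Placeholder path; substitute the root of the toolkit you want to use.
    env = env_with_cuda("/path/to/cuda", os.environ)
    print(env["PATH"].split(os.pathsep)[0], env["CUDACXX"])
    # The dict can then be passed to the build, for example:
    # subprocess.run([sys.executable, "-m", "pip", "install", "."], env=env)
```

The original environment is left untouched, so other packages that rely on the system toolkit are not affected.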

If you're installing CTK through conda, then you shouldn't have to modify any environment variables; conda should do that for you.

jgvinholi commented 2 weeks ago

Thank you for clearing this up. I will try to check my conda installation and see what is going on.