CERN / TIGRE

TIGRE: Tomographic Iterative GPU-based Reconstruction Toolbox
BSD 3-Clause "New" or "Revised" License

Compile fails when installing TIGRE on the server #559

Closed GreameLee closed 4 months ago

GreameLee commented 5 months ago

When I try to install TIGRE Python on the supercomputer, the compilation fails with this error:

(ss) exouser@sit-new:~/SiT/TIGRE/Python$ pip install .
Processing /home/exouser/SiT/TIGRE/Python
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [18 lines of output]
      Traceback (most recent call last):
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 487, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 127, in <module>
        File "<string>", line 90, in locate_cuda
      OSError: CUDA_HOME or CUDA_PATH not set
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

OS: Ubuntu 22.04, GCC: x86_64-linux-gnu-gcc-12, CUDA: 12.2

I guess there is some problem with setup.py. The available CUDA versions on the server are:

(ss) exouser@sit-new:~/SiT/TIGRE/Python$ module spider cuda

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  nvhpc/23.11:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     Versions:
        nvhpc/23.11/nvhpc-byo-compiler
        nvhpc/23.11/nvhpc-hpcx-cuda11
        nvhpc/23.11/nvhpc-hpcx-cuda12
        nvhpc/23.11/nvhpc-hpcx
        nvhpc/23.11/nvhpc-nompi
        nvhpc/23.11/nvhpc-openmpi3
        nvhpc/23.11/nvhpc
AnderBiguri commented 5 months ago

Hi, it looks like CUDA is not installed, or is installed in a non-standard way. CUDA_HOME and CUDA_PATH are environment variables that are set by the CUDA installation (or that you need to set yourself, according to the installation instructions).
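
For readers hitting the same OSError, here is a minimal sketch of the kind of lookup a build script might perform. It assumes the script checks CUDA_HOME, then CUDA_PATH, and then falls back to the directory two levels above an nvcc found on PATH. This is not TIGRE's actual setup.py, and `find_cuda_home` is a hypothetical helper named for illustration only.

```python
import os
import shutil


def find_cuda_home(environ=None, which=shutil.which):
    """Return a CUDA toolkit root, or raise OSError if none can be found.

    Hypothetical helper for illustration; not taken from TIGRE's setup.py.
    """
    environ = os.environ if environ is None else environ
    # Prefer the explicit environment variables, in this order.
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = environ.get(name)
        if value:
            return value
    # Fall back to nvcc on PATH, assuming the layout <root>/bin/nvcc.
    nvcc = which("nvcc")
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))
    raise OSError("CUDA_HOME or CUDA_PATH not set and nvcc not found on PATH")


if __name__ == "__main__":
    try:
        print("CUDA root:", find_cuda_home())
    except OSError as err:
        print("No CUDA toolkit found:", err)
```

If this sketch raises OSError on a machine where nvidia-smi works, the driver is present but the toolkit root is not visible to the build. Note that the `<root>/bin/nvcc` layout is an assumption and may not hold for every packaging of the toolkit.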


GreameLee commented 5 months ago

nvidia-smi works and torch.cuda.is_available also works. CUDA has been installed.

GreameLee commented 5 months ago

Hello, Ander, the original error is:

(ss) exouser@sit-new:~/SiT/TIGRE/Python$ pip install .
Processing /home/exouser/SiT/TIGRE/Python
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error

  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [21 lines of output]
      Traceback (most recent call last):
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=['wheel'])
        File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 487, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
          exec(code, locals())
        File "<string>", line 125, in <module>
        File "<string>", line 106, in locate_cuda
        File "<string>", line 60, in get_cuda_version
        File "/home/exouser/.conda/envs/ss/lib/python3.8/posixpath.py", line 76, in join
          a = os.fspath(a)
      TypeError: expected str, bytes or os.PathLike object, not NoneType
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
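
The TypeError at the bottom of this second traceback is what os.path.join raises when its first argument is None, which suggests the CUDA root resolved to None before get_cuda_version built a path from it. A minimal reproduction follows. The `cuda_home` variable and the "bin" and "nvcc" path components are assumptions for illustration, since the traceback does not show the actual line 60 of setup.py.

```python
import os

# Assumed stand-in for a CUDA root that was never resolved.
cuda_home = None

try:
    nvcc_path = os.path.join(cuda_home, "bin", "nvcc")
except TypeError as err:
    # Same exception type and message as in the traceback above.
    message = str(err)
    print("TypeError:", message)
```
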
AnderBiguri commented 5 months ago

CUDA, when packaged with PyTorch, is not always the full CUDA toolkit.

Can you run "which nvcc"? If there is no nvcc, then you have not installed CUDA. You can check the installation instructions for how to do it.
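
The same check can be scripted from Python, which is handy inside the conda environment used for the build. This is a minimal sketch using only the standard library; `nvcc_report` is a hypothetical helper name.

```python
import shutil
import subprocess


def nvcc_report(which=shutil.which):
    """Return (path, version_text); both are None when nvcc is not on PATH."""
    path = which("nvcc")
    if path is None:
        return None, None
    # "nvcc --version" prints the toolkit release information.
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return path, result.stdout.strip()


if __name__ == "__main__":
    path, version = nvcc_report()
    if path is None:
        print("nvcc not found on PATH: the CUDA compiler is not visible here")
    else:
        print("nvcc found at", path)
        print(version)
```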

GreameLee commented 5 months ago

This server does not support the "nvcc -V" command.

AnderBiguri commented 5 months ago

Apologies, I don't understand what you mean by that last comment. If the server has an NVIDIA GPU, then it supports nvcc. But maybe it's not installed, which is what I am trying to figure out in order to help.

GreameLee commented 5 months ago

nvcc is not installed. I tried to install it before with: sudo apt install nvidia-cuda-toolkit. But that changed all the virtual environments, and nvidia-smi could no longer be used.

So I created a new account to log in to the supercomputer.

AnderBiguri commented 5 months ago

CUDA is not a Python package. When you install the runtime libraries with e.g. PyTorch, they come in a virtual environment, but the raw CUDA compiler, i.e. nvcc, cannot be installed "in an environment", in the same way that gcc cannot.
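
To make the distinction concrete, here is a minimal sketch that reports which pieces are visible from the current Python process: the driver tools, a PyTorch install (which may bundle runtime libraries), and the compiler. `cuda_layers` is a hypothetical helper, and the checks are rough proxies rather than a full diagnosis; for instance, an importable torch may still be a CPU-only build.

```python
import importlib.util
import shutil


def cuda_layers(which=shutil.which, find_spec=importlib.util.find_spec):
    """Report which CUDA-related pieces are visible from this process."""
    return {
        # Driver tools: installed system-wide along with the NVIDIA driver.
        "driver (nvidia-smi)": which("nvidia-smi") is not None,
        # Proxy for bundled runtime libraries: is PyTorch importable here?
        "runtime (torch importable)": find_spec("torch") is not None,
        # Compiler: needed to build CUDA source code such as TIGRE's.
        "compiler (nvcc)": which("nvcc") is not None,
    }


if __name__ == "__main__":
    for layer, present in cuda_layers().items():
        print(f"{layer}: {'found' if present else 'missing'}")
```

A result where the first two entries are found and the compiler is missing would match the situation described in this thread.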

Please, instead of trying random things, follow the instructions in TIGRE to install CUDA: https://developer.nvidia.com/cuda-downloads

If you had done so, we would not have needed to have this conversation :) Much easier for both of us!