Closed: GreameLee closed this issue 4 months ago
Hi, it looks like CUDA is not installed, or is installed in a non-standard way. CUDA_HOME and CUDA_PATH are environment variables that are set by the CUDA installation (or that you need to set yourself, according to the installation instructions).
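As a rough way to check what a build script could see, here is a minimal sketch of a lookup that tries CUDA_HOME, then CUDA_PATH, then falls back to searching for nvcc on PATH. The helper name find_cuda_home is hypothetical and this is not TIGRE's own locate_cuda; it only illustrates the idea under those assumptions.

```python
import os
import shutil


def find_cuda_home(environ=None):
    """Return a plausible CUDA root directory, or None if none is found.

    Hypothetical helper for illustration; it is NOT TIGRE's locate_cuda.
    """
    environ = os.environ if environ is None else environ

    # 1. Prefer the two variables named in the error message.
    for name in ("CUDA_HOME", "CUDA_PATH"):
        value = environ.get(name)
        if value:
            return value

    # 2. Fall back to nvcc on PATH: <root>/bin/nvcc -> <root>.
    nvcc = shutil.which("nvcc", path=environ.get("PATH", ""))
    if nvcc:
        return os.path.dirname(os.path.dirname(os.path.realpath(nvcc)))

    # 3. Nothing found: the toolkit is missing or in a non-standard place.
    return None


if __name__ == "__main__":
    print("CUDA root:", find_cuda_home())
```

If this prints None, neither variable is set and nvcc is not on PATH, which would be consistent with the OSError quoted below.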
On Tue, 18 Jun 2024, 17:28 Haodong, @.***> wrote:
When I try to install TIGRE Python on the supercomputer, the compilation fails with this error:
(ss) @.***:~/SiT/TIGRE/Python$ pip install .
Processing /home/exouser/SiT/TIGRE/Python
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
Traceback (most recent call last):
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 487, in run_setup
super().run_setup(setup_script=setup_script)
File "/tmp/pip-build-env-ahxu88yn/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 127, in <module>
File "<string>", line 90, in locate_cuda
OSError: CUDA_HOME or CUDA_PATH not set
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
OS: Ubuntu 22.04
GCC: x86_64-linux-gnu-gcc-12
CUDA: 12.2
I guess there is some problem with setup.py. The CUDA versions available on the server are:
(ss) @.***:~/SiT/TIGRE/Python$ module spider cuda
nvhpc/23.11:
Versions:
nvhpc/23.11/nvhpc-byo-compiler
nvhpc/23.11/nvhpc-hpcx-cuda11
nvhpc/23.11/nvhpc-hpcx-cuda12
nvhpc/23.11/nvhpc-hpcx
nvhpc/23.11/nvhpc-nompi
nvhpc/23.11/nvhpc-openmpi3
nvhpc/23.11/nvhpc
nvidia-smi works and torch.cuda.is_available() also returns True, so CUDA has been installed.
Hello, Ander, the original error is:
(ss) exouser@sit-new:~/SiT/TIGRE/Python$ pip install .
Processing /home/exouser/SiT/TIGRE/Python
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [21 lines of output]
Traceback (most recent call last):
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/exouser/.conda/envs/ss/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
return hook(config_settings)
File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 325, in get_requires_for_build_wheel
return self._get_build_requires(config_settings, requirements=['wheel'])
File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 295, in _get_build_requires
self.run_setup()
File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 487, in run_setup
super().run_setup(setup_script=setup_script)
File "/tmp/pip-build-env-9xseo1re/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 125, in <module>
File "<string>", line 106, in locate_cuda
File "<string>", line 60, in get_cuda_version
File "/home/exouser/.conda/envs/ss/lib/python3.8/posixpath.py", line 76, in join
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
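The TypeError in this traceback is the message os.path.join produces when its first argument is None. A plausible reading, offered as an assumption since the setup.py source is not shown here, is that a CUDA root lookup returned None and was then joined with a sub-path. The sketch below reproduces the message with hypothetical helpers (nvcc_path and safe_nvcc_path are illustrative names, not TIGRE functions).

```python
import os


def nvcc_path(cuda_home):
    """Join a CUDA root with bin/nvcc, with no check on the input."""
    return os.path.join(cuda_home, "bin", "nvcc")


def safe_nvcc_path(cuda_home):
    """Same join, but fail early with a clearer message if the root is missing."""
    if cuda_home is None:
        raise OSError("CUDA_HOME or CUDA_PATH not set")
    return os.path.join(cuda_home, "bin", "nvcc")


if __name__ == "__main__":
    try:
        nvcc_path(None)  # mimics a lookup that found no CUDA root
    except TypeError as exc:
        print("TypeError:", exc)

    try:
        safe_nvcc_path(None)
    except OSError as exc:
        print("OSError:", exc)
```

Either way, the underlying condition is the same: no CUDA toolkit root was found on the machine.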
The CUDA that comes packaged with PyTorch is not always the full CUDA toolkit.
Can you run "which nvcc"? If there is no nvcc, then you have not installed the CUDA toolkit. You can check the installation instructions for how to do it.
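To make the distinction concrete, here is a small sketch that reports the CUDA runtime PyTorch was built against separately from whether the nvcc compiler is on PATH. The function name cuda_status is hypothetical; torch is treated as optional, and the script only reports what it finds.

```python
import shutil


def cuda_status():
    """Report the PyTorch CUDA runtime and the nvcc compiler separately."""
    status = {
        "nvcc_path": shutil.which("nvcc"),  # None if the compiler is not on PATH
        "torch_cuda_runtime": None,         # CUDA version bundled with PyTorch
        "torch_sees_gpu": None,             # whether PyTorch can use a GPU
    }
    try:
        import torch  # optional; only needed for the runtime check
    except ImportError:
        return status

    status["torch_cuda_runtime"] = torch.version.cuda
    status["torch_sees_gpu"] = torch.cuda.is_available()
    return status


if __name__ == "__main__":
    for key, value in cuda_status().items():
        print(f"{key}: {value}")
```

A result where torch_sees_gpu is True but nvcc_path is None would match the situation described in this thread: the driver and runtime work, but there is no compiler to build TIGRE's CUDA code.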
This server does not support the "nvcc -V" command.
Apologies, I don't understand what you mean by that last comment. If the server has an NVIDIA GPU, then it supports nvcc. But maybe it's not installed, which is what I am trying to figure out, to help.
nvcc is not installed. I tried to install it before with: sudo apt install nvidia-cuda-toolkit. But that changed all the virtual environments, and nvidia-smi could no longer be used.
So I created a new account to log in to the supercomputer.
CUDA is not a Python package. When you install the runtime libraries with e.g. PyTorch, they come in a virtual environment, but the raw CUDA compiler, i.e. nvcc, cannot be installed "in an environment", in the same way that gcc cannot.
Please, instead of trying random things, follow the instructions in TIGRE to install CUDA. https://developer.nvidia.com/cuda-downloads
If you had done so, we would not need to have this conversation :) Much easier for both of us!