Hello, and thanks for generously sharing this work. I ran into an issue: I cannot install bitsandbytes, and the debugger shows the following error. Could you give me some suggestions? Thanks.
RuntimeError:
CUDA Setup failed despite GPU being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
[2023-11-03 13:23:53,590] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2023-11-03 13:23:53,742] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 15580 closing signal SIGTERM
WARNING:torch.distributed.elastic.multiprocessing.api:Sending process 15581 closing signal SIGTERM
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 15579) of binary: /home/abibulla/anaconda3/envs/toga/bin/python
Traceback (most recent call last):
File "/home/abibulla/anaconda3/envs/toga/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/home/abibulla/anaconda3/envs/toga/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/abibulla/anaconda3/envs/toga/lib/python3.10/site-packages/torch/distributed/run.py", line 762, in main
run(args)
File "/home/abibulla/anaconda3/envs/toga/lib/python3.10/site-packages/torch/distributed/run.py", line 753, in run
elastic_launch(
File "/home/abibulla/anaconda3/envs/toga/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 132, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/abibulla/anaconda3/envs/toga/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 246, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
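
The error message above suggests checking whether the CUDA libraries can be located via LD_LIBRARY_PATH. A minimal sketch of that check follows, using only the Python standard library. The helper name `find_cuda_runtime_libs` is hypothetical, and the `libcudart.so*` filename pattern is an assumption about which library to look for; it does not replace the output of `python -m bitsandbytes`.

```python
import glob
import os


def find_cuda_runtime_libs(search_path):
    """Return CUDA runtime library files found in a path-separated list of directories.

    `search_path` has the same shape as LD_LIBRARY_PATH: directories joined
    by os.pathsep (":" on Linux). The "libcudart.so*" pattern is an assumption.
    """
    found = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            # Skip empty entries such as those produced by a trailing separator.
            continue
        pattern = os.path.join(directory, "libcudart.so*")
        found.extend(sorted(glob.glob(pattern)))
    return found


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    libs = find_cuda_runtime_libs(current)
    if libs:
        for lib in libs:
            print("found:", lib)
    else:
        print("No libcudart.so* found in LD_LIBRARY_PATH:", current or "(empty)")
```

If nothing is found, the directory that holds the CUDA runtime inside the conda environment or the system CUDA install may need to be appended to LD_LIBRARY_PATH before launching torchrun.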