Closed ZeWang95 closed 6 months ago
Hi, it seems torch may have been installed with the rocm6.0 wheels. Please reinstall with the 6.1 wheels: pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.1/
You can refer to this page for the latest instructions: https://rocm.docs.amd.com/projects/install-on-linux/en/latest/how-to/3rd-party/pytorch-install.html
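To confirm which ROCm release the installed torch wheel targets, a quick check along these lines may help. It is only a sketch: it relies on torch.version.hip, which is None on non-ROCm builds, and the helper names and the sample version string in the comment are illustrative, not taken from this thread.

```python
import re
from typing import Optional, Tuple


def rocm_major_minor(hip_version: Optional[str]) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a HIP version string.

    The expected shape is something like '6.1.40091-a8dbc0c19'
    (an illustrative value, not output copied from this thread).
    """
    if not hip_version:
        return None  # CPU-only and CUDA builds of torch report no HIP version
    match = re.match(r"(\d+)\.(\d+)", hip_version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def report_torch_rocm(expected: Tuple[int, int] = (6, 1)) -> str:
    """Describe whether the installed torch wheel targets the expected ROCm release."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        return "torch is not installed in this environment"
    found = rocm_major_minor(getattr(torch.version, "hip", None))
    if found is None:
        return f"torch {torch.__version__} is not a ROCm build"
    if found != expected:
        return (
            f"torch {torch.__version__} targets ROCm {found[0]}.{found[1]}, "
            f"expected {expected[0]}.{expected[1]}: reinstall with the matching wheels"
        )
    return f"torch {torch.__version__} targets ROCm {found[0]}.{found[1]}"


if __name__ == "__main__":
    print(report_torch_rocm())
```

If the report shows 6.0 while the system runs ROCm 6.1, that would match the mismatch suspected above.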
Thank you! Problem solved! Thanks for the great effort on adapting this package to ROCm. It helps a lot.
Hi!
Thanks for the effort!
I installed the package following the instructions, and I get the following error message when importing the package:
Could not find the bitsandbytes CUDA binary at PosixPath('/opt/conda/envs/diff/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_hip_nohipblaslt.so')
Could not load bitsandbytes native library: /opt/conda/envs/diff/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
Traceback (most recent call last):
  File "/opt/conda/envs/diff/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 124, in <module>
    lib = get_native_library()
  File "/opt/conda/envs/diff/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 104, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/opt/conda/envs/diff/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/opt/conda/envs/diff/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /opt/conda/envs/diff/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so: cannot open shared object file: No such file or directory
CUDA Setup failed despite CUDA being available. Please run the following command to get more information:
python -m bitsandbytes
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
When I run my training code, I get this error: NameError: name 'str2optimizer8bit_blockwise' is not defined
I'm using ROCm 6.1 and PyTorch 2.3.
Any idea how to resolve this issue?
Thanks! Ze
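Since the error reports that neither libbitsandbytes_hip_nohipblaslt.so nor libbitsandbytes_cpu.so exists in the package directory, it can be useful to list which native libraries were actually installed. The sketch below does this without importing bitsandbytes, so the failing loader is not triggered. The function names are hypothetical helpers, and the "libbitsandbytes*" file pattern is an assumption based on the paths in the error message.

```python
import importlib.util
from pathlib import Path
from typing import List, Optional


def list_native_libs(pkg_dir: Path) -> List[str]:
    """Return the names of files in pkg_dir that look like bitsandbytes native libraries."""
    # The 'libbitsandbytes*' pattern is assumed from the paths in the error message.
    return sorted(p.name for p in pkg_dir.glob("libbitsandbytes*") if p.is_file())


def find_package_dir(name: str = "bitsandbytes") -> Optional[Path]:
    """Locate an installed top-level package directory without importing the package."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


if __name__ == "__main__":
    pkg_dir = find_package_dir()
    if pkg_dir is None:
        print("bitsandbytes is not installed in this environment")
    else:
        libs = list_native_libs(pkg_dir)
        print(f"package directory: {pkg_dir}")
        print("native libraries:", ", ".join(libs) if libs else "none found")
```

An empty list, or a list that lacks the file named in the first error line, suggests that the native library was not built or installed for this environment.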