Open bladernr opened 2 years ago
I've encountered the same error on Ubuntu 22.04 with CUDA 11.8/12.1.
After debugging, it seems the GPU cannot be initialized: cuInit(0) returns 999 (CUDA_ERROR_UNKNOWN).
I then realized this may be caused by the built-in NVIDIA drivers.
I ran apt install -y nvidia-cuda-toolkit nvidia-modprobe
and nvidia-modprobe -u,
then updated CUDAPATH and the NVCC path in the Makefile. After that it works.
Hope this helps a little.
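In case it helps anyone check for the same condition before building gpu_burn, here is a minimal sketch that calls cuInit(0) through the driver library and prints the result. It assumes the driver library is loadable as libcuda.so.1 (the usual name on Linux, but an assumption for your system). The helper names describe_cu_result and probe_cuinit are made up for this example, and the lookup table lists only a few CUresult values from cuda.h, not the full set.

```python
import ctypes

# A small subset of CUresult values from the CUDA driver API header (cuda.h).
# Anything not listed here is reported as "unlisted CUresult".
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    3: "CUDA_ERROR_NOT_INITIALIZED",
    100: "CUDA_ERROR_NO_DEVICE",
    999: "CUDA_ERROR_UNKNOWN",
}


def describe_cu_result(code):
    """Return a readable label for a CUresult code (hypothetical helper)."""
    name = CU_RESULT_NAMES.get(code, "unlisted CUresult")
    return f"{code} ({name})"


def probe_cuinit(library="libcuda.so.1"):
    """Call cuInit(0) and return its CUresult, or None if the driver
    library cannot be loaded at all (hypothetical helper).

    The default library name is an assumption about the local install.
    """
    try:
        libcuda = ctypes.CDLL(library)
    except OSError:
        return None
    # CUresult cuInit(unsigned int Flags); Flags must be 0.
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


if __name__ == "__main__":
    result = probe_cuinit()
    if result is None:
        print("libcuda.so.1 could not be loaded; is the NVIDIA driver installed?")
    else:
        print("cuInit(0) returned", describe_cu_result(result))
```

If this prints 999 (CUDA_ERROR_UNKNOWN), you are probably seeing the same initialization failure described above, and gpu_burn will not get any further than this script does.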
$ ./gpu_burn 30
Burning for 30 seconds.
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits, std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >terminate called after throwing an instance of
'terminate called after throwing an instance of 'std:: cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator >', std::allocator >' cxx11::basic_string<char, std::char_traits, std::allocator
terminate called after throwing an instance of 'std::
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
'
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'std::
terminate called after throwing an instance of 'std::cxx11::basic_string<char, std::char_traits
terminate called after throwing an instance of 'terminate called after throwing an instance of 'std::
I eventually had to CTRL-C out of this. This is on Ubuntu 22.04 with the latest gpu_burn source and the CUDA toolkit installed. I was bug-testing a wrapper I use when I hit this.