Chia-Network / bladebit

A high-performance k32-only, Chia (XCH) plotter supporting in-RAM and disk-based plotting
Apache License 2.0

CUDA driver version is insufficient for CUDA runtime version #283

Open cemalefendi opened 1 year ago

cemalefendi commented 1 year ago

Bladebit Chia Plotter
Version      : 3.0.0-alpha1
Git Commit   : f269db0a7ad307514e993c335897cea7ebf46eda
Compiled With: gcc 9.4.0

[Global Plotting Config]
 Will create 1 plots.
 Thread count          : 128
 Warm start enabled    : false
 NUMA disabled         : false
 CPU affinity disabled : false
 Farmer public key     : 849c17604e2d1fd1b6fc9e89993329d1da2391c88e2eaf2d058ecce8c2246f581f2498dd2c9fd453b1492262cbdfea8a
 Pool contract address : xch1acu6vwdhr99nk6n9aqn7r7x8hlgedqjpxfvys3fthzy5y4qekctqka50gy
 Benchmark mode        : disabled

[Bladebit CUDA Plotter] Failed to fetch CUDA devices.

CUDA error: 35 (0x23) cudaErrorInsufficientDriver : CUDA driver version is insufficient for CUDA runtime version

Panic!!! Fatal Error:
CUDA error cudaErrorInsufficientDriver : CUDA driver version is insufficient for CUDA runtime version.
./bladebit_cuda(+0xcf8cb)[0x562e98ef28cb]
./bladebit_cuda(+0xcf0af)[0x562e98ef20af]
./bladebit_cuda(+0x207d2)[0x562e98e437d2]
./bladebit_cuda(+0x1b95d)[0x562e98e3e95d]
./bladebit_cuda(+0x18097)[0x562e98e3b097]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f75a5f8a083]
./bladebit_cuda(+0x1974e)[0x562e98e3c74e]

a@a-System-Product-Name:~/Desktop/bladebitcuda$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243

System: Ubuntu 22.04, AMD 3995WX, 512 GB RAM, 3080 Ti GPU, 8 TB RAID 0 NVMe.
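For context, cudaErrorInsufficientDriver means the installed NVIDIA driver is older than the CUDA runtime the binary was built against. Note that `nvcc --version` reports the installed toolkit, not the driver. CUDA reports both versions as integers of the form 1000*major + 10*minor (for example 11040 for 11.4). The sketch below decodes that encoding and applies a simplified "driver must be at least as new as the runtime" check. The function names are hypothetical, the sample values are illustrative only (the runtime version bladebit_cuda was built with is not stated in this thread), and the strict comparison ignores CUDA's minor-version compatibility rules.

```python
from typing import Tuple


def decode_cuda_version(value: int) -> Tuple[int, int]:
    """Decode a CUDA version integer (1000*major + 10*minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_is_sufficient(driver_version: int, runtime_version: int) -> bool:
    """Simplified check: the driver's CUDA version must not be older than the runtime's.

    Assumption: this ignores minor-version compatibility within a major release.
    """
    return driver_version >= runtime_version


if __name__ == "__main__":
    # Illustrative values only: a driver exposing CUDA 11.4 and a
    # hypothetical runtime built against CUDA 12.0.
    driver, runtime = 11040, 12000
    print("driver :", decode_cuda_version(driver))
    print("runtime:", decode_cuda_version(runtime))
    print("sufficient:", driver_is_sufficient(driver, runtime))
```

If the check comes back false, the usual remedy is a newer driver rather than a newer toolkit.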

happycouak commented 1 year ago

Same issue here

Ubuntu 22.04
Driver Version: 470.161.03
CUDA Version: 11.4
Tesla P4 8GB

Any pointers would be welcome

happycouak commented 1 year ago

The NVIDIA drivers seem to be a big mess ...

ubuntu-drivers devices
== /sys/devices/pci0000:40/0000:40:02.0/0000:42:00.0 ==
modalias : pci:v000010DEd00001BB3sv000010DEsd000011D8bc03sc02i00
vendor   : NVIDIA Corporation
model    : GP104GL [Tesla P4]
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : nvidia-driver-470 - distro non-free recommended  <======
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-418-server - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

It shows that nvidia-driver-470 should be installed, but:

apt install nvidia-driver-470-server  nvidia-cuda-dev nvidia-cuda-toolkit
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 libnvidia-compute-495 : Depends: libnvidia-compute-510 but it is not installable
 nvidia-cuda-dev : Breaks: libcuda1 (< 495)
                   Recommends: libnvcuvid1 but it is not installable
E: Unable to correct problems, you have held broken packages.

The libcuda1 version constraint (nvidia-cuda-dev breaks libcuda1 < 495) cannot be satisfied with the 470 driver.
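To avoid misreading the `ubuntu-drivers devices` listing above, the recommended entry can be picked out with a few lines of Python. This is a minimal sketch: `recommended_driver` is a hypothetical helper, and it assumes the output keeps the `driver : <package> - ... recommended` layout shown in this comment.

```python
from typing import Optional


def recommended_driver(output: str) -> Optional[str]:
    """Return the package name on the 'driver' line flagged as recommended, if any."""
    for line in output.splitlines():
        parts = line.split()
        # Expected shape: driver : <package> - distro non-free recommended
        if len(parts) >= 3 and parts[0] == "driver" and "recommended" in parts:
            return parts[2]
    return None


if __name__ == "__main__":
    sample = """\
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-470 - distro non-free recommended
driver   : xserver-xorg-video-nouveau - distro free builtin
"""
    print(recommended_driver(sample))
```

On the listing above this picks nvidia-driver-470, while the apt command requested nvidia-driver-470-server.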

happycouak commented 1 year ago

I sorted it out by installing CUDA from here: https://developer.nvidia.com/cuda-downloads. Now nvidia-smi looks like this:

Sat Feb 18 00:29:59 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla P4            On   | 00000000:42:00.0 Off |                  Off |
| N/A   29C    P8     6W /  75W |      0MiB /  8192MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
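For anyone who wants to check this before launching the plotter, the header line of the nvidia-smi output above carries both the driver version and the highest CUDA version that driver supports. The following is a rough sketch that extracts them from captured text. The regular expression assumes the header layout shown in this thread, and `parse_nvidia_smi_header` and `required_cuda` are hypothetical names. The CUDA version bladebit_cuda actually needs is not stated here, so the threshold is a caller-supplied assumption.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Driver Version: 525.85.12    CUDA Version: 12.0"
HEADER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<major>\d+)\.(?P<minor>\d+)"
)


def parse_nvidia_smi_header(text: str) -> Optional[Tuple[str, Tuple[int, int]]]:
    """Return (driver_version, (cuda_major, cuda_minor)) or None if the header is missing."""
    match = HEADER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), (int(match.group("major")), int(match.group("minor")))


def supports_cuda(text: str, required_cuda: Tuple[int, int]) -> bool:
    """True if the driver's reported CUDA version is at least required_cuda."""
    parsed = parse_nvidia_smi_header(text)
    return parsed is not None and parsed[1] >= required_cuda


if __name__ == "__main__":
    header = "| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |"
    print(parse_nvidia_smi_header(header))
    # required_cuda=(12, 0) is an illustrative threshold, not a documented requirement.
    print(supports_cuda(header, required_cuda=(12, 0)))
```

The same parser also works on the earlier "Driver Version: 470.161.03 CUDA Version: 11.4" report, which makes the before and after states easy to compare.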