Closed blue-cat-whale closed 6 months ago
Hmm, can you show the output of nvidia-smi on this system please?
nvidia-smi returns nothing. I use a cloud/shared GPU.
Do you know what version of the cuda driver is installed on this system? Or if there's anywhere we can look to see details of the virtualisation setup?
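When nvidia-smi is unhelpful, one way to approach the driver-version question is to ask the driver library directly. The sketch below is not from this thread. It assumes the driver is exposed under the usual Linux soname libcuda.so.1 (a virtualised setup may ship its own replacement) and calls the CUDA driver API function cuDriverGetVersion, which encodes the version as 1000*major + 10*minor. The helper names are made up for illustration.

```python
import ctypes


def decode_cuda_version(raw):
    """Split CUDA's integer encoding (1000*major + 10*minor) into (major, minor)."""
    return raw // 1000, (raw % 1000) // 10


def cuda_driver_version(libname="libcuda.so.1"):
    """Return (major, minor) reported by the driver library, or None if unavailable.

    libname is an assumption: the usual soname of the NVIDIA driver library on Linux.
    """
    try:
        lib = ctypes.CDLL(libname)
        get_version = lib.cuDriverGetVersion
    except (OSError, AttributeError):
        return None  # library missing, or it does not export the symbol
    raw = ctypes.c_int(0)
    if get_version(ctypes.byref(raw)) != 0:  # 0 is CUDA_SUCCESS
        return None
    return decode_cuda_version(raw.value)


if __name__ == "__main__":
    print("CUDA driver version:", cuda_driver_version())
```

A result of None here would suggest the process cannot see a usable driver library at all, which is worth knowing before debugging cudf itself.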
It might be possible that some information is available if we launch the process under gdb. Assuming gdb is already installed (if not, you'll have to install it using your operating system's package manager), what does the following show?
gdb -ex run --args python -c "import cudf"
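If gdb is not available, a lighter-weight complement is Python's standard faulthandler module, which prints the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. The sketch below is an illustration rather than something suggested in this thread: it runs the import in a child interpreter so the crash does not take down the calling shell. try_import is a hypothetical helper name.

```python
import subprocess
import sys


def try_import(module, timeout=120):
    """Import `module` in a child interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX a negative return code -N means
    the child was killed by signal N (for example -11 for SIGSEGV).
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    code, err = try_import("cudf")
    print("exit code:", code)
    print(err)
```

The traceback only covers Python frames, so it shows which import step triggered the crash but not the native stack that gdb would give.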
Thanks
I installed CUDA 12.4 locally:
[root@localhost code]# whereis nvcc
nvcc: /usr/local/cuda-12.4/bin/nvcc /usr/local/cuda-12.4/bin/nvcc.profile
But the cloud GPU has multiple CUDA versions installed, and I'm not sure which one is active.
[root@localhost code]# ls /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda* -d
/opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.0 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.3 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.6 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-12.0
/opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.1 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.4 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.7 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-12.1
/opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.2 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.5 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-11.8 /opt/orion/orion_runtime/gpu/cuda/current/orion-cuda-12.2
I'm an end user and I don't know how this cloud GPU is set up on the server side. This is how I set it up as an end user:
wget http://<private_url>/dev/rpm/orionx-cuda-4.2.0-1.all.rpm
wget http://<private_url>/dev/rpm/orionx-engine-4.2.0-1.all.rpm
wget http://<private_url>/dev/rpm/orionx-runtime-4.2.0-1.all.rpm
rpm -i ./orionx-engine-4.2.0-1.all.rpm
rpm -i ./orionx-cuda-4.2.0-1.all.rpm
rpm -i ./orionx-runtime-4.2.0-1.all.rpm
systemctl start oriond
export ORION_CLIENT_ID=client-id
export ORION_VGPU=1
export ORION_GMEM=10000
export ORION_RATIO=100
export ORION_DEVICE_ENABLE=1
export ORION_RESERVED=0
export LD_LIBRARY_PATH=/opt/orion/orion_runtime/gpu/cuda/current
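Since LD_LIBRARY_PATH affects which shared libraries the dynamic loader finds first, it can help to list the CUDA-looking libraries in each of its directories, in search order. The following is a rough sketch under stated assumptions: find_cuda_libs is a hypothetical helper, the file-name patterns are guesses at the usual libcuda and libcudart names, and only the top level of each directory is inspected (versioned subdirectories are not searched).

```python
import glob
import os


def find_cuda_libs(search_path=None, patterns=("libcuda.so*", "libcudart.so*")):
    """List CUDA-looking libraries in each LD_LIBRARY_PATH directory, in search order.

    Returns a list of (path, resolved_path) tuples.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(":"):
        if not directory:
            continue  # skip empty entries such as a trailing colon
        for pattern in patterns:
            for path in sorted(glob.glob(os.path.join(directory, pattern))):
                # realpath follows symlinks, which often reveals the versioned file
                hits.append((path, os.path.realpath(path)))
    return hits


if __name__ == "__main__":
    for path, target in find_cuda_libs():
        print(path, "->", target)
```

An empty result does not prove that no CUDA library is loaded, because the loader also consults other locations such as the system library cache.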
If you are seeing nvidia-smi produce CUDA Version: N/A, that's not a good sign. I'm not sure what that means, but it could be an issue with your drivers. I would first try compiling and running a basic CUDA program, something like this hello world example. I also recommend reaching out to your cloud provider for support.
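Where compiling a CUDA program is inconvenient, a similar smoke test can be approximated from Python by calling the driver API through ctypes. This is a substitute for the hello world example mentioned above, not the same thing: it only checks that the driver initialises and reports at least one device. It assumes the driver library is loadable as libcuda.so.1. cuInit and cuDeviceGetCount are CUDA driver API functions, and cuda_device_count is a hypothetical helper name.

```python
import ctypes


def cuda_device_count(libname="libcuda.so.1"):
    """Return the number of CUDA devices the driver reports, or None on any failure."""
    try:
        lib = ctypes.CDLL(libname)
        cu_init = lib.cuInit
        cu_device_get_count = lib.cuDeviceGetCount
    except (OSError, AttributeError):
        return None  # driver library missing or incomplete
    if cu_init(0) != 0:  # 0 is CUDA_SUCCESS; the flags argument must be 0
        return None
    count = ctypes.c_int(0)
    if cu_device_get_count(ctypes.byref(count)) != 0:
        return None
    return count.value


if __name__ == "__main__":
    print("CUDA devices visible:", cuda_device_count())
```

If this prints None or 0, the problem most likely sits below cudf, in the driver or virtualisation layer.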
Closing as stale. Please reopen if needed.
I've downloaded cudf==24.4.0a on my RHEL 8.9 system, but when I tried import cudf as in this tutorial, the Python console just crashed. How can I fix it?