Open garyyang85 opened 9 months ago
@garyyang85 "No devices were found" typically indicates that GPU initialization failed. Can you get the system logs by running dmesg | grep -i nvrm on the host?
Running dmesg | grep -i nvrm on the host returns "[189160.303788] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 550.54.15".
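For anyone collecting the same host-side information, here is a minimal sketch of the check discussed above. The function names are hypothetical helpers, not part of the GPU operator, and the 'fail|error' pattern is only an assumption about what a failed initialization might log; the lspci filter on "nvidia" is likewise an illustrative choice.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of the GPU operator.

# Keep only NVRM (NVIDIA kernel module) lines from kernel log text on stdin.
nvrm_lines() {
    grep -i 'nvrm' || true
}

# Exit 0 if any NVRM line on stdin mentions a failure or error.
# The pattern is an assumption; adjust it to the messages you actually see.
nvrm_has_errors() {
    nvrm_lines | grep -Eiq 'fail|error'
}

# Print host-side output: NVRM kernel messages and NVIDIA PCI devices.
collect_host_gpu_info() {
    echo '== dmesg | grep -i nvrm =='
    dmesg | nvrm_lines
    echo '== lspci | grep -i nvidia =='
    lspci | grep -i 'nvidia' || true
}
```

Calling collect_host_gpu_info on the host (dmesg may need root) prints both outputs in one place for pasting into the issue. Piping dmesg into nvrm_has_errors gives a quick yes/no on whether any NVRM line matches the assumed failure pattern.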
1. Quick Debug Information
2. Issue or feature description
The nvidia-driver-daemonset-xx pod reports "Startup probe failed: No devices were found" in its events, but the V100 GPU is visible on the OS. Below is the lspci output:
3. Steps to reproduce the issue
Deploy the GPU operator with the cluster-policy definition.