qingfengfenga closed this issue 1 month ago
@qingfengfenga there was some work done for WSL2 in the 0.15.0 release branch. Could you test using the 0.15.0-rc.2 version instead of 0.14.5?
Hi @qingfengfenga, I recently submitted a PR to k3d which updated the documentation for how to run CUDA workloads: https://k3d.io/v5.6.3/usage/advanced/cuda
It also updated the nvidia device plugin to 0.15.0-rc.2, as mentioned by @elezar. In my testing on WSL it was working without issues. Do you mind testing with the new docs to see if that fixes it?
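For anyone who wants to try the release candidate independently of the k3d docs change, updating the plugin amounts to pointing its DaemonSet at the rc image. A minimal sketch follows the upstream static deployment; everything except the image tag (names, namespace, runtime class) is illustrative and should be adapted to your cluster:

```yaml
# Sketch: run the device plugin at v0.15.0-rc.2.
# All names here are illustrative, not taken from this thread.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: nvidia-device-plugin-ds
  template:
    metadata:
      labels:
        name: nvidia-device-plugin-ds
    spec:
      runtimeClassName: nvidia  # assumes the NVIDIA runtime class is configured
      containers:
        - name: nvidia-device-plugin-ctr
          image: nvcr.io/nvidia/k8s-device-plugin:v0.15.0-rc.2
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
```

Applying this with `kubectl apply -f` replaces whatever plugin version the cluster was previously running.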
@elezar @dbreyfogle After upgrading to 0.15.0-rc.2, K3D on WSL2 can run CUDA workloads normally. Thank you for your work, and we look forward to the official release of 0.15!
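For others verifying the same fix, a one-off CUDA smoke-test pod is a quick check; this sketch follows the pattern in the linked k3d CUDA docs (the sample image tag and the `nvidia` runtime class name are assumptions to adapt to your cluster):

```yaml
# Sketch of a CUDA smoke test: if the device plugin is advertising the GPU,
# the pod is scheduled and the vectorAdd sample runs to completion.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia  # assumed runtime class name
  containers:
    - name: cuda-vectoradd
      image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1
```

If the plugin is not registering the GPU, this pod instead stays `Pending` with a `FailedScheduling` event on the `nvidia.com/gpu` resource, which makes it a useful bisecting tool.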
1. Quick Debug Information
2. Issue or feature description
The current issue is that the nvidia-device-plugin pod can execute `nvidia-smi` successfully, but the plugin logs indicate that the GPU cannot be detected.
Detailed problem description: https://github.com/justinthelaw/k3d-gpu-support/issues/1

Reference: https://github.com/k3d-io/k3d/issues/1108#issuecomment-1616065479
3. Information to attach (optional if deemed irrelevant)
Common error checking:

- [X] The output of `nvidia-smi -a` on your host: NVIDIA-SMI-LOG.txt
- [X] Your docker configuration file (e.g. `/etc/docker/daemon.json`)
- [X] The k8s-device-plugin container logs
- [ ] The kubelet logs on the node (e.g. `sudo journalctl -r -u kubelet`)

Additional information that might help better understand your environment and reproduce the bug:
- `docker version`: Docker Desktop 4.28.0 (139021)
- `uname -a`
- `dmesg`
- `dpkg -l '*nvidia*'` or `rpm -qa '*nvidia*'`
- `nvidia-container-cli -V`
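For comparison when checking the docker configuration item above, a typical `/etc/docker/daemon.json` that registers the NVIDIA runtime looks roughly like this (a sketch of what `nvidia-ctk runtime configure --runtime=docker` generates; the binary path may differ on your system):

```json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```

If the `runtimes.nvidia` entry is missing, containers cannot be started with the NVIDIA runtime at all, which is worth ruling out before debugging the device plugin itself.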