-
The issue described in https://github.com/NVIDIA/nvidia-container-toolkit/issues/48, which is locked, states:
> A fix will be present in the next patch release of all supported NVIDIA GPU drivers…
-
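Current problem: when a pod submits a resource request to a queue, the pod can fail to schedule even though the queue has sufficient resources.
Case:
Cluster resources: cpu: 2, memory: 2, ScalarResources: "nvidia/gpu": 8
q1 capability: cpu: 0, memory: 0, ScalarResources: "nvidia/gpu": 0, weight: 1
q2 capability: cpu: 1, memory: 1 …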
-
### NVIDIA Open GPU Kernel Modules Version
530.41.03
### Does this happen with the proprietary driver (of the same version) as well?
Yes
### Operating System and Version
Arch Linux
#…
-
1. Quick Debug Information
OS/Version(e.g. RHEL8.6, Ubuntu22.04): Amazon Linux 2
Kernel Version: 5.10.217-205.860.amzn2
Container Runtime Type/Version(e.g. Containerd, CRI-O, Docker): Containerd 1.…
-
This is my GPU information
```
$ nvidia-smi
Tue Jun 25 15:36:28 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Versi…
```
-
There are two GPU devices on the Kubernetes node,
`timeSlicing.replicas` is two,
`nvidia.com/gpu` for the large language model is two,
`nvidia.com/gpu` for each of the other models is one,
but the pod of larg…
-
- kubectl -n gpu-operator logs -f nvidia-operator-validator-mjzpl -c toolkit-validation
```
time="2024-06-20T01:20:55Z" level=info msg="version: 762213f2"
NVIDIA-SMI couldn't find libnvidia-ml.so l…
```
-
### What happened?
When using the `ResourceQuotas` admission controller for extended resources (`nvidia.com/gpu`), the reported used resources are inconsistent and wrong.
### What did you expect to…
-
Having generic hardware module profiles available allows us to include them automatically
in tools like nixos-generate-config and co.
Hence I am proposing to upstream all our modules in `common` …
-
### Is there an existing issue for this?
- [X] I searched the existing issues and did not find anything similar.
### Current Behavior
When an Nvidia GPU is used, nothing is displayed other than the…