Closed Crema-new closed 3 months ago
I found that I can use nvidia-docker directly without any error output:
root@inspur:/home/devops/ais.stat# docker run --runtime=nvidia f5ac1ad505db nvidia-smi
Wed Aug 7 06:46:37 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.06             Driver Version: 535.183.06   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla T4                       Off | 00000000:B1:00.0 Off |                    0 |
| N/A   72C    P0              32W /  70W |      2MiB / 15360MiB |      7%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
So I figured the problem must be in docker-compose or its configuration. After checking docker-compose.yml, I found that the application was assigned two GPUs, while the machine actually has only one.
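For anyone hitting the same thing, here is a rough sketch of what a single-GPU request could look like in docker-compose.yml. The original compose file is not shown in this thread, so the service name `app` is hypothetical and the `deploy.resources.reservations.devices` layout is an assumption; a setup based on `runtime: nvidia` with `NVIDIA_VISIBLE_DEVICES` would need the equivalent change there instead. The image ID is the one from the `docker run` command above.

```yaml
services:
  app:                      # hypothetical service name, not from the original file
    image: f5ac1ad505db     # image ID used in the docker run test above
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1      # must not exceed the number of GPUs nvidia-smi reports
              capabilities: [gpu]
```

The point is only that the requested GPU count (or the listed device IDs) has to match what the host really has; with one Tesla T4, a request for two GPUs cannot be satisfied.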
I'm running into the same issue as #416.
error out
the same version of ubuntu
lower docker version
daemon.json
nvidia info
permissions of the device nodes
I've tried reinstalling the driver several times, but I still get the same error output.