kapara-jpg closed this issue 5 years ago
Hello!
This is an issue with the plugin; feel free to open an issue with its maintainers :)
It seems to try to isolate a non-existent GPU: device=no-gpu-has-1MiB-to-run.
My guess is that the plugin itself hit an error and passed that string along as the device ID :)
1. Issue or feature description
When trying to deploy jupyter-notebook with jupyterhub I get this error:
2019-08-08 10:02:17+00:00 [Warning] Error: failed to start container "notebook": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"process_linux.go:413: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig.real --device=no-gpu-has-1MiB-to-run --compute --utility --require=cuda>=10.1 brand=tesla,driver>=384,driver<385 brand=tesla,driver>=396,driver<397 brand=tesla,driver>=410,driver<411 --pid=13507 /var/lib/docker/overlay2/7e1caede1d313bdd0a23dbc1c841b130067c512115c9d2a15af64263d8c12c1e/merged]\\\\nnvidia-container-cli: device error: unknown device id: no-gpu-has-1MiB-to-run\\\\n\\\"\"": unknown
I'm using gpushare-device-plugin by Aliyun (Alibaba Cloud) Container Service (link).
This issue happens only when trying to deploy the notebook through k8s.
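To see which device the prestart hook was asked to isolate, the `--device=` argument can be pulled out of the error text. The following is only a rough sketch: the `error` string is a shortened copy of the message above, the helper names are made up for illustration, and the list of special values (`all`, `none`, `void`) is an assumption based on the values commonly documented for `NVIDIA_VISIBLE_DEVICES`.

```python
import re

# Keywords that do not name a specific GPU (assumption, see lead-in).
SPECIAL = {"all", "none", "void"}


def requested_devices(error_text):
    """Return the device ids passed via --device= in a hook error message."""
    match = re.search(r"--device=(\S+)", error_text)
    return match.group(1).split(",") if match else []


def looks_like_device_id(value):
    """True for a GPU index, a GPU-<uuid> string, or a special keyword."""
    return (
        value in SPECIAL
        or value.isdigit()
        or re.fullmatch(r"GPU-[0-9a-fA-F-]+", value) is not None
    )


# Shortened copy of the exec command from the error above.
error = (
    "exec command: [/usr/bin/nvidia-container-cli --load-kmods configure "
    "--ldconfig=@/sbin/ldconfig.real --device=no-gpu-has-1MiB-to-run "
    "--compute --utility]"
)

for dev in requested_devices(error):
    status = "ok" if looks_like_device_id(dev) else "not a GPU index or UUID"
    print(f"{dev}: {status}")
```

Here the value is neither an index nor a UUID, which is consistent with the "unknown device id" message from nvidia-container-cli.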
2. Steps to reproduce the issue
Every time I try to create a new notebook.
3. Information to attach (optional if deemed irrelevant)
nvidia-container-cli -k -d /dev/tty info
I0808 10:14:55.825632 21204 nvc.c:281] initializing library context (version=1.0.2, build=ff40da533db929bf515aca59ba4c701a65a35e6b)
I0808 10:14:55.825762 21204 nvc.c:255] using root /
I0808 10:14:55.825781 21204 nvc.c:256] using ldcache /etc/ld.so.cache
I0808 10:14:55.825810 21204 nvc.c:257] using unprivileged user 65534:65534
I0808 10:14:55.828359 21205 nvc.c:191] loading kernel module nvidia
I0808 10:14:55.828999 21205 nvc.c:203] loading kernel module nvidia_uvm
I0808 10:14:55.829430 21205 nvc.c:211] loading kernel module nvidia_modeset
I0808 10:14:55.830193 21206 driver.c:133] starting driver service
I0808 10:14:55.854227 21204 nvc_info.c:434] requesting driver information with ''
I0808 10:14:55.854658 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.418.67
I0808 10:14:55.854777 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.418.67 over /usr/lib/x86_64-linux-gnu/tls/libnvidia-tls.so.418.67
I0808 10:14:55.854873 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.418.67
I0808 10:14:55.855004 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.418.67
I0808 10:14:55.855135 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.418.67
I0808 10:14:55.855224 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.418.67
I0808 10:14:55.855360 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.418.67
I0808 10:14:55.855487 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.418.67
I0808 10:14:55.855577 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.418.67
I0808 10:14:55.855667 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.418.67
I0808 10:14:55.855791 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-fatbinaryloader.so.418.67
I0808 10:14:55.855881 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.418.67
I0808 10:14:55.856004 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.418.67
I0808 10:14:55.856189 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.418.67
I0808 10:14:55.856287 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.418.67
I0808 10:14:55.856420 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libnvcuvid.so.418.67
I0808 10:14:55.856717 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libcuda.so.418.67
I0808 10:14:55.856902 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.418.67
I0808 10:14:55.857001 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.418.67
I0808 10:14:55.857093 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.418.67
I0808 10:14:55.857180 21204 nvc_info.c:148] selecting /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.418.67
W0808 10:14:55.857241 21204 nvc_info.c:299] missing library libvdpau_nvidia.so
W0808 10:14:55.857260 21204 nvc_info.c:303] missing compat32 library libnvidia-ml.so
W0808 10:14:55.857282 21204 nvc_info.c:303] missing compat32 library libnvidia-cfg.so
W0808 10:14:55.857301 21204 nvc_info.c:303] missing compat32 library libcuda.so
W0808 10:14:55.857323 21204 nvc_info.c:303] missing compat32 library libnvidia-opencl.so
W0808 10:14:55.857338 21204 nvc_info.c:303] missing compat32 library libnvidia-ptxjitcompiler.so
W0808 10:14:55.857357 21204 nvc_info.c:303] missing compat32 library libnvidia-fatbinaryloader.so
W0808 10:14:55.857378 21204 nvc_info.c:303] missing compat32 library libnvidia-compiler.so
W0808 10:14:55.857394 21204 nvc_info.c:303] missing compat32 library libvdpau_nvidia.so
W0808 10:14:55.857416 21204 nvc_info.c:303] missing compat32 library libnvidia-encode.so
W0808 10:14:55.857435 21204 nvc_info.c:303] missing compat32 library libnvidia-opticalflow.so
W0808 10:14:55.857457 21204 nvc_info.c:303] missing compat32 library libnvcuvid.so
W0808 10:14:55.857471 21204 nvc_info.c:303] missing compat32 library libnvidia-eglcore.so
W0808 10:14:55.857488 21204 nvc_info.c:303] missing compat32 library libnvidia-glcore.so
W0808 10:14:55.857510 21204 nvc_info.c:303] missing compat32 library libnvidia-tls.so
W0808 10:14:55.857526 21204 nvc_info.c:303] missing compat32 library libnvidia-glsi.so
W0808 10:14:55.857547 21204 nvc_info.c:303] missing compat32 library libnvidia-fbc.so
W0808 10:14:55.857566 21204 nvc_info.c:303] missing compat32 library libnvidia-ifr.so
W0808 10:14:55.857588 21204 nvc_info.c:303] missing compat32 library libGLX_nvidia.so
W0808 10:14:55.857602 21204 nvc_info.c:303] missing compat32 library libEGL_nvidia.so
W0808 10:14:55.857619 21204 nvc_info.c:303] missing compat32 library libGLESv2_nvidia.so
W0808 10:14:55.857641 21204 nvc_info.c:303] missing compat32 library libGLESv1_CM_nvidia.so
I0808 10:14:55.858158 21204 nvc_info.c:229] selecting /usr/bin/nvidia-smi
I0808 10:14:55.858213 21204 nvc_info.c:229] selecting /usr/bin/nvidia-debugdump
I0808 10:14:55.858270 21204 nvc_info.c:229] selecting /usr/bin/nvidia-persistenced
I0808 10:14:55.858323 21204 nvc_info.c:229] selecting /usr/bin/nvidia-cuda-mps-control
I0808 10:14:55.858376 21204 nvc_info.c:229] selecting /usr/bin/nvidia-cuda-mps-server
I0808 10:14:55.858436 21204 nvc_info.c:366] listing device /dev/nvidiactl
I0808 10:14:55.858454 21204 nvc_info.c:366] listing device /dev/nvidia-uvm
I0808 10:14:55.858475 21204 nvc_info.c:366] listing device /dev/nvidia-uvm-tools
I0808 10:14:55.858495 21204 nvc_info.c:366] listing device /dev/nvidia-modeset
I0808 10:14:55.858569 21204 nvc_info.c:270] listing ipc /run/nvidia-persistenced/socket
W0808 10:14:55.858612 21204 nvc_info.c:274] missing ipc /tmp/nvidia-mps
I0808 10:14:55.858628 21204 nvc_info.c:490] requesting device information with ''
I0808 10:14:55.864650 21204 nvc_info.c:520] listing device /dev/nvidia0 (GPU-d5951a2f-baab-0503-82b1-920531e013bc at 00000000:02:00.0)
NVRM version: 418.67
CUDA version: 10.1

Device Index: 0
Device Minor: 0
Model: Quadro M4000
Brand: Quadro
GPU UUID: GPU-d5951a2f-baab-0503-82b1-920531e013bc
Bus Location: 00000000:02:00.0
Architecture: 5.2
I0808 10:14:55.864729 21204 nvc.c:318] shutting down library context
I0808 10:14:55.865073 21206 driver.c:192] terminating driver service
I0808 10:14:55.873420 21204 driver.c:233] driver service terminated successfully
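For comparison, the GPUs that the host actually exposes can be read from the `listing device /dev/nvidiaN (...)` line of the info output above. This is a minimal sketch, assuming the log format shown here stays the same; `visible_gpus` and the regular expression are illustrative helpers, not part of nvidia-container-cli.

```python
import re

# Matches only per-GPU device lines such as /dev/nvidia0, not /dev/nvidiactl.
DEVICE_RE = re.compile(
    r"listing device (/dev/nvidia(\d+)) "
    r"\((GPU-[0-9a-fA-F-]+) at ([0-9a-fA-F:.]+)\)"
)


def visible_gpus(log_text):
    """Return one dict per GPU found in `nvidia-container-cli info` output."""
    return [
        {
            "path": m.group(1),
            "minor": int(m.group(2)),
            "uuid": m.group(3),
            "bus": m.group(4),
        }
        for m in DEVICE_RE.finditer(log_text)
    ]


# Two lines copied from the log above.
log = (
    "I0808 10:14:55.858436 21204 nvc_info.c:366] listing device /dev/nvidiactl\n"
    "I0808 10:14:55.864650 21204 nvc_info.c:520] listing device /dev/nvidia0 "
    "(GPU-d5951a2f-baab-0503-82b1-920531e013bc at 00000000:02:00.0)\n"
)

gpus = visible_gpus(log)
known_ids = {g["uuid"] for g in gpus} | {str(g["minor"]) for g in gpus}

requested = "no-gpu-has-1MiB-to-run"  # value from the failing hook command
print(requested in known_ids)
```

The host reports a single Quadro M4000, and the requested string matches neither its UUID nor its minor number, so the driver and container toolkit themselves appear to be working.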