4paradigm / k8s-vgpu-scheduler

The OpenAIOS vGPU device plugin for Kubernetes originated in the OpenAIOS project. It virtualizes GPU device memory so that applications can access a larger memory space than the device's physical capacity, and it is designed to make extended device memory easy to use for AI workloads.
Apache License 2.0

Error: failed to create FS watcher: no such file or directory #6

Closed: qifengz closed this 2 years ago

qifengz commented 3 years ago

2021/08/26 07:14:50 Loading PciInfo
0 = 00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
1 = 00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
2 = 00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
3 = 00:01.2 USB controller: Intel Corporation 82371SB PIIX3 USB [Natoma/Triton II] (rev 01)
4 = 00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
5 = 00:02.0 VGA compatible controller: Cirrus Logic GD 5446
6 = 00:03.0 Multimedia audio controller: Intel Corporation 82801AA AC'97 Audio Controller (rev 01)
7 = 00:04.0 Communication controller: Red Hat, Inc. Virtio console
8 = 00:05.0 SCSI storage controller: Red Hat, Inc. Virtio block device
9 = 00:06.0 SCSI storage controller: Red Hat, Inc. Virtio block device
10 = 00:07.0 SCSI storage controller: Red Hat, Inc. Virtio block device
11 = 00:08.0 Ethernet controller: Red Hat, Inc. Virtio network device
12 = 00:09.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
found 00:09.0
13 = 00:0a.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
found 00:0a.0
14 = 00:0b.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
found 00:0b.0
15 = 00:0c.0 3D controller: NVIDIA Corporation GV100GL [Tesla V100 SXM2 16GB] (rev a1)
found 00:0c.0
16 = 00:0d.0 Unclassified device [00ff]: Red Hat, Inc. Virtio memory balloon
17 = 00:0e.0 Ethernet controller: Red Hat, Inc. Virtio network device
18 = 00:0f.0 Ethernet controller: Red Hat, Inc. Virtio network device
19 = 00:10.0 Ethernet controller: Red Hat, Inc. Virtio network device
2021/08/26 07:14:50 Loading NVML
20 = 00:11.0 Ethernet controller: Red Hat, Inc. Virtio network device
21 = 00:12.0 Ethernet controller: Red Hat, Inc. Virtio network device
22 = 00:13.0 Ethernet controller: Red Hat, Inc. Virtio network device
23 = 00:14.0 Ethernet controller: Red Hat, Inc. Virtio network device
24 =
pcibusstr= 00:09.0 00:0a.0 00:0b.0 00:0c.0

2021/08/26 07:14:50 Starting FS watcher.
2021/08/26 07:14:50 Shutdown of NVML returned:
2021/08/26 07:14:50 Error: failed to create FS watcher: no such file or directory

Driver Version: 440.64.00

archlitchi commented 3 years ago

Hello, there may be a problem with your kubelet installation or version. Normally, the directory /var/lib/kubelet/device-plugins should exist on the node.
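
To confirm this diagnosis on the affected node, a small check along the following lines may help. It is only a sketch: the helper name `check_device_plugin_dir` is hypothetical and not part of k8s-vgpu-scheduler, and it assumes the "no such file or directory" error comes from the plugin trying to watch a directory that does not exist.

```python
import os

# Path quoted in the reply above; it should normally exist on the node.
DEVICE_PLUGIN_DIR = "/var/lib/kubelet/device-plugins"


def check_device_plugin_dir(path=DEVICE_PLUGIN_DIR):
    """Return (ok, message) describing whether `path` is an existing directory.

    Hypothetical helper for illustration only; not part of k8s-vgpu-scheduler.
    """
    if not os.path.exists(path):
        return False, f"{path}: no such file or directory (check the kubelet installation)"
    if not os.path.isdir(path):
        return False, f"{path}: exists but is not a directory"
    return True, f"{path}: directory exists"


if __name__ == "__main__":
    ok, message = check_device_plugin_dir()
    print(message)
    # Exit non-zero when the directory is missing so the check can be scripted.
    raise SystemExit(0 if ok else 1)
```

If the script reports that the directory is missing, reinstalling or restarting kubelet on that node, as suggested above, is the likely next step before redeploying the device plugin.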

qifengz commented 3 years ago

ok, thanks.