Open Fvoiretryzig opened 2 years ago
Hey, has this problem been solved?
@phoenixwu0229 Not yet.
I ran into this problem on OpenShift 4 as well. Following the FAQ, I changed container-runtime-endpoint and set the cgroup driver to systemd, so EXTRA_FLAGS went from
- name: EXTRA_FLAGS
  value: "--logtostderr=false"
to
- name: EXTRA_FLAGS
  value: "--logtostderr=false --container-runtime-endpoint=/var/run/crio/crio.sock --cgroup-driver=systemd"
The container then fails on startup with:
rebuild ldcache
launch gpu manager
E0516 02:59:32.771447 1270729 server.go:131] Unable to set Type=notify in systemd service file?
F0516 02:59:33.872799 1270729 tree.go:102] Can not initialize nvidia tree, err no input
goroutine 10 [running]:
k8s.io/klog.stacks(0xc000109c00, 0xc000016000, 0x58, 0x193)
    /go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:875 +0xb8
k8s.io/klog.(loggingT).output(0x27ae5a0, 0xc000000003, 0xc0001c0230, 0x250db7f, 0x7, 0x66, 0x0)
    /go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:826 +0x330
k8s.io/klog.(loggingT).printf(0x27ae5a0, 0x3, 0x17d4c8c, 0x26, 0xc0003ebe30, 0x1, 0x1)
    /go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:707 +0x14b
k8s.io/klog.Fatalf(...)
    /go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:1276
tkestack.io/gpu-manager/pkg/device/nvidia.(NvidiaTree).Init(0xc0001c6140, 0x0, 0x0)
    /root/rpmbuild/BUILD/gpu-manager-1.1.5/pkg/device/nvidia/tree.go:102 +0x128
tkestack.io/gpu-manager/pkg/server.(managerImpl).Run(0xc00004a7c0, 0xc000136dc0, 0x0)
    /root/rpmbuild/BUILD/gpu-manager-1.1.5/pkg/server/server.go:171 +0x66b
created by tkestack.io/gpu-manager/cmd/manager/app.Run
    /root/rpmbuild/BUILD/gpu-manager-1.1.5/cmd/manager/app/app.go:83 +0x3da
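Before digging into the stack trace, it may be worth confirming that the runtime endpoint passed in EXTRA_FLAGS actually exists where gpu-manager looks for it. The snippet below is only a sketch: the helper name `is_unix_socket` is made up for illustration, and it assumes you run it on the node (or inside the container, if that path is mounted there).

```shell
#!/bin/sh
# Sketch: succeed only if the given path exists and is a Unix domain socket.
# is_unix_socket is a hypothetical helper, not part of gpu-manager.
is_unix_socket() {
    [ -S "$1" ]
}

# Path taken from the EXTRA_FLAGS value quoted in this thread; adjust if your node differs.
sock="/var/run/crio/crio.sock"
if is_unix_socket "$sock"; then
    echo "$sock is a socket"
else
    echo "$sock is missing or not a socket"
fi
```

A missing socket would not by itself explain the "Can not initialize nvidia tree" message, but it rules out one source of trouble.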
Try to install the NVIDIA GPU driver first.
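One rough way to check whether the driver is present on a node is to look for NVIDIA device nodes and for `nvidia-smi`. This is a sketch under assumptions: `has_nvidia_devices` is a hypothetical helper, and the `/dev/nvidia*` naming applies to the standard discrete-GPU driver. Jetson boards may expose different device nodes and may not ship `nvidia-smi`, so a negative result there is not conclusive.

```shell
#!/bin/sh
# Sketch: report whether NVIDIA device nodes exist under a directory (default /dev).
# has_nvidia_devices is a hypothetical helper for illustration.
has_nvidia_devices() {
    dev_dir="${1:-/dev}"
    for node in "$dev_dir"/nvidia*; do
        # If the glob matched nothing, $node is the literal pattern and -e fails.
        if [ -e "$node" ]; then
            return 0
        fi
    done
    return 1
}

if has_nvidia_devices; then
    echo "NVIDIA device nodes found"
else
    echo "no NVIDIA device nodes under /dev; is the driver installed and loaded?"
fi

# nvidia-smi ships with the desktop/server driver; it may be absent on Jetson boards.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L
else
    echo "nvidia-smi not found on PATH"
fi
```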
Same problem, v1.9.0
gpu-manager 1.0.9 & v1.1.5
May I ask, did you get it running on Jetson? I've run into this problem too.
I compiled gpu-manager for arm64 and ran it on a Jetson Nano. However, when I run
kubectl create -f gpu-manager.yaml
it shows an error. Following issues #7 and #40, I modified the YAML file and made sure the Docker runtime is runc, not nvidia-container-runtime. This is my YAML file:
I copied the .kube directory from the master node to each worker node. How can I deal with this error?
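For anyone repeating the runtime check mentioned above, Docker's default runtime can be read with `docker info --format '{{.DefaultRuntime}}'`. The snippet is a minimal sketch: `runtime_is_runc` is a hypothetical helper, and it assumes the `docker` CLI is on PATH and can reach the daemon.

```shell
#!/bin/sh
# Sketch: compare a runtime name against "runc".
# runtime_is_runc is a hypothetical helper for illustration.
runtime_is_runc() {
    [ "$1" = "runc" ]
}

if command -v docker >/dev/null 2>&1; then
    # DefaultRuntime is a field of `docker info`; output is empty if the daemon is unreachable.
    default_runtime=$(docker info --format '{{.DefaultRuntime}}' 2>/dev/null)
    if runtime_is_runc "$default_runtime"; then
        echo "default runtime is runc"
    else
        echo "default runtime is '${default_runtime}', expected runc"
    fi
else
    echo "docker not found on PATH"
fi
```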