Closed: difenbei closed this issue 1 year ago
Please provide the vcuda-controller log. For how to dump the log, see the gpu-manager FAQ.
@mYmNeo Hi, I followed the FAQ to set the environment variable, but I still don't get the vcuda-controller log. Should the environment variable be set in the pod that uses the GPU card?
@difenbei I ran into the same problem. Did you solve it? My logs are below.
/tmp/cuda-control/src/loader.c:1102 config file: /etc/vcuda/7eadf10c1933050f72f33123c4013720907258d292e4695bbcc0732b2afa2405/vcuda.config
/tmp/cuda-control/src/loader.c:1103 pid file: /etc/vcuda/7eadf10c1933050f72f33123c4013720907258d292e4695bbcc0732b2afa2405/pids.config
/tmp/cuda-control/src/loader.c:1107 register to remote: pod uid: ad51fa3f-4b64-11ed-98e3-00163e144b97, cont id: 7eadf10c1933050f72f33123c4013720907258d292e4695bbcc0732b2afa2405
/tmp/cuda-control/src/loader.c:1205 pod uid : ad51fa3f-4b64-11ed-98e3-00163e144b97
/tmp/cuda-control/src/loader.c:1206 limit : 0
/tmp/cuda-control/src/loader.c:1207 container name : tensorflow-test
/tmp/cuda-control/src/loader.c:1208 total utilization: 30
/tmp/cuda-control/src/loader.c:1209 total gpu memory : 4294967296
/tmp/cuda-control/src/loader.c:1210 driver version : 470.57.02
/tmp/cuda-control/src/loader.c:1211 hard limit mode : 1
/tmp/cuda-control/src/loader.c:1212 enable mode : 1
/tmp/cuda-control/src/loader.c:913 Start hijacking
/tmp/cuda-control/src/loader.c:929 can't find function cuEGLInit in libcuda.so.470.57.02
/tmp/cuda-control/src/loader.c:876 can't find function nvmlDeviceGetBusType in libnvidia-ml.so.470.57.02
/tmp/cuda-control/src/loader.c:876 can't find function nvmlDeviceGetIrqNum in libnvidia-ml.so.470.57.02
/tmp/cuda-control/src/loader.c:876 can't find function nvmlVgpuInstanceGetLicenseInfo in libnvidia-ml.so.470.57.02
/tmp/cuda-control/src/loader.c:883 Hijacking nvmlInit
/tmp/cuda-control/src/hijack_call.c:466 cuInit error unknown error
However, it tested OK with driver version 460.32.03.
Did you reboot your machine after upgrading your driver?
Thanks, that was the cause. After I rebooted the machine, it works.
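For anyone hitting the same `cuInit error unknown error` after a driver upgrade: one plausible explanation (not confirmed in this thread) is that the old kernel module stays loaded until reboot while the user-space libraries are already the new version. Below is a minimal Python sketch for checking this, assuming the loaded module version can be read from `/proc/driver/nvidia/version` and that you pass in the user-space version yourself (for example, the `driver version` value from the vcuda log). The function names `kernel_module_version` and `needs_reboot` are hypothetical helpers, not part of vcuda or gpu-manager.

```python
import re
import sys
from pathlib import Path
from typing import Optional

# Matches dotted driver versions such as 470.57.02 or 460.32.03.
VERSION_RE = re.compile(r"\b(\d+\.\d+(?:\.\d+)?)\b")

# Assumption: the loaded NVIDIA kernel module reports its version here.
PROC_VERSION_FILE = Path("/proc/driver/nvidia/version")


def kernel_module_version(proc_text: str) -> Optional[str]:
    """Return the version found on the 'NVRM version:' line, or None."""
    for line in proc_text.splitlines():
        if line.startswith("NVRM version:"):
            match = VERSION_RE.search(line)
            return match.group(1) if match else None
    return None


def needs_reboot(proc_text: str, userspace_version: str) -> bool:
    """True when the loaded kernel module differs from the user-space libraries."""
    loaded = kernel_module_version(proc_text)
    return loaded is not None and loaded != userspace_version


if __name__ == "__main__":
    # Usage: python check_driver.py 470.57.02
    if len(sys.argv) != 2 or not PROC_VERSION_FILE.exists():
        sys.exit("usage: check_driver.py <userspace driver version> (on a host with the NVIDIA module loaded)")
    text = PROC_VERSION_FILE.read_text()
    print("loaded kernel module:", kernel_module_version(text))
    print("mismatch, reboot or reload the module:", needs_reboot(text, sys.argv[1]))
```

If the two versions differ, rebooting (or unloading and reloading the NVIDIA kernel modules) is the usual fix, which matches what worked above.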
I tried to use vcuda with driver version 470.57.02, and the program may fail without any warning. Does vcuda need to be updated for CUDA 11.4? Thanks!