Open zhangzhiqiangcs opened 1 month ago
Is there any detailed error information?
On Ubuntu 22.04, running nvidia-smi prints this message:
[HAMI-core Msg(15:140394083415872:multiprocess_memory_limit.c:468)]: Calling exit handler 15
Running nvidia-debugdump -l -z outputs:
[HAMI-core Warn(16:139651829421888:hook.c:278)]: Warning dlsym not found before libraries load
nvidia-debugdump: symbol lookup error: /usr/local/vgpu/libvgpu.so: undefined symbol: _dl_sym, version GLIBC_PRIVATE
On Ubuntu 20.04, running nvidia-smi prints this message:
[HAMI-core Msg(16:139892921730880:multiprocess_memory_limit.c:468)]: Calling exit handler 16
Running nvidia-debugdump -l -z outputs:
[HAMI-core Warn(17:140216222119744:hook.c:278)]: Warning dlsym not found before libraries load
nvmlInit succeeded
Listing all GPUs.
Found 1 NVIDIA devices
Warning: nvmlDeviceGetSerial: Not Supported
Device ID: 0
Device name: NVIDIA xxx
Device handle: xxx
GPU internal ID: GPU-xxx
We also run a pod on 22.04, but CUDA works fine.
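Please provide an in-depth description of the question you have: I couldn't find any documentation on which Ubuntu versions HAMi supports. When the pod's base image is Ubuntu 22.04, CUDA does not seem to work. The glibc version may be related, since on 22.04 libvgpu.so fails to resolve the symbol _dl_sym (version GLIBC_PRIVATE), while the same command succeeds on 20.04.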
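To help narrow down the glibc hypothesis, here is a minimal diagnostic sketch that can be run inside both base images. It is not part of HAMi. It prints the C library version reported by Python and checks whether libc.so.6 still resolves _dl_sym through dlsym. The helper names (parse_version, exports_symbol, report) are made up for this example, and the 2.34 threshold is an assumption based on _dl_sym being a private glibc symbol that newer releases no longer appear to export.

```python
import ctypes
import platform

# Assumption: glibc releases from 2.34 onward no longer export the private
# _dl_sym symbol. Treat this threshold as a hint, not a confirmed fact.
ASSUMED_REMOVAL_VERSION = (2, 34)


def parse_version(text):
    """Turn a version string such as '2.35' into a comparable tuple (2, 35)."""
    return tuple(int(part) for part in text.split(".")[:2])


def exports_symbol(lib, name):
    """Return True if attribute lookup on `lib` finds `name`.

    For a ctypes.CDLL object this lookup goes through dlsym and raises
    AttributeError when the symbol cannot be resolved.
    """
    try:
        getattr(lib, name)
        return True
    except AttributeError:
        return False


def report():
    """Print the C library version and whether _dl_sym can be resolved."""
    libc_name, libc_version = platform.libc_ver()
    print(f"C library: {libc_name or 'unknown'} {libc_version or 'unknown'}")
    if libc_name == "glibc" and libc_version:
        newer = parse_version(libc_version) >= ASSUMED_REMOVAL_VERSION
        print(f"at or above assumed removal version 2.34: {newer}")
    try:
        libc = ctypes.CDLL("libc.so.6")
    except OSError as err:
        print(f"could not load libc.so.6: {err}")
        return
    print(f"_dl_sym resolvable in libc.so.6: {exports_symbol(libc, '_dl_sym')}")


if __name__ == "__main__":
    report()
```

If the 22.04 image reports that _dl_sym is not resolvable while the 20.04 image reports that it is, that would be consistent with the symbol lookup error above and would point at how libvgpu.so hooks dlsym, rather than at CUDA itself.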
What do you think about this question?:
Environment: