NVIDIA / nvkind

Migrate from NVIDIA/go-nvlib to NVIDIA/go-nvml and link against nvidia-ml library #9

Open TheAifam5 opened 1 month ago

TheAifam5 commented 1 month ago

Hey,

Since release 0.3.0 of the NVIDIA/go-nvlib package, the nvml package has been moved to NVIDIA/go-nvml.

Without the migration, nvkind fails with a missing-symbol error:

/usr/bin/nvkind: symbol lookup error: /usr/bin/nvkind: undefined symbol: nvmlDeviceCcuSetStreamState

The other problem was symbol resolution: for some reason, the nvkind binary did not have a dynamic dependency on libnvidia-ml.so:

ldd ./nvkind
    linux-vdso.so.1 (0x0000759e0209b000)
    libresolv.so.2 => /usr/lib64/libresolv.so.2 (0x0000759e0206b000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x0000759e01e6f000)
    /lib64/ld-linux-x86-64.so.2 (0x0000759e0209d000)

I decided to add an extra ldflags entry to link explicitly against the missing dependency, which results in:

ldd ./nvkind
    linux-vdso.so.1 (0x0000777f36c7e000)
    libresolv.so.2 => /usr/lib64/libresolv.so.2 (0x0000777f36c4e000)
    libnvidia-ml.so.1 => /usr/lib64/libnvidia-ml.so.1 (0x0000777f35a00000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x0000777f36a52000)
    libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x0000777f359fb000)
    libm.so.6 => /usr/lib64/libm.so.6 (0x0000777f35943000)
    libdl.so.2 => /usr/lib64/libdl.so.2 (0x0000777f3593e000)
    librt.so.1 => /usr/lib64/librt.so.1 (0x0000777f35939000)
    /lib64/ld-linux-x86-64.so.2 (0x0000777f36c80000)
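
To make the before/after comparison above repeatable, for example as a check after building, the ldd output can be inspected programmatically. The following is only a minimal sketch and not part of this change: the helper names neededLibs and hasDynamicDep are hypothetical, and it assumes the ldd text has already been captured as a string, using a few lines from the outputs above as sample input.

```go
package main

import (
	"fmt"
	"strings"
)

// neededLibs returns the first field of every non-empty line of ldd output,
// i.e. the library name (or path) that the dynamic loader reports.
// (Hypothetical helper, for illustration only.)
func neededLibs(lddOutput string) []string {
	var libs []string
	for _, line := range strings.Split(lddOutput, "\n") {
		fields := strings.Fields(line)
		if len(fields) == 0 {
			continue
		}
		libs = append(libs, fields[0])
	}
	return libs
}

// hasDynamicDep reports whether any listed library name starts with prefix,
// so "libnvidia-ml.so" also matches the versioned "libnvidia-ml.so.1".
func hasDynamicDep(lddOutput, prefix string) bool {
	for _, lib := range neededLibs(lddOutput) {
		if strings.HasPrefix(lib, prefix) {
			return true
		}
	}
	return false
}

func main() {
	// Sample input: a subset of the two ldd outputs quoted in this description.
	before := `
    linux-vdso.so.1 (0x0000759e0209b000)
    libresolv.so.2 => /usr/lib64/libresolv.so.2 (0x0000759e0206b000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x0000759e01e6f000)
    /lib64/ld-linux-x86-64.so.2 (0x0000759e0209d000)`

	after := `
    linux-vdso.so.1 (0x0000777f36c7e000)
    libresolv.so.2 => /usr/lib64/libresolv.so.2 (0x0000777f36c4e000)
    libnvidia-ml.so.1 => /usr/lib64/libnvidia-ml.so.1 (0x0000777f35a00000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x0000777f36a52000)`

	fmt.Println("before:", hasDynamicDep(before, "libnvidia-ml.so")) // before: false
	fmt.Println("after:", hasDynamicDep(after, "libnvidia-ml.so"))   // after: true
}
```

Matching on a name prefix rather than the full name keeps the check independent of the library's version suffix and of the install path, which differ between distributions.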

Let me know if you are fine with this change.

Best regards, TheAifam5