shkhln / libc6-shim

Cheap glibc knockoff
MIT License

Assertion after updating to 13.2. #8

Closed: verm closed this issue 1 year ago

verm commented 1 year ago

I now get an assertion failure when trying to run nv-sglrun nvidia-smi. This is with nvidia-driver-525.116.03. It gives me this error:

Assertion failed: (!str_starts_with(path, "/dev/")), function shim_remove_impl, file src/libc/stdio.c, line 45.

If I remove the assertion I get:

Failed to initialize NVML: GPU access blocked by the operating system

This is using the latest port. I also tried downgrading to 20211220 and got the same error.

The full log from the failing run is below:

# SHIM_DEBUG=1 nv-sglrun nvidia-smi
shim init
[22413:101945] shim_getpid()
[22413:101945] shim_getpid -> 22413
[22413:101945] shim_getenv("__NVML_DBG_LVL")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_DBG_APPEND")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_DBG_FILE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_gettimeofday(0x8025b4410, 0x0)
[22413:101945] shim_gettimeofday -> 0
[22413:101945] shim_memset(0x8019bcba0, 0, 12509464)
[22413:101945] shim_memset -> 0x8019bcba0
[22413:101945] shim_getenv("__NVML_CRAY_PSTATE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVIDIA_NVML_3373")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_ONLY_DAEMON_PERSISTENCE_MODE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_fopen("/proc/modules", "r")
[22413:101945] shim_fopen -> 0x0
[22413:101945] shim___xstat(1, "/sys/bus/pci/devices", 0x7fffffffb3f0)
[22413:101945] shim___xstat -> -1
[22413:101945] shim___errno_location()
[22413:101945] shim___errno_location -> 0x80090e890
[22413:101945] shim_geteuid()
[22413:101945] shim_geteuid -> 1001
[22413:101945] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim___xstat(1, "/usr/bin/nvidia-modprobe", 0x7fffffffb910)
[22413:101945] shim___xstat -> -1
[22413:101945] shim_fopen("/proc/driver/nvidia/params", "r")
[22413:101945] shim_fopen -> 0x800919f70
[22413:101945] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[22413:101945] shim___isoc99_fscanf -> 2
[22413:101945] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[22413:101945] shim___isoc99_fscanf -> 1
[22413:101945] shim_fclose(0x800919f70)
[22413:101945] shim_fclose -> 0
[22413:101945] shim_snprintf(0x7fffffffb650, 128, "/dev/char/%d:%d", ...)
[22413:101945] shim_snprintf -> 17
[22413:101945] shim___xstat(1, "/dev/nvidiactl", 0x7fffffffb7e0)
[22413:101945] shim___xstat -> 0
[22413:101945] shim_snprintf(0x7fffffffb6d0, 128, "../%s", ...)
[22413:101945] shim_snprintf -> 12
[22413:101945] shim_remove("/dev/char/195:255")
Assertion failed: (!str_starts_with(path, "/dev/")), function shim_remove_impl, file src/libc/stdio.c, line 45.
Abort trap (core dumped)

Thank you.

shkhln commented 1 year ago

Fixed in c954193350e00937775d02eb1dac768d6b28c403. ModifyDeviceFiles: 0 alone is apparently not sufficient now.
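
For readers following along, the failure in the log comes from a guard that refuses remove() calls on paths under /dev/. Below is a minimal sketch of such a guard that reports an error to the caller instead of aborting. It is not taken from the commit above or from src/libc/stdio.c: starts_with and guarded_remove are hypothetical names, and returning EACCES is an assumption about one reasonable way to handle the case.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the shim's str_starts_with helper. */
static bool starts_with(const char *s, const char *prefix) {
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Illustrative wrapper around remove(): paths under /dev/ are never
 * deleted. Instead of asserting, the call fails with EACCES so the
 * caller can carry on and handle the error itself.
 */
static int guarded_remove(const char *path) {
    if (starts_with(path, "/dev/")) {
        errno = EACCES;
        return -1;
    }
    return remove(path);
}
```

With this shape, a call such as guarded_remove("/dev/char/195:255") returns -1 with errno set to EACCES rather than ending in an abort trap. Whether the library then proceeds correctly depends on how it treats that failure.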