Closed verm closed 1 year ago
I now get an assertion failure when trying to run nv-sglrun nvidia-smi. This is using nvidia-driver-525.116.03. It gives me this error:
Assertion failed: (!str_starts_with(path, "/dev/")), function shim_remove_impl, file src/libc/stdio.c, line 45.
If I remove the assertion I get:
Failed to initialize NVML: GPU access blocked by the operating system
This is using the latest port. I also tried downgrading to 20211220 and got the same error.
The full log is below when trying to run:
# SHIM_DEBUG=1 nv-sglrun nvidia-smi
shim init
[22413:101945] shim_getpid()
[22413:101945] shim_getpid -> 22413
[22413:101945] shim_getenv("__NVML_DBG_LVL")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_DBG_APPEND")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_DBG_FILE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_gettimeofday(0x8025b4410, 0x0)
[22413:101945] shim_gettimeofday -> 0
[22413:101945] shim_memset(0x8019bcba0, 0, 12509464)
[22413:101945] shim_memset -> 0x8019bcba0
[22413:101945] shim_getenv("__NVML_CRAY_PSTATE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVIDIA_NVML_3373")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__NVML_ONLY_DAEMON_PERSISTENCE_MODE")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim_fopen("/proc/modules", "r")
[22413:101945] shim_fopen -> 0x0
[22413:101945] shim___xstat(1, "/sys/bus/pci/devices", 0x7fffffffb3f0)
[22413:101945] shim___xstat -> -1
[22413:101945] shim___errno_location()
[22413:101945] shim___errno_location -> 0x80090e890
[22413:101945] shim_geteuid()
[22413:101945] shim_geteuid -> 1001
[22413:101945] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[22413:101945] shim_getenv -> 0x0
[22413:101945] shim___xstat(1, "/usr/bin/nvidia-modprobe", 0x7fffffffb910)
[22413:101945] shim___xstat -> -1
[22413:101945] shim_fopen("/proc/driver/nvidia/params", "r")
[22413:101945] shim_fopen -> 0x800919f70
[22413:101945] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u ", ...)
[22413:101945] shim___isoc99_fscanf -> 2
[22413:101945] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u ", ...)
[22413:101945] shim___isoc99_fscanf -> 1
[22413:101945] shim_fclose(0x800919f70)
[22413:101945] shim_fclose -> 0
[22413:101945] shim_snprintf(0x7fffffffb650, 128, "/dev/char/%d:%d", ...)
[22413:101945] shim_snprintf -> 17
[22413:101945] shim___xstat(1, "/dev/nvidiactl", 0x7fffffffb7e0)
[22413:101945] shim___xstat -> 0
[22413:101945] shim_snprintf(0x7fffffffb6d0, 128, "../%s", ...)
[22413:101945] shim_snprintf -> 12
[22413:101945] shim_remove("/dev/char/195:255")
Assertion failed: (!str_starts_with(path, "/dev/")), function shim_remove_impl, file src/libc/stdio.c, line 45.
Abort trap (core dumped)
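For anyone reading the log: the last few calls format a path with "/dev/char/%d:%d" (length 17, which matches "/dev/char/195:255") and then pass it to shim_remove(), where the prefix check fires. Below is a minimal C sketch of that check, not the shim's actual code. Only the format string, the path, and the asserted expression come from the log above. The body of str_starts_with and the helper names format_char_dev_path, remove_would_assert, and assert_not_dev_path are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the shim's helper of the same name;
 * the real implementation may differ. */
static bool str_starts_with(const char *str, const char *prefix) {
    return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Formats a path the same way the logged snprintf call does.
 * Returns the formatted length, as snprintf does. */
static int format_char_dev_path(char *buf, size_t size, int major, int minor) {
    return snprintf(buf, size, "/dev/char/%d:%d", major, minor);
}

/* True when the path would trip the assertion quoted in the report. */
static bool remove_would_assert(const char *path) {
    return str_starts_with(path, "/dev/");
}

/* Mirrors the asserted expression from the report: aborts for any
 * path under /dev/ (illustrative helper, not the shim's function). */
static void assert_not_dev_path(const char *path) {
    assert(!str_starts_with(path, "/dev/"));
}
```

With major 195 and minor 255, format_char_dev_path produces "/dev/char/195:255", and remove_would_assert returns true for it, which is consistent with the abort at the end of the log.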
Thank you.
Fixed in c954193350e00937775d02eb1dac768d6b28c403. ModifyDeviceFiles: 0 alone is apparently not sufficient now.