shkhln / libc6-shim

Cheap glibc knockoff
MIT License

Failed to initialize NVML: GPU access blocked by the operating system #9

Closed verm closed 1 year ago

verm commented 1 year ago

The assertion was fixed in c954193, but now I get the error in the title.

I've been reading up on what causes this error, but it doesn't seem straightforward. I have, of course, tried this as root.

I'll try downgrading the driver version later today. I'm running 525.105.17 on FreeBSD 13.2 with an RTX 3060.

Thanks.

shkhln commented 1 year ago

Log? I didn't just amend the assertion check, I also verified that the whole thing works with 525.

verm commented 1 year ago

Oh shoot, I forgot to add it. Here it is:

# SHIM_DEBUG=1  nv-sglrun nvidia-smi
shim init
[2525:112432] shim_getpid()
[2525:112432] shim_getpid -> 2525
[2525:112432] shim_getenv("__NVML_DBG_LVL")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_getenv("__NVML_DBG_APPEND")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_getenv("__NVML_DBG_FILE")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_gettimeofday(0x8025b4410, 0x0)
[2525:112432] shim_gettimeofday -> 0
[2525:112432] shim_memset(0x8019bcba0, 0, 12509464)
[2525:112432] shim_memset -> 0x8019bcba0
[2525:112432] shim_getenv("__NVML_CRAY_PSTATE")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_getenv("__NVIDIA_NVML_3373")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_getenv("__NVML_ONLY_DAEMON_PERSISTENCE_MODE")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_fopen("/proc/modules", "r")
[2525:112432] shim_fopen -> 0x0
[2525:112432] shim___xstat(1, "/sys/bus/pci/devices", 0x7fffffffb400)
[2525:112432] shim___xstat -> -1
[2525:112432] shim___errno_location()
[2525:112432] shim___errno_location -> 0x80090e890
[2525:112432] shim_geteuid()
[2525:112432] shim_geteuid -> 0
[2525:112432] shim_fopen("/proc/sys/kernel/modprobe", "r")
[2525:112432] shim_fopen -> 0x0
[2525:112432] shim___xstat(1, "/sbin/modprobe", 0x7fffffffb500)
[2525:112432] shim___xstat -> -1
[2525:112432] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim___xstat(1, "/usr/bin/nvidia-modprobe", 0x7fffffffb920)
[2525:112432] shim___xstat -> -1
[2525:112432] shim_fopen("/proc/driver/nvidia/params", "r")
[2525:112432] shim_fopen -> 0x800919f70
[2525:112432] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[2525:112432] shim___isoc99_fscanf -> 2
[2525:112432] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[2525:112432] shim___isoc99_fscanf -> 1
[2525:112432] shim_fclose(0x800919f70)
[2525:112432] shim_fclose -> 0
[2525:112432] shim_snprintf(0x7fffffffb660, 128, "/dev/char/%d:%d", ...)
[2525:112432] shim_snprintf -> 17
[2525:112432] shim___xstat(1, "/dev/nvidiactl", 0x7fffffffb7f0)
[2525:112432] shim___xstat -> 0
[2525:112432] shim_snprintf(0x7fffffffb6e0, 128, "../%s", ...)
[2525:112432] shim_snprintf -> 12
[2525:112432] shim_remove("/dev/char/195:255")
[2525:112432] shim_remove -> -1
[2525:112432] shim_symlink("../nvidiactl", "/dev/char/195:255")
[2525:112432] shim_symlink -> -1
[2525:112432] shim___xstat(1, "/dev/char/195:255", 0x7fffffffb760)
[2525:112432] shim___xstat -> -1
[2525:112432] shim___errno_location()
[2525:112432] shim___errno_location -> 0x80090e890
[2525:112432] shim_snprintf(0x7fffffffb990, 32, "-c=%d", ...)
[2525:112432] shim_snprintf -> 6
[2525:112432] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim___xstat(1, "/usr/bin/nvidia-modprobe", 0x7fffffffb8e0)
[2525:112432] shim___xstat -> -1
[2525:112432] shim_fopen("/proc/driver/nvidia/params", "r")
[2525:112432] shim_fopen -> 0x800919f70
[2525:112432] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[2525:112432] shim___isoc99_fscanf -> 2
[2525:112432] shim___isoc99_fscanf(0x800919f70, "%31[^:]: %u
", ...)
[2525:112432] shim___isoc99_fscanf -> 1
[2525:112432] shim_fclose(0x800919f70)
[2525:112432] shim_fclose -> 0
[2525:112432] shim___xstat(1, "/dev/nvidiactl", 0x7fffffffb820)
[2525:112432] shim___xstat -> 0
[2525:112432] shim_getenv("__RM_ENABLE_VERBOSE_OUTPUT")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_fopen("/dev/nvidiactl", "r")
[2525:112432] shim_fopen -> 0x800919f70
[2525:112432] shim_fclose(0x800919f70)
[2525:112432] shim_fclose -> 0
Failed to initialize NVML: GPU access blocked by the operating system
[2525:112432] shim___cxa_finalize(0x8019bc600)
[2525:112432] shim___cxa_finalize -> void
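
A log like the one above is long, so it can help to filter it down to the calls that returned an error before reading it line by line. The following is only a rough sketch: the function name `failed_calls`, the two regular expressions, and the rule for what counts as a failure (a return value of `-1`, or `0x0` from `shim_fopen`) are assumptions made for illustration, not part of libc6-shim. Continuation lines, such as the second half of the `fscanf` format string, are simply ignored. A flagged call is not necessarily the cause of the problem, since many of these probes may fail harmlessly.

```python
import re

# Assumed line shapes, inferred from the SHIM_DEBUG output in this thread:
#   [pid:tid] shim_name(args...)
#   [pid:tid] shim_name -> value
CALL = re.compile(r"^\[(\d+):(\d+)\] (shim_\w+)\((.*)$")
RET = re.compile(r"^\[(\d+):(\d+)\] (shim_\w+) -> (.*)$")

# Assumption: a NULL return is a failure only for these calls.
# (shim_getenv returning 0x0 just means the variable is unset.)
NULL_IS_FAILURE = {"shim_fopen"}


def failed_calls(log_text):
    """Return (name, args, return_value) for calls that look like failures."""
    pending = {}  # (pid, tid, name) -> argument text of the last call seen
    failures = []
    for line in log_text.splitlines():
        ret = RET.match(line)
        if ret:
            pid, tid, name, value = ret.groups()
            args = pending.pop((pid, tid, name), "")
            value = value.strip()
            if value == "-1" or (value == "0x0" and name in NULL_IS_FAILURE):
                failures.append((name, args, value))
            continue
        call = CALL.match(line)
        if call:
            pid, tid, name, args = call.groups()
            pending[(pid, tid, name)] = args.rstrip(")")
        # Any other line (program output, continuation lines) is skipped.
    return failures


# A short excerpt of the log from this thread, used as sample input.
SAMPLE = """\
[2525:112432] shim_getenv("__NVML_DBG_LVL")
[2525:112432] shim_getenv -> 0x0
[2525:112432] shim_fopen("/proc/modules", "r")
[2525:112432] shim_fopen -> 0x0
[2525:112432] shim___xstat(1, "/dev/nvidiactl", 0x7fffffffb7f0)
[2525:112432] shim___xstat -> 0
[2525:112432] shim_symlink("../nvidiactl", "/dev/char/195:255")
[2525:112432] shim_symlink -> -1
"""

if __name__ == "__main__":
    for name, args, value in failed_calls(SAMPLE):
        print(f"{name}({args}) -> {value}")
```

To use it on a real run, the log would first have to be saved to a file and its contents passed to `failed_calls`.
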

verm commented 1 year ago

I updated from 13.1 to 13.2 using freebsd-update and reinstalled all my ports. I also used ktrace to see whether it was picking up any dangling libraries, but I couldn't find any. I don't have anything special set in sysctl.conf either.

verm commented 1 year ago

Okay, sorry for the noise. I have no idea what happened. I decided to wipe all the NVIDIA libraries manually, removed the ports, removed libc6 and reinstalled it, and now it works. So either something was left dangling or I messed up the original install from source. Strange.