Syllo / nvtop

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

nvtop on NVIDIA Jetson Orin NX says "No GPU to monitor." #225

Closed by ratsputin 11 months ago

ratsputin commented 11 months ago

I'm trying to get nvtop to run on an NVIDIA Jetson Orin NX, because the NVIDIA-recommended jtop utility is abysmal and crashes every 2-3 minutes. This is on JetPack 5.1 [L4T 35.2.1].

When I build and run nvtop, I receive the following:

wyer@jetson:~/nvtop/build/src$ ./nvtop
No GPU to monitor.

Here's a log of the complete build and run:

wyer@jetson:~$ mkdir -p nvtop/build && cd nvtop/build
wyer@jetson:~/nvtop/build$ cmake .. -DNVIDIA_SUPPORT=ON -DAMDGPU_SUPPORT=OFF -DINTEL_SUPPORT=OFF
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Setting build type to 'Release' as none was specified.
-- Looking for cbreak in /usr/lib/aarch64-linux-gnu/libncursesw.so
-- Looking for cbreak in /usr/lib/aarch64-linux-gnu/libncursesw.so - found
-- Found Curses: /usr/lib/aarch64-linux-gnu/libncursesw.so
-- Performing Test HAS_REALLOCARRAY
-- Performing Test HAS_REALLOCARRAY - Success
-- Could NOT find UDev (missing: UDEV_LIBRARY UDEV_INCLUDE_DIR) (found version "")
-- Could NOT find Systemd (missing: SYSTEMD_LIBRARY SYSTEMD_INCLUDE_DIR) (found version "")
-- Found PkgConfig: /usr/bin/pkg-config (found version "0.29.1")
-- Found Libdrm: /usr/lib/aarch64-linux-gnu/libdrm.so (found version "2.4.107")
-- Found libdrm; Enabling support
-- Performing Test compiler_has-Wall
-- Performing Test compiler_has-Wall - Success
-- Performing Test compiler_has-Wextra
-- Performing Test compiler_has-Wextra - Success
-- Performing Test compiler_has-Waddress
-- Performing Test compiler_has-Waddress - Success
-- Performing Test compiler_has-Waggressive-loop-optimizations
-- Performing Test compiler_has-Waggressive-loop-optimizations - Success
-- Performing Test compiler_has-Wbad-function-cast
-- Performing Test compiler_has-Wbad-function-cast - Success
-- Performing Test compiler_has-Wmissing-declarations
-- Performing Test compiler_has-Wmissing-declarations - Success
-- Performing Test compiler_has-Wmissing-parameter-type
-- Performing Test compiler_has-Wmissing-parameter-type - Success
-- Performing Test compiler_has-Wmissing-prototypes
-- Performing Test compiler_has-Wmissing-prototypes - Success
-- Performing Test compiler_has-Wnested-externs
-- Performing Test compiler_has-Wnested-externs - Success
-- Performing Test compiler_has-Wold-style-declaration
-- Performing Test compiler_has-Wold-style-declaration - Success
-- Performing Test compiler_has-Wold-style-definition
-- Performing Test compiler_has-Wold-style-definition - Success
-- Performing Test compiler_has-Wstrict-prototypes
-- Performing Test compiler_has-Wstrict-prototypes - Success
-- Performing Test compiler_has-Wpointer-sign
-- Performing Test compiler_has-Wpointer-sign - Success
-- Performing Test compiler_has-Wdouble-promotion
-- Performing Test compiler_has-Wdouble-promotion - Success
-- Performing Test compiler_has-Wuninitialized
-- Performing Test compiler_has-Wuninitialized - Success
-- Performing Test compiler_has-Winit-self
-- Performing Test compiler_has-Winit-self - Success
-- Performing Test compiler_has-Wstrict-aliasing
-- Performing Test compiler_has-Wstrict-aliasing - Success
-- Performing Test compiler_has-Wsuggest-attribute-const
-- Performing Test compiler_has-Wsuggest-attribute-const - Success
-- Performing Test compiler_has-Wtrampolines
-- Performing Test compiler_has-Wtrampolines - Success
-- Performing Test compiler_has-Wfloat-equal
-- Performing Test compiler_has-Wfloat-equal - Success
-- Performing Test compiler_has-Wshadow
-- Performing Test compiler_has-Wshadow - Success
-- Performing Test compiler_has-Wunsafe-loop-optimizations
-- Performing Test compiler_has-Wunsafe-loop-optimizations - Success
-- Performing Test compiler_has-Wfloat-conversion
-- Performing Test compiler_has-Wfloat-conversion - Success
-- Performing Test compiler_has-Wlogical-op
-- Performing Test compiler_has-Wlogical-op - Success
-- Performing Test compiler_has-Wnormalized
-- Performing Test compiler_has-Wnormalized - Success
-- Performing Test compiler_has-Wdisabled-optimization
-- Performing Test compiler_has-Wdisabled-optimization - Success
-- Performing Test compiler_has-Whsa
-- Performing Test compiler_has-Whsa - Success
-- Performing Test compiler_has-Wunused-result
-- Performing Test compiler_has-Wunused-result - Success
-- Performing Test compiler_has-Werror-implicit-function-declaration
-- Performing Test compiler_has-Werror-implicit-function-declaration - Success
-- Performing Test compiler_has-Wformat
-- Performing Test compiler_has-Wformat - Success
-- Performing Test compiler_has-Wformat-security
-- Performing Test compiler_has-Wformat-security - Success
-- Performing Test linker_has-Wl_-z_relro
-- Performing Test linker_has-Wl_-z_relro - Success
-- Could NOT find GTest (missing: GTEST_LIBRARY GTEST_INCLUDE_DIR GTEST_MAIN_LIBRARY)
-- Configuring done
-- Generating done
-- Build files have been written to: /home/wyer/nvtop/build
wyer@jetson:~/nvtop/build$ make
Scanning dependencies of target nvtop
[  5%] Building C object src/CMakeFiles/nvtop.dir/nvtop.c.o
[ 11%] Building C object src/CMakeFiles/nvtop.dir/interface.c.o
[ 17%] Building C object src/CMakeFiles/nvtop.dir/interface_layout_selection.c.o
[ 23%] Building C object src/CMakeFiles/nvtop.dir/interface_options.c.o
[ 29%] Building C object src/CMakeFiles/nvtop.dir/interface_setup_win.c.o
[ 35%] Building C object src/CMakeFiles/nvtop.dir/interface_ring_buffer.c.o
[ 41%] Building C object src/CMakeFiles/nvtop.dir/get_process_info_linux.c.o
[ 47%] Building C object src/CMakeFiles/nvtop.dir/extract_gpuinfo.c.o
[ 52%] Building C object src/CMakeFiles/nvtop.dir/extract_processinfo_fdinfo.c.o
[ 58%] Building C object src/CMakeFiles/nvtop.dir/time.c.o
[ 64%] Building C object src/CMakeFiles/nvtop.dir/plot.c.o
[ 70%] Building C object src/CMakeFiles/nvtop.dir/ini.c.o
[ 76%] Building C object src/CMakeFiles/nvtop.dir/info_messages_linux.c.o
[ 82%] Building C object src/CMakeFiles/nvtop.dir/extract_gpuinfo_nvidia.c.o
[ 88%] Building C object src/CMakeFiles/nvtop.dir/extract_gpuinfo_msm.c.o
[ 94%] Building C object src/CMakeFiles/nvtop.dir/extract_gpuinfo_msm_utils.c.o
[100%] Linking C executable nvtop
[100%] Built target nvtop
wyer@jetson:~/nvtop/build$ src/nvtop
No GPU to monitor.

Any suggestions? That error message doesn't give me much to go on.

Syllo commented 11 months ago

Could you try using LD_LIBRARY_PATH to specify the location of libnvidia-ml.so? See my message in #150.
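
To check what the dynamic loader actually resolves before rerunning nvtop, a small probe along these lines may help. This is only a sketch, not part of nvtop: `probe_nvml` is a hypothetical helper name, and the two library names it tries are an assumption about what is worth testing. It relies on `nvmlInit_v2` and `nvmlShutdown`, which are NVML entry points, and treats a return value of 0 as success.

```python
import ctypes


def probe_nvml(candidates=("libnvidia-ml.so.1", "libnvidia-ml.so")):
    """Return (library name, nvmlInit_v2 status) for the first candidate
    that the dynamic loader can open, or None if none of them loads.

    The candidate names are an assumption, not taken from nvtop itself.
    """
    for name in candidates:
        try:
            # dlopen() honours LD_LIBRARY_PATH as it was set when the
            # interpreter started, so export it before launching Python.
            lib = ctypes.CDLL(name)
        except OSError:
            continue  # not found, or not loadable on this system

        try:
            init = lib.nvmlInit_v2
        except AttributeError:
            return name, None  # loaded, but the symbol is missing

        init.restype = ctypes.c_int
        status = init()  # 0 means NVML initialised successfully
        if status == 0:
            lib.nvmlShutdown()
        return name, status
    return None


if __name__ == "__main__":
    print(probe_nvml())
```

Run it with the same LD_LIBRARY_PATH you intend to pass to nvtop. A result of `None` means no candidate could be loaded at all, while a non-zero status means a library was found but NVML could not initialise.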

ratsputin commented 11 months ago

Hmm... This is odd:

wyer@jetson:~/nvtop/build/src$ LD_LIBRARY_PATH=/usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/ ./nvtop

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WARNING:

You should always run with libnvidia-ml.so that is installed with your
NVIDIA Display Driver. By default it's installed in /usr/lib and /usr/lib64.
libnvidia-ml.so in GDK package is a stub library that is attached only for
build purposes (e.g. machine that you build your application doesn't have
to have Display Driver installed).
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
lsof: WARNING: can't stat() overlay file system /var/lib/docker/overlay2/afcdfcfb4534acdd7367e1bcc31f9a9c5870e0a7b573eb74790acd0103dbc8fa/merged
      Output information may be incomplete.
lsof: WARNING: can't stat() nsfs file system /run/docker/netns/c44ec58f90c3
      Output information may be incomplete.
Linked to libnvidia-ml library at wrong path : /usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so

No GPU to monitor.

Here are the only copies of the library I can find on the system:

root@jetson:~# find / -type f -name libnvidia-ml\* -ls
find: ‘/run/user/1000/gvfs’: Permission denied
 16126706     44 -rw-r--r--   1 root     root        42720 Sep 14  2022 /var/lib/docker/overlay2/wnl288uwknio4g7vxyoysy90f/diff/usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so
   661475     44 -rw-r--r--   1 root     root        42720 Sep 14  2022 /usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so
root@jetson:~#

Apparently, the required library isn't installed on JetPack 5.1 [L4T 35.2.1], and only the build-time stubs are present?
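
The `find` results above can be sorted into stubs and potentially usable libraries with a quick heuristic. The sketch below assumes that any copy living under a directory named `stubs` is a link-time placeholder, which matches the warning printed earlier but is not necessarily how nvtop itself decides. `is_stub_path` and `split_candidates` are hypothetical helper names.

```python
from pathlib import PurePosixPath


def is_stub_path(path):
    """Heuristic (an assumption): a library that sits under a directory
    named 'stubs' is a build-time placeholder, not a driver library."""
    return "stubs" in PurePosixPath(path).parts[:-1]


def split_candidates(paths):
    """Split paths into (possibly real, stub) lists, preserving order."""
    real = [p for p in paths if not is_stub_path(p)]
    stubs = [p for p in paths if is_stub_path(p)]
    return real, stubs


# The two paths reported by `find` in this thread.
found = [
    "/var/lib/docker/overlay2/wnl288uwknio4g7vxyoysy90f/diff/usr/local/"
    "cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so",
    "/usr/local/cuda-11.4/targets/aarch64-linux/lib/stubs/libnvidia-ml.so",
]

real, stubs = split_candidates(found)
print(len(real), len(stubs))  # prints "0 2"
```

With no entry left in the first list, there is nothing on this system that LD_LIBRARY_PATH could usefully point to.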

ratsputin commented 11 months ago

Based on what I'm reading, NVML is not supported on the Jetson. Looks like this is a fool's errand.