rbonghi / jetson_stats

📊 Simple package for monitoring and controlling your NVIDIA Jetson [Orin, Xavier, Nano, TX] series
https://rnext.it/jetson_stats
GNU Affero General Public License v3.0
2.17k stars, 264 forks

[Jetson AGX Orin 64 GB] jetson stats not starting - Error with GUI colors #428

Status: Closed (laminair closed 1 year ago)

laminair commented 1 year ago

Describe the bug

I just installed jetson_stats on a bunch of Jetson AGX Orin 64 GB devices. I tried installing both via pip and from source (Git repo). The systemd service launched without error, but I cannot access the GUI.

To Reproduce

Installation via pip

sudo -H pip3 install --no-cache-dir -U jetson-stats

Installation from source

git clone https://github.com/rbonghi/jetson_stats.git
cd jetson_stats
sudo python3 setup.py build
sudo python3 setup.py install

Steps to reproduce the behavior:

Run jtop or sudo jtop. Both produce the same error.

Screenshots / Terminal Logs

Running jtop:

ubuntu@orin-02:~/jetson_stats$ jtop

Traceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    load_entry_point('jetson-stats==4.2.2', 'console_scripts', 'jtop')()
  File "/usr/local/lib/python3.8/dist-packages/jtop/__main__.py", line 159, in main
    curses.wrapper(JTOPGUI, jetson, pages, init_page=args.page,
  File "/usr/lib/python3.8/curses/__init__.py", line 105, in wrapper
    return func(stdscr, *args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/jtop/gui/jtopgui.py", line 79, in __init__
    NColors(color_filter)
  File "/usr/local/lib/python3.8/dist-packages/jtop/gui/lib/colors.py", line 43, in __init__
    curses.init_pair(NColors.RED, curses.COLOR_RED if not color_filter else curses.COLOR_BLUE, curses.COLOR_BLACK)
_curses.error: init_pair() returned ERR

Running sudo jtop --health:

ubuntu@orin-02:~/jetson_stats$ sudo jtop --health
Traceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    load_entry_point('jetson-stats==4.2.2', 'console_scripts', 'jtop')()
  File "/usr/local/lib/python3.8/dist-packages/jtop/__main__.py", line 134, in main
    jtop_config()
  File "/usr/local/lib/python3.8/dist-packages/jtop/jetson_config.py", line 227, in jtop_config
    curses.wrapper(JTOPCONFIG, JTOP_MENU)
  File "/usr/lib/python3.8/curses/__init__.py", line 105, in wrapper
    return func(stdscr, *args, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/jtop/gui/jtopguiconfig.py", line 39, in __init__
    curses.init_pair(1, curses.COLOR_RED, curses.COLOR_BLACK)
_curses.error: init_pair() returned ERR
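
Both tracebacks end in the same call: curses.init_pair() raising _curses.error. The thread does not establish why it fails on this board. As a minimal sketch, and not jtop's actual code, the hypothetical helper below (try_init_pair is an invented name) shows how a curses program could detect this condition and fall back to monochrome instead of crashing. Note that Python's curses.wrapper() calls start_color() inside a try/except and ignores any failure, so init_pair() is often the first call to surface a color setup problem.

```python
def try_init_pair(pair, fg, bg, api=None):
    """Register a curses color pair, returning False instead of raising.

    `api` defaults to the real curses module; it is a parameter only so the
    helper can be exercised with a stand-in object outside a terminal.
    """
    if api is None:
        import curses as api  # lazy import: needs a Unix-like Python build
    try:
        if not api.has_colors():
            return False  # terminal (per $TERM) advertises no color support
        api.init_pair(pair, fg, bg)
        return True
    except api.error:
        # init_pair() returned ERR, e.g. because start_color() never succeeded
        return False
```

Inside a curses.wrapper() callback this could be called as try_init_pair(1, curses.COLOR_RED, curses.COLOR_BLACK); when it returns False, the program would draw without color attributes. Checking the value of $TERM in the failing shell is another cheap diagnostic, although the thread does not confirm that $TERM was the cause here.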

Expected behavior

When running jtop, the jetson_stats GUI becomes available.

Additional context

From the logs, the jetson_stats service seems to be running just fine. I already set the power profile to max power, as suggested by #406.

Board

ubuntu@orin-02:~$ jetson_release -v
Software part of jetson-stats 4.2.2 - (c) 2023, Raffaello Bonghi
Model: Jetson AGX Orin - Jetpack 5.1 [L4T 35.2.1]
NV Power Mode[0]: MAXN
Serial Number: [XXX Show with: jetson_release -s XXX]
Hardware:
 - 699-level Part Number: XXXXXXXXXXXXX
 - P-Number: XXXXXXXXXXXXX
 - Module: NVIDIA Jetson AGX Orin (64GB ram)
 - SoC: tegra23x
 - CUDA Arch BIN: 8.7
 - Codename: Concord
Platform:
 - Machine: aarch64
 - System: Linux
 - Distribution: Ubuntu 20.04 focal
 - Release: 5.10.104-tegra
 - Python: 3.8.10
jtop:
 - Version: 4.2.2
 - Service: Active
Libraries:
 - CUDA: 11.4.315
 - cuDNN: 8.6.0.166
 - TensorRT: 5.1
 - VPI: 2.2.4
 - Vulkan: 1.3.204
 - OpenCV: 4.5.4 - with CUDA: NO

Log from jtop.service

ubuntu@orin-02:~$ journalctl -u jtop.service -n 100 --no-pager
-- Logs begin at Mon 2023-03-27 19:54:06 CEST, end at Mon 2023-07-03 14:21:01 CEST. --
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.processes - Process service started
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.memory - Found EMC!
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.memory - Memory service started
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.engine - Special Engine group found: [dlaX]
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.engine - Special Engine group found: [pvaX]
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.engine - Engines found: [APE DLA0 DLA1 NVDEC NVENC NVJPG PVA0 SE VIC]
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "CV0" in thermal_zone2
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "CPU" in thermal_zone0
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "Tboard" in thermal_zone9
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "SOC2" in thermal_zone7
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "Tdiode" in thermal_zone10
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "SOC0" in thermal_zone5
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "CV1" in thermal_zone3
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "GPU" in thermal_zone1
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "tj" in thermal_zone8
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "SOC1" in thermal_zone6
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.temperature - Found thermal "CV2" in thermal_zone4
Mär 27 19:54:12 orin-02 jtop[1896]: [WARNING] jtop.core.power - Skipped NC /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in1_label
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Alarms VDDQ_VDD2_1V8AO - {'crit_alarm': 0, 'max_alarm': 0}
Mär 27 19:54:12 orin-02 jtop[1896]: [WARNING] jtop.core.power - Skipped NC /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in3_label
Mär 27 19:54:12 orin-02 jtop[1896]: [WARNING] jtop.core.power - Skipped "sum of shunt voltages" /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in7_label
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Alarms VDD_GPU_SOC - {'crit_alarm': 0, 'max_alarm': 0}
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Alarms VDD_CPU_CV - {'crit_alarm': 0, 'max_alarm': 0}
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Alarms VIN_SYS_5V0 - {'crit_alarm': 0, 'max_alarm': 0}
Mär 27 19:54:12 orin-02 jtop[1896]: [WARNING] jtop.core.power - Skipped "sum of shunt voltages" /sys/bus/i2c/devices/1-0040/hwmon/hwmon3/in7_label
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Found I2C power monitor
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Found name=1-00081 type=USB model=<EMPTY> in ucsi-source-psy-1-00081
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.power - Found SYSTEM power monitor
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.fan - Fan pwmfan(1) found in /sys/class/hwmon/hwmon2
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.fan - RPM pwm_tach found in /sys/class/hwmon/hwmon0
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.fan - Found nvfancontrol.service
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.jetson_clocks - jetson_clocks found in /usr/bin/jetson_clocks
Mär 27 19:54:12 orin-02 jtop[1896]: [INFO] jtop.core.nvpmodel - nvpmodel running in [0]MAXN - Default: 2
Mär 27 19:54:12 orin-02 jtop[2011]: [INFO] jtop.service - Initialization service
Mär 27 19:54:14 orin-02 jtop[2011]: [INFO] jtop.service - service ready
Jul 03 14:08:58 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread started 1000ms
Jul 03 14:09:02 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread close
Jul 03 14:09:05 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread started 1000ms
Jul 03 14:09:09 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread close
Jul 03 14:09:54 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread started 1000ms
Jul 03 14:10:00 orin-02 jtop[2011]: [INFO] jtop.service - jtop timer thread close
Jul 03 14:12:02 orin-02 systemd[1]: Stopping jtop service...
Jul 03 14:12:02 orin-02 jtop[1994]: [INFO] jtop.__main__ - Close service by signal 15
Jul 03 14:12:02 orin-02 jtop[1896]: [INFO] jtop.__main__ - Close service by signal 15
Jul 03 14:12:02 orin-02 jtop[2011]: [INFO] jtop.__main__ - Close service by signal 15
Jul 03 14:12:02 orin-02 jtop[2011]: [WARNING] jtop.service - KeyboardInterrupt, SystemExit interrupt
Jul 03 14:12:02 orin-02 jtop[2011]: [INFO] jtop.service - FORCE jtop timer thread close
Jul 03 14:12:02 orin-02 jtop[1896]: [INFO] jtop.service - Terminate subprocess
Jul 03 14:12:02 orin-02 jtop[1896]: [INFO] jtop.service - Wait shutdown subprocess
Jul 03 14:12:03 orin-02 jtop[1896]: [INFO] jtop.service - Service closed
Jul 03 14:12:03 orin-02 systemd[1]: jtop.service: Succeeded.
Jul 03 14:12:03 orin-02 systemd[1]: Stopped jtop service.
Jul 03 14:12:04 orin-02 systemd[1]: Started jtop service.
Jul 03 14:12:04 orin-02 jtop[3569]: [INFO] jtop.service - jetson_stats 4.2.2 - server loaded
Jul 03 14:12:04 orin-02 jtop[3569]: [INFO] jtop.core.hardware - Hardware detected aarch64
Jul 03 14:12:04 orin-02 jtop[3569]: [INFO] jtop.core.hardware - NVIDIA Jetson detected L4T=35.2.1
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.service - Running on Python: 3.8.10
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.cpu - Found 12 CPU
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.gpu - GPU "ga10b" status in /sys/devices/platform/17000000.ga10b
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.gpu - GPU "ga10b" frq in /sys/devices/platform/17000000.ga10b/devfreq/17000000.ga10b
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.processes - Process service started
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.memory - Found EMC!
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.memory - Memory service started
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.engine - Special Engine group found: [dlaX]
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.engine - Special Engine group found: [pvaX]
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.engine - Engines found: [APE DLA0 DLA1 NVDEC NVENC NVJPG PVA0 SE VIC]
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "CV0" in thermal_zone2
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "CPU" in thermal_zone0
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "Tboard" in thermal_zone9
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "SOC2" in thermal_zone7
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "Tdiode" in thermal_zone10
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "SOC0" in thermal_zone5
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "CV1" in thermal_zone3
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "GPU" in thermal_zone1
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "tj" in thermal_zone8
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "SOC1" in thermal_zone6
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.temperature - Found thermal "CV2" in thermal_zone4
Jul 03 14:12:05 orin-02 jtop[3569]: [WARNING] jtop.core.power - Skipped NC /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in1_label
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Alarms VDDQ_VDD2_1V8AO - {'crit_alarm': 0, 'max_alarm': 0}
Jul 03 14:12:05 orin-02 jtop[3569]: [WARNING] jtop.core.power - Skipped NC /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in3_label
Jul 03 14:12:05 orin-02 jtop[3569]: [WARNING] jtop.core.power - Skipped "sum of shunt voltages" /sys/bus/i2c/devices/1-0041/hwmon/hwmon4/in7_label
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Alarms VDD_GPU_SOC - {'crit_alarm': 0, 'max_alarm': 0}
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Alarms VDD_CPU_CV - {'crit_alarm': 0, 'max_alarm': 0}
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Alarms VIN_SYS_5V0 - {'crit_alarm': 0, 'max_alarm': 0}
Jul 03 14:12:05 orin-02 jtop[3569]: [WARNING] jtop.core.power - Skipped "sum of shunt voltages" /sys/bus/i2c/devices/1-0040/hwmon/hwmon3/in7_label
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Found I2C power monitor
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Found name=1-00081 type=USB model=<EMPTY> in ucsi-source-psy-1-00081
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Found name=1-00082 type=USB model=<EMPTY> in ucsi-source-psy-1-00082
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.power - Found SYSTEM power monitor
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.fan - Fan pwmfan(1) found in /sys/class/hwmon/hwmon2
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.fan - RPM pwm_tach found in /sys/class/hwmon/hwmon0
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.fan - Found nvfancontrol.service
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.jetson_clocks - jetson_clocks found in /usr/bin/jetson_clocks
Jul 03 14:12:05 orin-02 jtop[3569]: [INFO] jtop.core.nvpmodel - nvpmodel running in [0]MAXN - Default: 2
Jul 03 14:12:05 orin-02 jtop[3588]: [INFO] jtop.service - Initialization service
Jul 03 14:12:06 orin-02 jtop[3588]: [INFO] jtop.service - service ready
Jul 03 14:12:07 orin-02 jtop[3588]: [INFO] jtop.service - jtop timer thread started 1000ms
Jul 03 14:12:11 orin-02 jtop[3588]: [INFO] jtop.service - jtop timer thread close
Jul 03 14:17:17 orin-02 jtop[3588]: [INFO] jtop.service - jtop timer thread started 1000ms
Jul 03 14:17:21 orin-02 jtop[3588]: [INFO] jtop.service - jtop timer thread close

Log from jetson-stats installation

Requirement already up-to-date: jetson-stats in /usr/local/lib/python3.8/dist-packages (4.2.2)
Requirement already satisfied, skipping upgrade: distro in /usr/lib/python3/dist-packages (from jetson-stats) (1.4.0)
Requirement already satisfied, skipping upgrade: smbus2 in /usr/local/lib/python3.8/dist-packages (from jetson-stats) (0.4.2)

NguyenKhanh27 commented 1 year ago

Since no one had responded to this, I compared the systemd service files between v4.1 and v4.2. The change is that v4.2 adds "After=multi-user.target". To be honest, I'm not familiar with systemd services, but in my case I removed that line from the jtop service file and then it worked for me. Maybe give it a try.
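
The workaround above amounts to deleting one line from the unit file. A minimal sketch of that edit as a pure text transformation follows. The function name drop_directive and the sample unit contents are hypothetical, and the thread does not say where the unit file is installed (systemctl cat jtop.service prints its path and contents). The sketch only removes a line that matches the directive exactly; it does not handle an After= line listing several units.

```python
def drop_directive(unit_text, directive="After=multi-user.target"):
    """Return unit_text without lines that exactly match `directive`."""
    kept = [line for line in unit_text.splitlines()
            if line.strip() != directive]
    return "\n".join(kept) + "\n"


# Hypothetical unit file contents, for illustration only;
# the real jtop.service may differ.
SAMPLE_UNIT = """\
[Unit]
Description=jtop service
After=multi-user.target

[Service]
ExecStart=/usr/local/bin/jtop
"""

# Print the unit text with the ordering directive removed.
print(drop_directive(SAMPLE_UNIT), end="")
```

After changing the real file, systemd needs to reload it before the edit takes effect: sudo systemctl daemon-reload, followed by sudo systemctl restart jtop.service.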

laminair commented 1 year ago

@NguyenKhanh27 Thanks for the suggestion! That did the job, but it seems odd that there would be a conflict between multi-user.target (which is the user console/GUI being available) and the jtop GUI (the colors, to be specific).

Anyway, it works. Thanks.