rbonghi / jetson_stats

📊 Simple package for monitoring and control your NVIDIA Jetson [Orin, Xavier, Nano, TX] series
https://rnext.it/jetson_stats
GNU Affero General Public License v3.0

Crashes on Syslogic Rugged Jetson AGX Xavier #351

Closed by vebjornjr 1 year ago

vebjornjr commented 1 year ago

https://www.syslogic.com/eng/ai-rugged-computer-jetson-agx-xavier-101557.shtml

Describe the bug

jtop crashes on program start.

Traceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    load_entry_point('jetson-stats==4.1.0', 'console_scripts', 'jtop')()
  File "/usr/local/lib/python3.6/dist-packages/jtop/__main__.py", line 151, in main
    loop=args.loop, seconds=LOOP_SECONDS, color_filter=color_filter)
  File "/usr/lib/python3.6/curses/__init__.py", line 94, in wrapper
    return func(stdscr, *args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/jtopgui.py", line 114, in __init__
    self.run(loop, seconds)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/jtopgui.py", line 143, in run
    self.draw()
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/lib/common.py", line 80, in wrapped
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/jtopgui.py", line 158, in draw
    page.draw(self.key, self.mouse)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/pall.py", line 137, in draw
    size_info = compact_info(self.stdscr, 0, line_counter + 1, column_width + 2, column_height, self.jetson)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/lib/common.py", line 120, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/jtopguimenu.py", line 169, in compact_info
    counter += compact_engines(stdscr, start, offset + counter, width, jetson)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/pengine.py", line 85, in compact_engines
    map_eng = map_engines(jetson)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/pengine.py", line 76, in map_engines
    return func_list_engines(jetson.engine)
  File "/usr/local/lib/python3.6/dist-packages/jtop/gui/pengine.py", line 41, in map_xavier
    [('APE', get_value_engine(engine['APE']['APE'])), ('CVNAS', get_value_engine(engine['CVNAS']['CVNAS']))],
KeyError: 'APE'

Additional context

Software part of jetson-stats 4.1.0 - (c) 2023, Raffaello Bonghi
Model: Jetson-AGX - Jetpack 4.6 [L4T 32.6.1]
NV Power Mode: MAXN - Type: 0
Hardware:
 - 699-level Part Number: 699-82888-0004-400 P.0
 - P-Number: p2888-0004
 - Module: NVIDIA Jetson AGX Xavier (32 GB ram)
 - SoC: tegra194
 - CUDA Arch BIN: 7.2
 - Codename: Galen
 - Serial Number: XXXXXXX
Platform:
 - Machine: aarch64
 - System: Linux
 - Distribution: Ubuntu 18.04 Bionic Beaver
 - Release: 4.9.253-tegra
 - Python: 3.6.9
jtop:
 - Version: 4.1.0
 - Service: Active
Libraries:
 - CUDA: 10.2.300
 - cuDNN: 8.2.1.32
 - TensorRT: 8.0.1.6
 - VPI: 1.1.15
 - OpenCV: 4.1.1 - with CUDA: NO

jtop-error.log

--------------------- PLATFORM -------------------------
Machine: aarch64
System: Linux
Distribution: Ubuntu 18.04 Bionic Beaver
Release: 4.9.253-tegra
Python: 3.6.9
-------------------- RAW OUTPUT ------------------------
------------------
/etc/nv_tegra_release:
# R32 (release), REVISION: 6.1, GCID: 27863751, BOARD: t186ref, EABI: aarch64, DATE: Mon Jul 26 19:36:31 UTC 2021
------------------
/sys/firmware/devicetree/base/model:
Jetson-AGX
------------------
/proc/device-tree/nvidia,boardids:
No such file or directory
------------------
/proc/device-tree/compatible:
nvidia,galen nvidia,jetson-xavier nvidia,p2822-0000+p2888-0001 nvidia,tegra194
------------------
/proc/device-tree/nvidia,dtsfilename:
arch/arm64/boot/dts/../../../../../../hardware/nvidia/platform/t19x/galen/kernel-dts/common/tegra194-p2888-0001-brla3-0000-common.dtsi
------------------
I2C-0-0x50:
01 00 FF 00 48 0B 04 00 04 50 00 00 00 00 00 00    ..ÿ.H....P......
00 00 00 00 36 39 39 2D 38 32 38 38 38 2D 30 30    ....699-82888-00
30 34 2D 34 30 30 20 50 2E 30 00 00 00 00 00 00    04-400 P.0......
00 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF    ..ÿÿÿÿÿÿÿÿÿÿÿÿÿÿ
FF FF FF FF 65 65 4D 2D B0 48 31 34 32 34 36 32    ÿÿÿÿeeM-°H142462
31 30 33 36 34 36 38 00 00 00 00 33 22 05 01 00    1036468....3"...
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 4E 56 43 42 1C 00 4D 31 00 00    ......NVCB..M1..
FF FF FF FF FF FF FF FF FF FF FF FF 65 65 4D 2D    ÿÿÿÿÿÿÿÿÿÿÿÿeeM-
B0 48 00 00 00 00 00 00 00 00 00 00 00 00 00 00    °H..............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00    ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 F6    ...............ö

------------------
I2C-0:
FAIL
------------------
I2C-1:
FAIL
------------------
I2C-2:
FAIL

Log from jtop 4.1.0

rbonghi commented 1 year ago

Hi @vebjornjr! Thank you for opening this issue!

Can I ask if you can send me the output from:

journalctl -u jtop.service

And from:

sudo ls /sys/kernel/debug/clk

It looks like the APE engine is missing on this board. I had assumed it was always available.

vebjornjr commented 1 year ago

Thank you for taking a look. Let me know if you need anything else.

journalctl -u jtop.service

-- Logs begin at Mon 2023-01-30 13:24:15 CET, end at Mon 2023-01-30 13:28:09 CET. --
Jan 30 13:27:46 xavier-syslogic systemd[1]: Started jtop service.
Jan 30 13:27:46 xavier-syslogic jtop[15677]: [INFO] jtop.service - Running on Python: 3.6.9
Jan 30 13:27:46 xavier-syslogic jtop[15677]: [INFO] jtop.core.common - fan loaded on /sys/devices/platform
Jan 30 13:27:46 xavier-syslogic jtop[15677]: [WARNING] jtop.service - Fan is not available on this board in paths ['/sys/devices/platform']
Jan 30 13:27:46 xavier-syslogic jtop[15677]: [INFO] jtop.core.common - fan loaded on /sys/devices/pwm-fan
Jan 30 13:27:46 xavier-syslogic jtop[15677]: [INFO] jtop.core.common - jetson_clocks loaded on /usr/bin/jetson_clocks
Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.core.engine - Special Engine group found: [dlaX]
Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.core.engine - Special Engine group found: [pvaX]
Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.core.engine - Engines found: [CVNAS DLA0 DLA1 NVDEC NVENC NVJPG PVA0 PVA1 SE VIC]
Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.core.common - tegrastats loaded on /usr/bin/tegrastats
Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.__main__ - jetson_stats 4.1.0 - server loaded
Jan 30 13:27:48 xavier-syslogic jtop[15677]: [INFO] jtop.core.fan - Mode set default status=False
Jan 30 13:27:50 xavier-syslogic jtop[15677]: [INFO] jtop.service - tegrastats started 500ms
Jan 30 13:27:54 xavier-syslogic jtop[15677]: [INFO] jtop.service - tegrastats close
Jan 30 13:27:54 xavier-syslogic jtop[15677]: [INFO] jtop.service - jetson_clocks show closed
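The "Engines found" line in the log above already shows the mismatch: the service detects no APE engine, while the traceback shows the GUI indexing `engine['APE']`. As a rough sketch only (the helper names `parse_engines_found` and `missing_engines` are hypothetical and not part of jetson-stats, and the expected list contains just the two engines visible in the traceback), the comparison could be automated like this:

```python
import re


def parse_engines_found(log_line):
    """Extract the engine names from a jtop 'Engines found: [...]' log line."""
    match = re.search(r"Engines found: \[([^\]]*)\]", log_line)
    if match is None:
        return set()
    return set(match.group(1).split())


def missing_engines(log_line, expected):
    """Return the expected engine names that the service did not report."""
    found = parse_engines_found(log_line)
    return [name for name in expected if name not in found]


# Log line copied from the journalctl output in this issue.
LOG_LINE = (
    "Jan 30 13:27:47 xavier-syslogic jtop[15677]: [INFO] jtop.core.engine - "
    "Engines found: [CVNAS DLA0 DLA1 NVDEC NVENC NVJPG PVA0 PVA1 SE VIC]"
)

# Assumption: only the engines named in the traceback (APE, CVNAS) are checked.
print(missing_engines(LOG_LINE, ["APE", "CVNAS"]))  # prints ['APE']
```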

sudo ls /sys/kernel/debug/clk

32khz_out0       clk_orphan_summary  emc            gpcclk         i2c4      nafll_cvnas       nafll_nvjpg     nvcsi           nvdisplay_p3  pex0_core_4  pllc4_out2      pll_nvcsi       pva1_axi     sata             sor_safe           usb2_trk           xusb_core_mux
actmon           clk_summary         eqos_axi       gpu_pwr        i2c6      nafll_dla         nafll_pva_core  nvcsilp         nvenc         pex1_core_5  pllc4_vco_div2  pll_p           pva1_vps0    sata_oob         spi1               utmipll            xusb_core_ss
axi_cbb          cvnas               eqos_ptp_ref   hda            i2c7      nafll_dla_falcon  nafll_pva_vps   nvdec           nvenc1        pll_a        pll_d           pllp_div17      pva1_vps1    sdmmc1           tach               utmipll_clkout48   xusb_falcon
can1             dla0_core           eqos_rx        hda2codec_2x   i2c8      nafll_gpu         nafll_rce       nvdec1          nvjpg         pll_aon      pll_d2          pllp_out0       pwm1         sdmmc4           tsec               utmipll_clkout480  xusb_falcon_host
can2             dla0_falcon         eqos_rx_input  hda2hdmicodec  i2c9      nafll_isp         nafll_se        nvdisplay_disp  osc           plla_out0    pll_d3          pllp_out5       pwm4         sdmmc_legacy_tm  tsecb              vi                 xusb_falcon_ss
clk_32k          dla1_core           eqos_tx        host1x         isp       nafll_nvdec       nafll_tsec      nvdisplayhub    osc_div       pll_c        pll_d4          pllrefe_vcoout  pwm5         se               uarta              vic                xusb_fs
clk_dump         dla1_falcon         freq_stats_on  i2c1           kfuse     nafll_nvdec1      nafll_tsecb     nvdisplay_p0    pex0_core_0   pll_c4       pll_disphub     pva0_axi        pwm8         sor1_out         uartb              vi_const           xusb_fs_host
clk_m            dpaux1              fuse           i2c2           maud      nafll_nvenc       nafll_vi        nvdisplay_p1    pex0_core_1   pllc4_muxed  pll_dp          pva0_vps0       rce_cpu_nic  sor1_pad_clkout  uarte              xusb_core_dev      xusb_ss
clk_orphan_dump  dpaux3              fuse_serial    i2c3           mipi_cal  nafll_nvenc1      nafll_vic       nvdisplay_p2    pex0_core_3   pllc4_out1   pll_e           pva0_vps1       rce_nic      sor1_ref         uart_fst_mipi_cal  xusb_core_host     xusb_ss_superspeed

vebjornjr commented 1 year ago

By the way, it's correct that there is no fan on this unit.

rbonghi commented 1 year ago

Interesting. Yes, the APE engine is not present on this Jetson AGX.

Let me write a hotfix and release an update.
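For readers hitting the same `KeyError: 'APE'`, the general shape of such a fix is to stop indexing the engine dictionary directly and skip engines the board does not expose. The following is only a minimal sketch of that idea, not the actual patch released in jetson-stats. The helpers `get_engine_entry` and `build_engine_row` are hypothetical, and the `{"online": True}` payload is an assumed placeholder for whatever each engine entry really contains:

```python
from typing import Any, Dict, List, Optional, Tuple

EngineMap = Dict[str, Dict[str, Any]]


def get_engine_entry(engine: EngineMap, group: str, name: str) -> Optional[Any]:
    """Return engine[group][name], or None when the board does not expose it."""
    return engine.get(group, {}).get(name)


def build_engine_row(engine: EngineMap, wanted: List[Tuple[str, str]]) -> List[Tuple[str, Any]]:
    """Build one display row, silently dropping engines that are absent."""
    row = []
    for group, name in wanted:
        entry = get_engine_entry(engine, group, name)
        if entry is not None:
            row.append((name, entry))
    return row


# Assumed example data: a board that reports CVNAS but no APE,
# as on the Syslogic unit in this issue.
engine = {"CVNAS": {"CVNAS": {"online": True}}}

# The same pair of engines that map_xavier requests in the traceback.
row = build_engine_row(engine, [("APE", "APE"), ("CVNAS", "CVNAS")])
print(row)  # prints [('CVNAS', {'online': True})]
```

With direct indexing (`engine['APE']['APE']`) the same input raises `KeyError`, which is the crash reported above.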

rbonghi commented 1 year ago

Can you send me a screenshot of the ENG page?

jtop -p5

vebjornjr commented 1 year ago

(screenshot attached)

rbonghi commented 1 year ago

Thank you very much! Helpful :-D BTW: I discovered that not all Jetson AGX units have an APE engine!

I will release a hotfix soon.

rbonghi commented 1 year ago

Fixed

sudo pip3 install -U jetson-stats

If it is fixed, please close this issue :-)

vebjornjr commented 1 year ago

That was fast. Seems to work now. Thank you!