rbonghi / jetson_stats

📊 Simple package for monitoring and control your NVIDIA Jetson [Orin, Xavier, Nano, TX] series
https://rnext.it/jetson_stats
GNU Affero General Public License v3.0

Error Running on Nintendo Switch #102

Closed theofficialgman closed 3 years ago

theofficialgman commented 3 years ago

Describe the bug: Running jetson_config and jetson_release works as expected, but I get an "Error Connection" message when I run jtop, either with sudo or as a normal user. I am aware that this is an unsupported platform, but as far as I know all the NVIDIA libraries needed to run this are available.

Additional context $ jetson_release -v

--> Board

theofficialgman commented 3 years ago

jetson_stats is also running as a service

theofficialgman commented 3 years ago

systemctl status jetson_stats.service
● jetson_stats.service - jetson_stats service
   Loaded: loaded (/etc/systemd/system/jetson_stats.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-10-29 16:56:07 EDT; 12min ago
 Main PID: 4295 (jtop)
    Tasks: 8 (limit: 4191)
   CGroup: /system.slice/jetson_stats.service
           ├─4295 /usr/bin/python3 /usr/local/bin/jtop service --force
           ├─6203 /usr/bin/python3 /usr/local/bin/jtop service --force
           └─6212 /usr/bin/python3 /usr/local/bin/jtop service --force

rbonghi commented 3 years ago

Hi @theofficialgman ,

This is the best test of jetson-stats outside a Jetson architecture :+1:

Have you tried to restart your Nintendo Switch?

theofficialgman commented 3 years ago

Yeah, restarted multiple times. Same issue. I can try Ubuntu 20.04 with 32.4.3 to see if the issue is specific to bionic for some reason.

rbonghi commented 3 years ago

If you can open multiple terminals, try to do:

first:

sudo systemctl stop jetson_stats.service
sudo jtop service

second:

jtop

Maybe we can catch other info
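The client side of this two-terminal test can also be exercised from Python, which surfaces the underlying exception instead of only the "Error Connection" banner. This is a minimal sketch, assuming jetson-stats 3.x, where the package exports `jtop` and `JtopException`; the helper name `check_jtop_connection` is hypothetical.

```python
def check_jtop_connection():
    """Return a short status string describing whether a jtop client can connect."""
    try:
        # Assumption: jetson-stats 3.x, which exports both names from the package.
        from jtop import jtop, JtopException
    except ImportError:
        return "jtop client API could not be imported"
    try:
        # The context manager opens the connection to the running jtop service.
        with jtop() as jetson:
            if jetson.ok():
                return "connected to the jtop service"
            return "connected, but the service reported it is not ready"
    except JtopException as exc:
        # Raised by jetson-stats itself, e.g. when the service is unreachable.
        return "jtop error: {}".format(exc)
    except Exception as exc:
        # Anything else, such as a socket or permission problem.
        return "unexpected error: {!r}".format(exc)


print(check_jtop_connection())
```

Run it in the second terminal while `sudo jtop service` is running in the first; the returned string shows which stage failed.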

theofficialgman commented 3 years ago

Ah, also: my bionic is running 32.4.3 as well, but your tool reports L4T 32.3.1 for some reason. It's built off of this as a base (https://developer.nvidia.com/embedded/L4T/r32_Release_v4.3/t210ref_release_aarch64/Tegra_Linux_Sample-Root-Filesystem_R32.4.3_aarch64.tbz2)

rbonghi commented 3 years ago

Ubuntu 20.04 is not officially released by NVIDIA, and I have never tried updating my Jetson to 20.04, so I don't know in this case. (Open a new issue for this problem.)

theofficialgman commented 3 years ago

sudo jtop service
[INFO] jtop.core.common - fan loaded on /sys/devices/pwm-fan
[INFO] jtop.core.common - jetson_clocks loaded on /usr/bin/jetson_clocks
[INFO] jtop.core.common - tegrastats loaded on /usr/bin/tegrastats
[INFO] jtop.main - jetson_stats server loaded
[INFO] jtop.core.fan - Mode set default status=False
[INFO] jtop.service - tegrastats started 500ms
[INFO] jtop.service - tegrastats close
[INFO] jtop.service - jetson_clocks show closed

theofficialgman commented 3 years ago

All tests here are on bionic so far. I haven't tried focal yet; I was just saying I could if needed.

theofficialgman commented 3 years ago

just to note, tegrastats does work as normal on the switch as well as nvpmodel

rbonghi commented 3 years ago

Hmm... I think python3 is missing something needed to open the connection to the client. From the log you sent, I understand that jtop is correctly reading tegrastats and jetson_clocks, so I think something is missing in python3 that prevents the connection from opening.

Try to install these packages: importlib_metadata and zipp-3.3.0 like #88

PS: I used only standard Python libraries, but I don't know why something is missing on some distributions.

rbonghi commented 3 years ago

Meanwhile, if we can't find the problem, you can downgrade jetson-stats to 2.1.0. You will need to run jtop with sudo, but jtop in that version is a standalone Python script.

theofficialgman commented 3 years ago

Well, I went and installed importlib_metadata and zipp from their available pip packages (neither of which seemed to have been installed before), and it made no difference.

Then I uninstalled jetson-stats as directed in the wiki and attempted to install 2.1.0, which failed during the install and left a bunch of files in its wake...

Just an FYI: focal has the same issues, even with its updated packages.

theofficialgman commented 3 years ago

So yeah, now I'm having trouble reinstalling anything:

sudo -H pip3 install -U jetson-stats
Collecting jetson-stats
Installing collected packages: jetson-stats
Exception:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 360, in run
    prefix=options.prefix_path,
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 784, in install
    **kwargs
  File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 851, in install
    self.move_wheel_files(self.source_dir, root=root, prefix=prefix)
  File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 1064, in move_wheel_files
    isolated=self.isolated,
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 377, in move_wheel_files
    clobber(source, dest, False, fixer=fixer, filter=filter)
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 323, in clobber
    shutil.copyfile(srcfile, destfile)
  File "/usr/lib/python3.6/shutil.py", line 121, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/jetson_stats/jetson_env.sh'

theofficialgman commented 3 years ago

Alright, now I'm back to where we started, with the most recent jetson-stats 3.0.2 installed and the mess fixed. What should my procedure be? Should it be:

jetson_config --uninstall
sudo -H pip uninstall -y jetson-stats

restart

then download 2.1.0 source and run sudo -H pip install -e . from within that directory?

theofficialgman commented 3 years ago

@rbonghi Alright, I managed to get everything cleared out and then installed 2.1.0 using pip. sudo jtop produces errors on 2.1.0 (yes, I rebooted):

sudo jtop
ERROR:jtop.core.tegrastats:Attribute error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 64, in run
    self._stats = self._decode(tegrastats_data)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 124, in _decode
    stats = VALS(text)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegra_parse.py", line 153, in VALS
    vals[name] = val_freq(val)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegra_parse.py", line 35, in val_freq
    return {'val': int(match.group(1)), 'frq': int(match.group(2))}
AttributeError: 'NoneType' object has no attribute 'group'
^CTraceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/jtop/__main__.py", line 90, in main
    with jtop(interval=args.refresh) as jetson:
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 308, in __enter__
    self.open()
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 251, in open
    self.tegrastats.open(self)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 95, in open
    while not self._stats:
KeyboardInterrupt

theofficialgman commented 3 years ago

@rbonghi any ideas?

theofficialgman commented 3 years ago

@rbonghi Good news: I decided to randomly start up jetson-stats again after not using it for a while and, what do you know, it works. This is using version 2.1.0, which I already had installed. For whatever reason, though, using sudo jtop in 2.1.0 causes it to crash, while non-sudo works fine (albeit without some stuff showing). I might not have tested without sudo all those weeks ago.

sudo jtop
ERROR:jtop.core.tegrastats:Attribute error
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 64, in run
    self._stats = self._decode(tegrastats_data)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 124, in _decode
    stats = VALS(text)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegra_parse.py", line 153, in VALS
    vals[name] = val_freq(val)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegra_parse.py", line 35, in val_freq
    return {'val': int(match.group(1)), 'frq': int(match.group(2))}
AttributeError: 'NoneType' object has no attribute 'group'
^CTraceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/jtop/__main__.py", line 90, in main
    with jtop(interval=args.refresh) as jetson:
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 308, in __enter__
    self.open()
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 251, in open
    self.tegrastats.open(self)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/tegrastats.py", line 95, in open
    while not self._stats:
KeyboardInterrupt

The only thing I think I've installed since then that even relates to NVIDIA is the NVIDIA L4T multimedia API apt package.

I'm off to test out the newest release and see if I have luck with that now

theofficialgman commented 3 years ago

The newest jetson-stats release (3.0.2) still has the same issue as the original issue post.

jetson_release -v
 - NVIDIA Jetson UNKNOWN
   * Jetpack 4.3 [L4T 32.3.1]
   * NV Power Mode: Locked_MAX - Type: 7
   * jetson_stats.service: active
 - Board info:
   * Type: UNKNOWN
   * SOC Family: tegra210 - ID:33
   * Module: UNKNOWN - Board: UNKNOWN
   * Code Name: icosa
   * Boardids: 2595:0000:A0
   * CUDA GPU architecture (ARCH_BIN): NONE
   * Serial Number: 1234
 - Libraries:
   * CUDA: NOT_INSTALLED
   * cuDNN: 7.6.3.28
   * TensorRT: 6.0.1.10
   * Visionworks: NOT_INSTALLED
   * OpenCV: 4.1.1 compiled CUDA: NO
   * VPI: NOT_INSTALLED
   * Vulkan: 1.1.70
 - jetson-stats:
   * Version 3.0.2
   * Works on Python 3.6.9

theofficialgman commented 3 years ago

Alright, I've been able to figure out what was happening in the old version with sudo. The Nintendo Switch doesn't report an EMC utilization (so the output in tegrastats is just EMC_FREQ @1600), and this was causing the code to fail. I've modified the old version slightly to account for this and can now get it to run with sudo.
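The actual change made to 2.1.0 is not shown in this thread. As an illustration only, a parser that tolerates a missing utilization could look like the sketch below. The regular expression and the choice to omit the 'val' key are assumptions, not the code from jetson-stats; only the function name `val_freq` and the 'val'/'frq' keys come from the traceback above.

```python
import re

# Hypothetical pattern: an optional "<percent>%" followed by "@<MHz>".
# This is an assumption, not the regex used in jetson-stats 2.1.0.
VAL_FREQ_RE = re.compile(r'^(?:(\d+)%)?@(\d+)$')


def val_freq(val):
    """Parse '5%@1600' or '@1600' into a dict; the utilization may be absent."""
    match = VAL_FREQ_RE.match(val)
    if match is None:
        # Unknown format: return an empty dict instead of raising AttributeError.
        return {}
    result = {'frq': int(match.group(2))}
    if match.group(1) is not None:
        # Only report a utilization when tegrastats actually printed one.
        result['val'] = int(match.group(1))
    return result


print(val_freq('5%@1600'))  # utilization and frequency both present
print(val_freq('@1600'))    # Switch-style output with frequency only
```

Checking the match for None before calling `.group()` is what avoids the `'NoneType' object has no attribute 'group'` error seen in the traceback.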

theofficialgman commented 3 years ago

I just want to say that I no longer run into any errors when running this. We have fixed the EMC bandwidth calculations in our kernel, so that small problem is solved.

The newest released jetson-stats also works fine (though the Python scripting isn't too efficient and causes a good amount of CPU load).