rbonghi / jetson_stats

📊 Simple package for monitoring and control your NVIDIA Jetson [Orin, Xavier, Nano, TX] series
https://rnext.it/jetson_stats
GNU Affero General Public License v3.0

Error when using jtop in docker container #63

Closed yamiefun closed 1 year ago

yamiefun commented 4 years ago

Hi, I'm using a nvidia docker container on Jetson Xavier. The docker image is nvcr.io/nvidia/l4t-base:r32.3.1.

I installed jtop with pip3 install -U jetson-stats, but it prints errors when I run it. Here are the error messages:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 119, in __init__
    self.jc = JetsonClocks(config_file)
  File "/usr/local/lib/python3.6/dist-packages/jtop/core/jetson_clocks.py", line 62, in __init__
    raise JetsonClocks.JCException("No jetson_clock script is availble in this board")
jtop.core.jetson_clocks.JCException: No jetson_clock script is availble in this board

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/jtop/__main__.py", line 90, in main
    with jtop(interval=args.refresh) as jetson:
  File "/usr/local/lib/python3.6/dist-packages/jtop/jtop.py", line 121, in __init__
    raise jtop.JtopException(e)
jtop.jtop.JtopException: No jetson_clock script is availble in this board

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/jtop", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/jtop/__main__.py", line 136, in main
    print("[{status}] {error}".format(status=bcolors.fail(), error=e.message))
AttributeError: 'JtopException' object has no attribute 'message'
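
The final AttributeError in this chain is a separate problem from the container setup: Python 3 exceptions have no message attribute, so formatting the error with e.message fails while reporting the real error. A minimal sketch of the portable pattern follows. The JtopException class and the format_error helper here are stand-ins written for illustration, not the library's own code.

```python
class JtopException(Exception):
    """Stand-in for jtop's exception type (illustrative only)."""


def format_error(error):
    # str(error) works for every Python 3 exception,
    # whereas error.message does not exist on Python 3.
    return "[FAIL] {error}".format(error=str(error))


try:
    raise JtopException("No jetson_clock script is available in this board")
except JtopException as e:
    print(format_error(e))
```

With this pattern the original JCException text would have been printed instead of being hidden behind the AttributeError.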

rbonghi commented 4 years ago

Hi @yamiefun

Thank you for using jetson-stats. This is not a bug but a known enhancement. At the moment jetson-stats cannot work in a container, for reasons you can read directly in the error you posted.

The significant error is:

No jetson_clock script is available in this board

Here jtop looks for jetson_clocks inside your container, and obviously it is not available there, only on your host.

I have never tried jtop in a container, but you could share some volumes and check whether jtop starts to read something, in particular:

Maybe your container can read something. Keep me posted.

yamiefun commented 4 years ago

Hi @rbonghi

Thanks for helping. I'm actually looking for a way to check whether my docker container can access the GPU on the Jetson Xavier. I can now use tegrastats in my container if I share /usr/bin/tegrastats through a volume. I think jtop will work if all the necessary files are shared.

Thanks very much!
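
One way to see which of the host tools are actually visible inside the container is to look them up on PATH. This is a minimal sketch: the tool names are the ones mentioned in this thread, while the list itself and the find_tools helper are assumptions made for illustration.

```python
import shutil

# Tool names taken from this thread; treating them as the full set is an assumption.
REQUIRED_TOOLS = ("tegrastats", "jetson_clocks", "nvpmodel")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print("{:<14} {}".format(name, path or "MISSING"))
```

Run inside the container, this shows at a glance which binaries still need to be shared or installed. It only checks that the files are present and executable, not that they run correctly against the container's libraries.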

rbonghi commented 4 years ago

Don't worry!

I'm in the same boat. I'm porting my robot software to docker, and I'm looking for a nice way to use jetson-stats inside a container. This docker implementation is a work in progress.

yamiefun commented 4 years ago

By the way, do you use cuDNN, CUDA and OpenCV in your robot container? I'm trying to run an application with them but I haven't figured out how to do it yet. I'm not sure whether it's reasonable to share them through a volume.

rbonghi commented 4 years ago

At the moment I'm only starting to approach the problem.

I suggest opening a post on the NVIDIA forum; I think they can help you more with setting up a docker image. I will do the same soon :-)

rbonghi commented 4 years ago

I have completely reshaped jetson-stats.

From the new version 3.0 it will be simple to integrate it in a docker container. Before my official version I will release a simple guideline for using it inside a docker container. When I release my docker image I will close this issue.

assafzam commented 4 years ago

Hi @rbonghi, is there any news on the guideline for using jetson-stats inside a docker container?

muzammil360 commented 4 years ago

@rbonghi I am also looking forward to using the jtop Python API inside a container. I have mapped tegrastats, nvpmodel and jetson_clocks as pointed out above, and they are accessible inside the container. However, jtop is still looking for jetson_stats.service.

>>> from jtop import jtop
>>> with jtop() as jetson:
...     # jetson.ok() will provide the proper update frequency
...     while jetson.ok():
...         # Read tegra stats
...         print(jetson.stats)
... 
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/jtop/jtop.py", line 927, in start
    self._broadcaster.connect()
  File "/usr/local/lib/python3.6/multiprocessing/managers.py", line 489, in connect
    conn = Client(self._address, authkey=self._authkey)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 487, in Client
    c = SocketClient(address)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 614, in SocketClient
    s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/jtop/jtop.py", line 1081, in __enter__
    self.start()
  File "/usr/local/lib/python3.6/site-packages/jtop/jtop.py", line 930, in start
    raise JtopException("The jetson_stats.service is not active. Please run:\nsudo systemctl restart jetson_stats.service")
jtop.core.exceptions.JtopException: The jetson_stats.service is not active. Please run:
sudo systemctl restart jetson_stats.service

How can I run this code inside a container to access the Jetson stats?

rbonghi commented 4 years ago

Hi @assafzam and @muzammil360 ,

WOW! You are really pushing this issue :-) This reply also covers #80. I'm getting back to updating jetson-stats, and I will update the docker code and implementation. (I was on holiday.)

To work with jetson-stats in docker, follow these steps:

  1. Install jetson-stats on your host (your Jetson board)
    sudo -H pip install jetson-stats
  2. Build a docker image like the example below (check that you installed jetson-stats with the same Python version). Write a file named Dockerfile in your folder:
    FROM python:3-alpine

    RUN apk update \
        && apk --no-cache add bash \
        && pip install jetson-stats \
        && rm -rf /var/cache/apk/*
  3. Build your docker image
    docker build -t rbonghi/jetson-stats .
  4. Run jtop inside your docker container
    docker run --rm -it -v /run/jtop.sock:/run/jtop.sock rbonghi/jetson-stats jtop

Keep me posted for any issues. :muscle:

rbonghi commented 4 years ago

Just a reminder.

It is mandatory to share the jtop socket between the host and the container. Do not forget to add this volume to your docker run: -v /run/jtop.sock:/run/jtop.sock
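
Before starting the jtop client inside the container, it can help to confirm that the shared path really is a socket. If the host service is not running when the container starts, Docker's -v option creates an empty directory at the missing host path, so checking for existence alone is not enough. A minimal sketch, where socket_available is a hypothetical helper and the path is the one from the docker run command in this thread:

```python
import os
import stat

JTOP_SOCKET = "/run/jtop.sock"  # path used in the docker run command in this thread


def socket_available(path=JTOP_SOCKET):
    """Return True when path exists and is a Unix domain socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Missing path or no permission to stat it.
        return False
    return stat.S_ISSOCK(mode)


if __name__ == "__main__":
    if socket_available():
        print("jtop socket found; the jtop client should be able to connect")
    else:
        print("jtop socket missing; add -v /run/jtop.sock:/run/jtop.sock to docker run")
```

A True result only means the socket file is mounted. Whether the client and the host service then accept each other is a separate question.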

muzammil360 commented 4 years ago

@rbonghi thanks a lot. I have been able to run it in docker. However, I don't know which unit applies to each item in stats. I can work out some of them but not all.

assafzam commented 4 years ago

@rbonghi Thanks a lot! It works fine with your example. But I want to use the buster variant of the docker python image, where the command apk --no-cache add bash does not exist. Can you explain the purpose of this line, so I can try to translate it for buster?

rbonghi commented 4 years ago

Hi @assafzam ,

The line apk --no-cache add bash is needed only on an Alpine distribution, because bash is not installed there. On a buster distribution it is not needed; you can use your Dockerfile from #80.

swaroophs commented 3 years ago

Hello! Thanks for this project.

When I try to run using exactly this approach, I am getting the following error. Any ideas? I used the exact same Dockerfile that you have specified here.

docker run -it --env-file runtime.env --rm --network=host --name=oscar_jtop -v "/run/jtop.sock:/run/jtop.sock" jetson-stats jtop
Authentication mismatch with jetson-stats server

P.S.: I did reboot the Jetson after installing jetson-stats on it natively. jtop works natively. I only have this issue when I try to do this with Docker.

codethief commented 3 years ago

FWIW, I managed to get jetson-stats running inside a Docker container on BalenaOS (balena-cloud-jetson-nano-2.69.1+rev1-dev-v12.3.5.img) on a Jetson Nano:

First, I retrieved the files jetson_clocks and tegrastats from https://developer.nvidia.com/embedded/L4T/r32_Release_v4.4/r32_Release_v4.4-GMC3/T210/Tegra210_Linux_R32.4.4_aarch64.tbz2 . Inside that archive they can be found inside Linux_for_Tegra/nv_tegra/nv_tools.tbz2 (go to usr/bin).

Then I pushed the following Docker configuration to my Nano with balena push:

# docker-compose.yml
version: '2'

services:
  my_nvidia_container:
    restart: always
    build: ./my-nvidia-container
    privileged: true

# my-nvidia-container/Dockerfile

FROM balenalib/jetson-nano-ubuntu:bionic

# Don't prompt with any configuration questions
ENV DEBIAN_FRONTEND noninteractive

# Enable hot plugging devices, compare
# https://www.balena.io/docs/reference/base-images/base-images/#working-with-dynamically-plugged-devices
ENV UDEV=1

# Prevent L4T from updating the Jetson's internal firmware (bootloader etc.) as this is
# BalenaOS's job.
RUN \
    mkdir -p /opt/nvidia/l4t-packages/ && \
    touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall && \
    apt-get update

# Balena's Docker images for Jetson include Nvidia's deb repository. Which JetPack release this
# repository will point to exactly, depends on the BalenaOS host version, compare
# https://github.com/balena-os/balena-jetson/blob/master/CHANGELOG.md 
RUN apt-get install -y nvidia-jetpack

RUN apt-get install -y python3-pip
RUN pip3 install jetson-stats

# Adapt the paths depending on where you downloaded jetson_clocks and tegrastats to
COPY ./jetson_clocks /usr/bin/jetson_clocks
COPY ./tegrastats /usr/bin/tegrastats

Finally, I connected to the container via SSH:

$ balena ssh <my nano's IP> my_nvidia_container
$ jtop service &
[INFO] jtop.core.common - fan loaded on /sys/devices/pwm-fan
[INFO] jtop.core.common - jetson_clocks loaded on /usr/bin/jetson_clocks
[WARNING] jtop.core.nvpmodel - This board does not have NVP Model
[WARNING] jtop.service - NVPmodel does not exist for this board in paths ['nvpmodel']
[INFO] jtop.core.common - tegrastats loaded on /usr/bin/tegrastats
[INFO] jtop.__main__ - jetson_stats server loaded
[WARNING] jtop.core.jetson_clocks - I can't store jetson_clocks is already running
[INFO] jtop.core.fan - Mode set default status=False

$ jtop

Boom.

As for the warnings regarding nvpmodel, I also extracted nvpmodel from the same archive as jetson_clocks and copied it over to my Docker container but the warning did not disappear.

PS: As an alternative to using the Nvidia deb repository and putting the missing binaries in place by hand, one could also extract all the packages directly from the aforementioned archive at https://developer.nvidia.com/embedded/L4T/r32_Release_v4.4/r32_Release_v4.4-GMC3/T210/Tegra210_Linux_R32.4.4_aarch64.tbz2 , compare e.g. this example.

UPDATE: Turns out you don't need the entire nvidia-jetpack package but only nvidia-l4t-tools for jtop to work.

ChulanZhang commented 2 years ago

> Hi @rbonghi
>
> Thanks for helping. Actually I'm finding a way to check if my docker container can access GPU on Jetson Xavier. Now I can use tegrastats in my container if I share /usr/bin/tegrastats by a volume. I think jtop will work if all necessary files are shared.
>
> Thanks very much!

Hi @yamiefun, may I ask how you did this? I added -v /usr/bin/tegrastats:/usr/bin/tegrastats when running my container, and I get an error when I try to use tegrastats:

tegrastats: /lib/aarch64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by tegrastats)
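
The GLIBC_2.29 message reads as a library mismatch: the tegrastats binary shared from the host appears to have been built against a newer glibc than the one in the container image. A minimal sketch for checking what the container provides, using the standard library's platform.libc_ver(). The helper names are illustrative, and the check assumes plain dotted version strings.

```python
import platform


def parse_version(text):
    """Turn a dotted version string such as '2.29' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_satisfies(required, available=None):
    """Check whether the available glibc version is at least the required one."""
    if available is None:
        # platform.libc_ver() returns (lib, version); version is '' when unknown.
        available = platform.libc_ver()[1]
    if not available:
        return False
    return parse_version(available) >= parse_version(required)


if __name__ == "__main__":
    print("container libc:", platform.libc_ver())
    print("meets GLIBC_2.29:", glibc_satisfies("2.29"))
```

If the check fails, sharing the host binary through a volume is unlikely to work with that image, and a base image whose glibc matches the host would be the thing to try.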

rbonghi commented 1 year ago

This error should now be fixed in the latest release, 4.0. Try installing it and check again:

sudo -H pip install -U jetson-stats