home-assistant / supervisor

/proc/device-tree dead link in addon with `devicetree: true`... #4863

Closed mcarbonneaux closed 1 week ago

mcarbonneaux commented 9 months ago

Describe the issue you are experiencing

In an add-on with devicetree: true, the /device-tree filesystem is mounted correctly.

However, the symlink /proc/device-tree, which points to /sys/firmware/devicetree/base (and works on the Docker host), points to a nonexistent directory inside the container.

Could you add the /sys/firmware/devicetree directory (that location is read-only in the container) and a link from /sys/firmware/devicetree/base to /device-tree?

This makes it difficult to use Frigate on the ODROID-M1 with the Rockchip NPU. See https://github.com/blakeblackshear/frigate-hass-addons/issues/145

What operating system image do you use?

odroid-m1 (Hardkernel ODROID-M1)

What version of Home Assistant Operating System is installed?

11.4

Did you upgrade the Operating System.

Yes

Steps to reproduce the issue

Use an add-on with devicetree: true and run the following inside the add-on container:

# ls -al /proc/device-tree

The target of the link (/sys/firmware/devicetree/base) is shown in red, meaning the link is broken.
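The same check can be scripted from inside the add-on container. This is only a minimal sketch: the helper name `describe_link` is made up for illustration, and it simply reports a symlink's target and whether that target exists.

```python
import os


def describe_link(path):
    """Return (target, resolves) for a symlink, or (None, exists) for a non-link."""
    if os.path.islink(path):
        target = os.readlink(path)
        # os.path.exists() follows symlinks, so it is False for a dangling link.
        return target, os.path.exists(path)
    return None, os.path.exists(path)


if __name__ == "__main__":
    target, resolves = describe_link("/proc/device-tree")
    print(f"/proc/device-tree -> {target} (resolves: {resolves})")
```

On an affected container this would be expected to report the target /sys/firmware/devicetree/base with `resolves: False`, matching the red entry in the `ls` output.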

Anything in the Supervisor logs that might be useful for us?

na

Anything in the Host logs that might be useful for us?

the link is ok on the host.

System information

System Information

version core-2024.1.3
installation_type Home Assistant OS
dev false
hassio true
docker true
user root
virtualenv false
python_version 3.11.6
os_name Linux
os_version 6.1.71-haos
arch aarch64
timezone Europe/Paris
config_dir /config
Home Assistant Community Store

GitHub API ok
GitHub Content ok
GitHub Web ok
GitHub API Calls Remaining 5000
Installed Version 1.33.0
Stage running
Available Repositories 1378
Downloaded Repositories 3

Home Assistant Cloud

logged_in false
can_reach_cert_server ok
can_reach_cloud_auth ok
can_reach_cloud ok

Home Assistant Supervisor

host_os Home Assistant OS 11.4
update_channel stable
supervisor_version supervisor-2023.12.1
agent_version 1.6.0
docker_version 24.0.7
disk_total 458.4 GB
disk_used 8.5 GB
healthy true
supported true
board odroid-m1
supervisor_api ok
version_api ok
installed_addons Asterisk (4.1.2), Frigate Rockchip Beta (0.13.0) (0.13.0-rc1-rk), File editor (5.7.0), Advanced SSH & Web Terminal (17.0.4), RPC Shutdown (2.4), Mosquitto broker (6.4.0)

Dashboards

dashboards 1
resources 0
mode auto-gen

Recorder

oldest_recorder_run January 17, 2024 at 18:18
current_recorder_run January 17, 2024 at 21:15
estimated_db_size 0.50 MiB
database_engine sqlite
database_version 3.41.2

Additional information

No response

agners commented 8 months ago

Hm, I see. The Supervisor currently bind mounts /sys/firmware/devicetree/base to /device-tree. For compatibility reasons we can't remove the existing mount, but we could add another bind mount.

I guess the question here becomes is it possible to bind mount /sys/firmware/devicetree/base to /sys/firmware/devicetree/base :thinking:
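To make the idea concrete, here is a rough sketch of the two mounts side by side. It is not the Supervisor's actual code: the function name `devicetree_mounts` is hypothetical, the dictionaries only imitate the shape of Docker Engine API "Mounts" entries, and the read-only flag is an assumption.

```python
DT_SOURCE = "/sys/firmware/devicetree/base"


def devicetree_mounts(keep_legacy=True):
    """Build bind-mount specs shaped like Docker Engine API "Mounts" entries."""
    # Proposed addition: expose the tree at the same path inside the container,
    # so that the kernel-provided /proc/device-tree symlink resolves.
    targets = [DT_SOURCE]
    if keep_legacy:
        # Existing mount point, kept for compatibility.
        targets.insert(0, "/device-tree")
    return [
        # ReadOnly=True is an assumption made for this sketch.
        {"Type": "bind", "Source": DT_SOURCE, "Target": t, "ReadOnly": True}
        for t in targets
    ]


for m in devicetree_mounts():
    print(f'{m["Source"]} -> {m["Target"]}')
```

Whether Docker accepts a bind mount onto a path under /sys in an unprivileged container is exactly the open question here, so the sketch only shows what would be requested.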

mcarbonneaux commented 8 months ago

> I guess the question here becomes is it possible to bind mount /sys/firmware/devicetree/base to /sys/firmware/devicetree/base 🤔

I think so!

My original proposal was to create a link, but a bind mount is fine too!

agners commented 8 months ago

The /proc file system is special; I don't think it is possible to change the symlink at /proc/device-tree. I am guessing this is hardcoded in the kernel.

MarcA711 commented 8 months ago

yes, you are right. It seems impossible to create a symlink in or a bind mount to /proc. This means I will have to find another solution. Thank you for your help anyways!

MarcA711 commented 8 months ago

I finally found out how to access the device tree within the container. It works with this command:

docker run -it --security-opt systempaths=unconfined --security-opt apparmor=unconfined debian bash

I guess disabling AppArmor with --security-opt apparmor=unconfined is what apparmor: false in the config.yml file already does. So it would be great if there were another option in config.yml that allows access to system paths by adding --security-opt systempaths=unconfined (e.g. using systempaths: false/true).

I hope you will consider it!
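As an illustration of the request, the mapping from add-on options to container flags could look roughly like this. It is a sketch under assumptions: the function `security_opts` and the key `systempaths_unconfined` are hypothetical (no such add-on option exists today), and the `apparmor: false` mapping is the guess stated above, not confirmed Supervisor behavior.

```python
def security_opts(config):
    """Translate (partly hypothetical) add-on config keys into docker run arguments."""
    args = []
    # Assumed mapping: `apparmor: false` disables the AppArmor profile.
    if not config.get("apparmor", True):
        args += ["--security-opt", "apparmor=unconfined"]
    # Hypothetical key for the proposed option; off by default.
    if config.get("systempaths_unconfined", False):
        args += ["--security-opt", "systempaths=unconfined"]
    return args


print(security_opts({"apparmor": False, "systempaths_unconfined": True}))
```

Keeping the new option off by default would leave existing add-ons unchanged, since Docker masks these system paths unless told otherwise.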

agners commented 8 months ago

@MarcA711 maybe you misunderstood; my intention was to fix the actual symlink.

> I guess the question here becomes is it possible to bind mount /sys/firmware/devicetree/base to /sys/firmware/devicetree/base 🤔

I am fairly certain that this actually works :point_up_2: . I just haven't gotten around to looking into it. Stay tuned!

MarcA711 commented 8 months ago

Thank you for looking into this! Just two more comments to keep in mind:

> Userspace must not use the /sys/firmware/devicetree/base path directly, but instead should follow /proc/device-tree symlink. It is possible that the absolute path will change in the future, but the symlink is the stable ABI.

Software that sticks to this convention will not work inside the addon if /proc/device-tree is not mounted.
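For software that can be adapted, one possible workaround is to try the stable path first and fall back to the Supervisor's /device-tree mount. This is a minimal sketch, not taken from any existing project; the helper name `read_dt_string` and the fallback order are assumptions.

```python
from pathlib import Path

# Stable ABI path first, then the Supervisor's bind mount as a fallback.
DT_ROOTS = ("/proc/device-tree", "/device-tree")


def read_dt_string(prop, roots=DT_ROOTS):
    """Return a device-tree string property, or None if no root provides it."""
    for root in roots:
        try:
            raw = (Path(root) / prop).read_bytes()
        except OSError:
            continue  # dangling symlink, missing mount, or missing property
        # Device-tree string properties are NUL-terminated.
        return raw.rstrip(b"\x00").decode("utf-8", errors="replace")
    return None


print(read_dt_string("model"))
```

This does not help third-party binaries that hardcode /proc/device-tree, which is why a fix on the Supervisor side is still needed.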

YuryMcv commented 8 months ago

@agners Hi, could you consider allowing the add-on container to run in privileged mode, for example by adding "privileged == full" to config.yml? To use the Rockchip NPU it is not enough to just mount /proc/device-tree => /sys/firmware/devicetree/base; more permissions are needed. An attempt to use the NPU in a container in unprivileged mode results in this error:

2024-02-12 19:35:55.670507576 E RKNN: [19:35:55.670] failed to open rknpu module, need to insmod rknpu dirver!
2024-02-12 19:35:55.670597116 E RKNN: [19:35:55.670] failed to open rknn device!
2024-02-12 19:35:55.771905809 E Catch exception when init runtime!
2024-02-12 19:35:55.775398747 E Traceback (most recent call last):
2024-02-12 19:35:55.775412746 File "/usr/local/lib/python3.9/dist-packages/rknnlite/api/rknn_lite.py", line 148, in init_runtime
2024-02-12 19:35:55.775469329 self.rknn_runtime.build_graph(self.rknn_data, self.load_model_in_npu)
2024-02-12 19:35:55.775485953 File "rknnlite/api/rknn_runtime.py", line 875, in rknnlite.api.rknn_runtime.RKNNRuntime.build_graph
2024-02-12 19:35:55.775591535 Exception: RKNN init failed. error code: RKNN_ERR_FAIL

Running the container with --cap-add ALL does not solve the problem; when running in privileged mode, everything works fine. From what I have observed, this error occurs when the NPU is used without root and disappears when sudo is added. Or perhaps you could suggest another solution.

mcarbonneaux commented 7 months ago

> 2024-02-12 19:35:55.670507576 E RKNN: [19:35:55.670] failed to open rknpu module, need to insmod rknpu dirver!
> 2024-02-12 19:35:55.670597116 E RKNN: [19:35:55.670] failed to open rknn device!

I've opened a separate issue (https://github.com/home-assistant/operating-system/issues/3089) for adding the rknpu module. It was closed; we must wait for the driver to be integrated into the mainline kernel. This issue is still useful on its own, not only for running rknpu but also for other programs that look for the device tree in /proc or in /sys/firmware/devicetree/base.

MarcA711 commented 7 months ago

@agners Sorry for pinging! I wanted to ask whether there has been any progress here. I think this issue could be resolved by adding an option that passes the command-line option --security-opt systempaths=unconfined.

github-actions[bot] commented 6 months ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

rightaditya commented 5 months ago

[Mostly to prevent issue from being auto-closed] I believe this is still an active issue.

nyok92 commented 5 months ago

Hi @agners @MarcA711, is there any news on this? I am available to test anything if needed. There may be mainline kernel driver work ongoing: https://blog.tomeuvizoso.net/2024/04/rockchip-npu-update-3-real-time-object.html?m=1 Thanks

MarcA711 commented 5 months ago

Hey, unfortunately no news from my side. I would happily work on a HA add-on for frigate-rockchip, once there is a way to do it.

nyok92 commented 4 months ago

Hi @MarcA711, the NPU kernel driver has been submitted to mainline for review: NPU

Would it help to have access to it in hassio add-ons (Frigate, ...)? Thanks

MarcA711 commented 4 months ago

Hey, thank you!

I follow the kernel upstream process very closely and want to support mainline in Frigate. We need full NPU support (this is just the kernel driver; userspace is still WIP and limited at this point) as well as video decoding and encoding. Decoding is WIP; encoding is not being worked on, afaik. It will take a couple of months until hardware NPU and video processing support is upstream. Then I can maybe work on an add-on for HA Supervised. For HAOS it will take another couple of months until HAOS uses a supported kernel (HAOS is usually a couple of versions behind for stability).

I am somewhat disappointed that such a (I think) small feature didn't make it. However, I understand that HA is a big project with a lot of stuff that needs to be worked on.

rightaditya commented 2 months ago

Begone, stalebot!

UnleashSpirit commented 1 month ago

I have the same issue while using the plugin https://github.com/ironsheep/RPi-Reporter-MQTT2HA-Daemon. MQTT says: cat: can't open '/proc/device-tree/model': No such file or directory

Vijabei commented 1 month ago

Is there an easy way to test whether the NPU will be used by a given AI model? I am using HAOS on an Odroid M1. As I am not very familiar with everything yet, I would be happy to do some testing with some guidance.

MarcA711 commented 1 month ago

Hey,

To use the NPU you have to write code (Python or C). See the docs here: https://github.com/airockchip/rknn-toolkit2 and examples here: https://github.com/airockchip/rknn_model_zoo

I wrote this code to perform inference for object detection: https://github.com/blakeblackshear/frigate/blob/dev/frigate/detectors/plugins/rknn.py

Moreover, you need to convert your model to the rknn format.

There might be some projects that help you convert the model or perform inference. Try searching on GitHub.
