zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.62k stars 6.5k forks source link

Support circular DTS node references #57708

Open danieldegrasse opened 1 year ago

danieldegrasse commented 1 year ago

Is your enhancement proposal related to a problem? Please describe. Zephyr's current devicetree infrastructure does not support devicetree nodes having circular references. A few examples of this:

"Classic" circular reference

nodea: nodea {
    status = "okay";
    compatible = "vnd,deva";
    dev-ref = <&nodeb>;
};

nodeb: nodeb {
    status = "okay";
    compatible = "vnd,devb";
    dev-ref = <&nodea>;
};

Parent referencing child- this is also a circular reference

parent_node: parent {
    status = "okay";
    compatible = "vnd,dev-parent";
    child-ref = <&child_node>;
    child_node: child {
        compatible = "vnd,dev-child";
        status = "okay";
    };
};

Any of these devicetrees will fail to compile, as the current devicetree scripting cannot create dependency ordinal numbers for all devicetree nodes when a circular reference exists. The key issue here is that we need to determine which of these nodes has a higher initialization priority, which is not clear from the circular reference.

Describe the solution you'd like A method to support circular references in devicetree- One method that has been proposed by @mbolivar to handle this is treating circular references as a block of strongly connected components: https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm.

Additional context While this issue will affect any devicetree, the core driver behind the request is a way to support the sometimes complicate devicetree infrastructure needed to accurately describe display pipelines. One such example:

&mipi_dsi {
    status = "okay";

    ports {
        #address-cells = <1>;
        #size-cells = <0>;
        port@0 {
            reg = <0>;
            dsi_in: endpoint {
                remote-endpoint = <&dcnano_out>;
            };
        };

        port@1 {
            reg = <1>;
            dsi_out: endpoint {
                remote-endpoint = <&dsi_panel_in>;
            };
        };
    };

    rm68200@0 {
        status = "okay";
        compatible = "raydium,rm68200";
        reg = <0x0>;
        reset-gpios = <&gpio3 21 GPIO_ACTIVE_HIGH>;
        data-lanes = <2>;
        width = <720>;
        height = <1280>;
                pixel-clock = <6200000>;
        pixel-format = <MIPI_DSI_PIXFMT_RGB565>;
        width = <720>;
        height = <1280>;
        hsync = <8>;
        hfp = <32>;
        hbp = <32>;
        vsync = <2>;
        vfp = <16>;
        vbp = <14>;
        port {
            dsi_panel_in: endpoint {
                remote-endpoint = <&dsi_out>;
            };
        };
    };
};

cc @gmarull

mbolivar-nordic commented 1 year ago

I want to make clear that especially with my upcoming departure from Nordic, I have no plans to work on this any time soon. I also want to point out that @gmarull 's close collaboration will be needed, since any solution needs to be compatible with what we want for https://github.com/zephyrproject-rtos/zephyr/issues/22545, and that's really in the device model

josuah commented 5 months ago

needed to accurately describe display pipelines

It turns out that remote-endpoint is not used at all in the current drivers.

I would be glad to know more about the model you had in mind, as I wish to do the same for implementing UVC.

josuah commented 5 months ago

It looks like fetching the remote-endpoint works... I could get it going from end to end. I had to add the remote-endpoint property on a .yaml file, maybe that helped?

compatible: blah,blah

child-binding:
  child-binding:
    properties:
      remote-endpoint:
        type: phandle
        required: true

Then the devicetre actually started to validate the remote-endpoint like it is on Linux: requiring a parent that is called endpoint, whose parent is called port. So the validation logic seems hardcoded somehow, but this works as Video4Zephyr seems to preserve Video4Linux devicetree syntax.

josuah commented 5 months ago

Maybe this is the bug you encountered?

-- west build: generating a build system
Loading Zephyr default modules (Zephyr base).
-- Application: /home/tinyvision/zephyrproject/tinyclunx33_zephyr_example/example_usb_uvc
-- CMake version: 3.29.0
-- Found Python3: /home/tinyvision/.local/bin/python3 (found suitable version "3.11.8", minimum required is "3.8") found components: Interpreter
-- Cache files will be written to: /home/tinyvision/.cache/zephyr
-- Zephyr version: 3.6.99 (/home/tinyvision/zephyrproject/zephyr)
-- Found west (found suitable version "1.2.0", minimum required is "0.14.0")
-- Board: tinyclunx33, Revision: 2, identifier: litex_vexriscv
-- Found toolchain: cross-compile (/usr/bin/riscv-none-elf-)
-- Found Dtc: /usr/bin/dtc (found suitable version "1.7.0", minimum required is "1.4.6")
-- Found BOARD.dts: /home/tinyvision/zephyrproject/zephyr/boards/tinyvision/tinyclunx33/tinyclunx33.dts
-- Found devicetree overlay: /home/tinyvision/zephyrproject/tinyclunx33_zephyr_example/example_usb_uvc/app.overlay
Traceback (most recent call last):
  File "/home/tinyvision/zephyrproject/zephyr/scripts/dts/gen_defines.py", line 1103, in <module>
    main()
  File "/home/tinyvision/zephyrproject/zephyr/scripts/dts/gen_defines.py", line 97, in main
    write_top_comment(edt)
  File "/home/tinyvision/zephyrproject/zephyr/scripts/dts/gen_defines.py", line 226, in write_top_comment
    err("cycle in devicetree involving "
  File "/home/tinyvision/zephyrproject/zephyr/scripts/dts/gen_defines.py", line 1099, in err
    raise Exception(s)
Exception: cycle in devicetree involving /soc/usb@b0000000/zephyr_uvc0/input-terminal, /soc/usb@b0000000/zephyr_uvc0/input-terminal/port, /soc/tinygen@b1100000, /soc/tinygen@b1100000/port, /soc/tinygen@b1100000/port/endpoint, /soc/usb@b0000000/zephyr_uvc0/input-terminal/port/endpoint
-- In: /home/tinyvision/zephyrproject/build/zephyr, command: /home/tinyvision/.local/bin/python3;/home/tinyvision/zephyrproject/zephyr/scripts/dts/gen_defines.py;--dts;/home/tinyvision/zephyrproject/build/zephyr/zephyr.dts.pre;--dtc-flags;'';--bindings-dirs;/home/tinyvision/zephyrproject/zephyr/dts/bindings;--header-out;/home/tinyvision/zephyrproject/build/zephyr/include/generated/devicetree_generated.h.new;--dts-out;/home/tinyvision/zephyrproject/build/zephyr/zephyr.dts.new;--edt-pickle-out;/home/tinyvision/zephyrproject/build/zephyr/edt.pickle;--vendor-prefixes;/home/tinyvision/zephyrproject/zephyr/dts/bindings/vendor-prefixes.txt
CMake Error at /home/tinyvision/zephyrproject/zephyr/cmake/modules/dts.cmake:283 (message):
  gen_defines.py failed with return code: 1
Call Stack (most recent call first):
  /home/tinyvision/zephyrproject/zephyr/cmake/modules/zephyr_default.cmake:132 (include)
  /home/tinyvision/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:66 (include)
  /home/tinyvision/zephyrproject/zephyr/share/zephyr-package/cmake/ZephyrConfig.cmake:92 (include_boilerplate)
  CMakeLists.txt:2 (find_package)

-- Configuring incomplete, errors occurred!
FATAL ERROR: command exited with status 1: /usr/bin/cmake -DWEST_PYTHON=/home/tinyvision/.local/bin/python3 -B/home/tinyvision/zephyrproject/build -GNinja -DBOARD=tinyclunx33 -S/home/tinyvision/zephyrproject/tinyclunx33_zephyr_example/example_usb_uvc

This is not reproducible by anyone (custom project), but at least shows some error message...

I only have it when two remote-endpoint points to the same phandle. When two remote-endpoint are interconnected, this seems to go smoothly. And given the remote-endpoint is hardcoded in the C libraries parsing device-trees, we might be in luck...

This change is what triggered the message above, otherwise it worked well:

diff --git a/example_usb_uvc/app.overlay b/example_usb_uvc/app.overlay
index 2a7c08c..ef16bbf 100644
--- a/example_usb_uvc/app.overlay
+++ b/example_usb_uvc/app.overlay
@@ -1,45 +1,56 @@
 &zephyr_udc0 {
        /* IN and OUT counting together as one endpoint */
        num-bidir-endpoints = <2>;

        /* Number of transfer requests for each endpoint */
        num-endpoint-trb = </* EP0 */ 2 2 /* EP1 */ 0 32>;

        /* full-speed: 12 Mhz, high-speed: 480 Mhz, super-speed: 5000 Mhz */
        maximum-speed = "high-speed";

        uvc0: zephyr_uvc0 {
                compatible = "zephyr,uvc";
                status = "okay";

                uvc0in: input-terminal {
                        compatible = "zephyr,uvc-input-terminal";

                        port {
                                uvc0ep: endpoint {
                                        remote-endpoint = <&gen0out>;
                                };
                        };
                };

                uvc0out: output-terminal {
                        compatible = "zephyr,uvc-output-terminal";
                        source = <&uvc0in>;
                };
        };
 };

 / {
        soc {
                tinygen@b1100000 {
                        compatible = "tinyvision,tinygen";
                        reg = <0xb1100000 0x0010000>;

                        port {
                                gen0out: endpoint {
                                        remote-endpoint = <&uvc0ep>;
                                };
                        };
                };
+
+
+               something-weird {
+                       compatible = "zephyr,uvc-input-terminal";
+
+                       port {
+                               endpoint {
+                                       remote-endpoint = <&uvc0ep>;
+                               };
+                       };
+               };
        };
 };
josuah commented 5 months ago

Further insights by tinkering with yaml devicetree bindings:

Given zephyr/dts/bindings/video/video-device.yaml as:

# Copyright (c) 2024 tinyVision.ai Inc.
# SPDX-License-Identifier: Apache-2.0

description: |
  Video device common definitions
  This follows some of V4L2 conventions:
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/devicetree/bindings/media/video-interfaces.yaml

child-binding:
  description: |
    A port entity of a device
    Allowing devices with multiple input and output connection, such as an
    HDMI stream selector with many input ports and one output.

  child-binding:
    description: |
      An endpoint entity of a port
      Allowing devices with multiple channels per port, such as MIPI virtual
      channels.

    properties:
      remote-endpoint:
        type: phandle
        required: true
        description: |
          Reference to a Video endpoint
          This Video endpoint should implement the "video" driver API, and will
          provide data either directly, or through an associated video device.
          For instance, an image sensor providing data over a MIPI peripheral.

And the given app.overlay:

&zephyr_udc0 {
        /* IN and OUT counting together as one endpoint */
        num-bidir-endpoints = <2>;

        /* Number of transfer requests for each endpoint */
        num-endpoint-trb = </* EP0 */ 2 2 /* EP1 */ 0 32>;

        /* full-speed: 12 Mhz, high-speed: 480 Mhz, super-speed: 5000 Mhz */
        maximum-speed = "high-speed";

        uvc0: zephyr_uvc0 {
                compatible = "zephyr,uvc";
                status = "okay";

                uvc0in: input-terminal {
                        compatible = "zephyr,uvc-input-terminal";

                        port {
                                uvc0ep: endpoint {
                                        remote-endpoint = <&gen0out>;
                                };
                        };
                };

                uvc0out: output-terminal {
                        compatible = "zephyr,uvc-output-terminal";
                        source = <&uvc0in>;
                };
        };
};

/ {
        soc {
                tinygen@b1100000 {
                        compatible = "tinyvision,tinygen";
                        reg = <0xb1100000 0x0010000>;

                        port {
                                gen0out: endpoint {
                                        remote-endpoint = <&uvc0ep>;
                                };
                        };
                };
        };
};

This works

zephyr/dts/bindings/video/tinyvision,tinygen.yaml:

description: small pattern generator
compatible: "tinyvision,tinygen"
include: video-device.yaml

zephyr/dts/bindings/usb/uvc/zephyr,uvc-input-terminal.yaml:

description: USB Video Class Input Terminal entity
compatible: "zephyr,uvc-input-terminal"
#include: video-device.yaml

This works

zephyr/dts/bindings/video/tinyvision,tinygen.yaml:

description: small pattern generator
compatible: "tinyvision,tinygen"
#include: video-device.yaml

zephyr/dts/bindings/usb/uvc/zephyr,uvc-input-terminal.yaml:

description: USB Video Class Input Terminal entity
compatible: "zephyr,uvc-input-terminal"
include: video-device.yaml

This fails

zephyr/dts/bindings/video/tinyvision,tinygen.yaml:

description: small pattern generator
compatible: "tinyvision,tinygen"
include: video-device.yaml

zephyr/dts/bindings/usb/uvc/zephyr,uvc-input-terminal.yaml:

description: USB Video Class Input Terminal entity
compatible: "zephyr,uvc-input-terminal"
include: video-device.yaml

With the error that the tinyvision,tinygen node and zephyr,uvc-input-terminal both have the _ORD number set to -1:

devicetree_generated.h.txt

Maybe this is the same bug.

At least, in my case, the temporary workaround is to set a YAML binding rule like child-node: child-node: property: remote-endpoint: type: phandle on exactly one of the two link endpoints. This is terribly ad-hoc, but allows to keep experimenting, to hopefully discover more about this bug.

danieldegrasse commented 4 months ago

At least, in my case, the temporary workaround is to set a YAML binding rule like child-node: child-node: property: remote-endpoint: type: phandle on exactly one of the two link endpoints. This is terribly ad-hoc, but allows to keep experimenting, to hopefully discover more about this bug.

I am not a devicetree expert- but my understanding is that by only adding the child-node properties pulled in by including video-device.yaml, the devicetree python scripting would only parse the remote-endpoint for one of the endpoints, not both. If the python scripting skips a node in devicetree (IE does not validate it), it will not be included in the output generated devicetree header file- so it wouldn't be usable from a Zephyr driver. Were you able to find a way around this issue, or does it not occur for you?

decsny commented 4 months ago

What are the actual use cases of circular references in DT? Ie why do nodes need to reference each other, if their corresponding devices truly both depend on each other being initialized, then it seems impossible to initialize either correctly? In that case it seems like it's not a problem from DT, but from some part of the software architecture being nonsensical, and don't see how allowing this in DT would solve a problem, it would only reveal the more fundamental problem with the software design.

Otherwise, is it reasonable to decide that DT nodes with circular references just would not imply dependencies in either direction? or maybe even have the same init priority (doesn't seem strictly necessary in all cases, though)?

josuah commented 4 months ago

If there is a convention that building pipelines is always done with the source referring the sink (or the other way around), then there might not need any need for going the other direction...

Following the flow of the data down the pipeline seems more intuitive: data is pushed from the source to the sink whenever it arrives (rather than pulled by the sink).

That might depend if controls need to be propagated in both direction: how is the pipeline driven. If one needs to query "where does the data comes from". Maybe a display that tries to negotiate the source parameters when the user hits buttons to pick a different resolution...

josuah commented 4 months ago

It looks like there is a guard against circular dependencies in dtc precisely for remote-endpoint: https://github.com/torvalds/linux/blob/dccb07f2914cdab2ac3a5b6c98406f765acab803/scripts/dtc/checks.c#L1765-L1766

bbilas commented 2 months ago

I encountered the same issue while developing a custom interlocking mechanism that requires circular references. Has anyone tried to fix this?