zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.94k stars 6.66k forks source link

Composable binding for adding multiple SPI devices via shield, overlay, etc. #52948

Open glarsennordic opened 1 year ago

glarsennordic commented 1 year ago

Two device tree overlays cannot specify chip-select pins on a single SPI device.

For example, if a board revision introduces a device on an SPI bus, and then a shield also introduces a device on the same bus, the shield's CS pin selection will override the CS pin selection of the board revision.

This is because chip-select pins are configured as a unified property on the SPI device tree node itself, rather than on each sub-node, and Device Tree does not mix the introduced CS pin selections from each overlay.

As a concrete example, in NCS, we have the shield nrf7002_ek whose overlay contains:

&arduino_spi {
    status = "okay";
    cs-gpios = <&arduino_header 16 GPIO_ACTIVE_LOW>; /* D10 */

    nrf7002: nrf7002@0 {
        ....
    };
};

We also have a board revision of nrf9160dk_nrf9160_ns named 0_14_0, whose overlay contains:

&spi3 {
    cs-gpios = <&arduino_header 16 GPIO_ACTIVE_LOW>, /* D10 */
           <&gpio0 25 GPIO_ACTIVE_LOW>;
    mx25r64: mx25r6435f@1 {
        ....
    };
};

(Note that &arduino_spi is an alias for &spi3 with this board)

The default for cs-gpios also happens to be <&arduino_header 16 GPIO_ACTIVE_LOW>;

Now, compare the following subsections of zephyr.dts after various build commands:

Shield

west build -p -b nrf9160dk_nrf9160_ns -- -DSHIELD=nrf7002_ek

spi3: arduino_spi: spi@b000 {
    ...
    cs-gpios = < &arduino_header 0x10 0x1 >;
    ...
}

Revision

west build -p -b nrf9160dk_nrf9160_ns@0_14_0

spi3: arduino_spi: spi@b000 {
    ...
    cs-gpios = < &arduino_header 0x10 0x1 >, < &gpio0 0x19 0x1 >;
    ...
}

Neither

west build -p -b nrf9160dk_nrf9160_ns

spi3: arduino_spi: spi@b000 {
    ...
    cs-gpios = < &arduino_header 0x10 0x1 >;
    ...
}

Shield & Revision

west build -p -b nrf9160dk_nrf9160_ns@0_14_0 -- -DSHIELD=nrf7002_ek

spi3: arduino_spi: spi@b000 {
    ...
    cs-gpios = < &arduino_header 0x10 0x1 >;
    ...
}

When both shield and board revision are active, only the CS selection for the shield is entered into spi3.

This prevents both the shield and the board revision being used together.

In this particular case, a simple solution is to remove the CS specification from the shield, but that would not work if the shield did not happen to use what was already the default CS selection for the board.

I argue that Zephyr needs a way for device tree overlays to mix CS pin selections, although I am not sure how this should be accomplished, since this seems to be a limitation of Device Tree itself

mbolivar-nordic commented 1 year ago

This indeed is a limitation in the devicetree binding for SPI controllers. The single cs-gpios property, and the use of reg, leads to a solution which is not composable via overlays without conflicts of the type you documented above.

Options are, I guess:

  1. discuss upstream in linux and come up with an extended spi DT binding that is more composable
  2. extend our bindings for SPI controllers and devices to allow for composition in a backwards-compatible but zephyr-specific way

Neither one is simple. I'm not sure in particular what the implications of either eliminating reg from an "extended" binding, or allowing for reg conflicts, might be on existing drivers. I did a quick grep in drivers/spi and a lot of drivers are using the slave field from struct spi_config in what look like hardware-specific ways, as you might expect, so reg conflicts are likely nontrivial too.

mbolivar-nordic commented 1 year ago

@glarsennordic I think I may have spoken too soon when I asked you to file this as a 'bug'. I'm going to mark this as an enhancement and change the title a bit, since we are dealing with a limitation that is working as designed, and no easy solution seems to be at hand.

xyzzy42 commented 1 year ago

This is problem I've been annoyed by in Linux for over a decade.

The classic workaround I've found in Linux dts overlays or includes, is that the overlay which adds the new device also replaces the entire cs list property in the controller's node. E.g.:

/* base dts */
spi0: spi { cs-gpios = <&gpio0  1 0>; dev0@0 { reg=<0>; }; };

/* overlay-add-dev-1.dts */
&spi0 {
    cs-gpios = <&gpio0 1 0>, <&gpio0 2 0>; // added another gpio to the end
    dev1@1 { reg = <1>; }; // New device
};

/* overlay-add-dev2.dts */
&spi0 {
    cs-gpios = <&gpio0 1 0>, <&gpio0 3 0>; // added a different gpio to the end
    dev2@1 { reg = <1>; }; // New device
};

/* overlay-add-dev1-and-dev2.dts */
&spi0 {
    cs-gpios = <&gpio0 1 0>, <&gpio0 2 0>, <&gpio0 3 0>; // added both to the end
    dev1@1 { reg = <1>; };
    dev2@2 { reg = <2>; };
};

The problem here is we need to make a unique overlay for every possible combination of devices. I've seen Linux BSPs where there are dozens of overlays created to accomplish this.

In Linux, you often have a pinctrl-0 list that needs to be extended the same way as cs-gios. And not just for SPI. Adding more MIPI cameras to a multi-lane controller might need to add pinctrl's to the shared controller node for each lane that gets used.

So I think it's better to think of this not as only a SPI and cs-gpios problem, but as a more general problem of appending to a list in a dts fragment or overlay (I'm aware overlays are runtime thing not directly applicable to Zephyr).

I've thought of solving this via an extension to the dts language. Which has been done already for deleting nodes and properties, with the syntax:

/delete-node/ &dev1; // remove a node, rather than status = "disabled"
/delete-property/ boolean-prop; // remove a property named 'boolean-prop', which is of course quite different than boolean-prop = <0>

My proposal would be follow this pattern and allow for appending to arrays. This is sort of like how a Yocto recipe bbappend works. You can append, prepend, or remove an item from a list to e.g. add a new patch file or remove a configure option, without needing to duplicate the existing contents of the list.

&spi0 {
    /append-property CSIDX/ cs-gpios = <&gpio0 2 0>; // adds to the end of the existing cs-gpios property.
    dev2@CSIDX {
        reg = <CSIDX>;
    };
};

The identifier after append-property will be substituted with the index of newly added item. One could create prepend and remove operations too for orthogonality, but append seems like the most useful one.

mbolivar-nordic commented 1 year ago

dev2@CSIDX {

Where is this magic CSIDX macro coming from? I don't see how to solve the problem in this issue without a clear way to auto-increment that in some clean way from overlays. And even then you may run in to conflicts with the base BOARD.dts.

So I think it's better to think of this not as only a SPI and cs-gpios problem, but as a more general problem of appending to a list

I agree that that is a valid generalization of the problem discussed in this issue, but there are lots of problems with DTS syntax if you want to go in that direction. It's not just the lack of support for appending; modifying individual cells is a problem too. See e.g. the problem of modifying an interrupt priority in zephyr on a cortex-M. You have to manually copy the IRQ number since that's cell 0, then put your modified priority. Fun.

xyzzy42 commented 1 year ago

dev2@CSIDX {

Where is this magic CSIDX macro coming from? I don't see how to solve the problem in this issue without a clear way to auto-increment that in some clean way from overlays. And even then you may run in to conflicts with the base BOARD.dts.

It came from here: "The identifier after append-property will be substituted with the index of newly added item."

So the same code in the dts parser that processes /delete-property/ and then /append-property <id>/ knows to expect an identifier to follow append-property and sets it up to be substituted with the index of newly added item, which dtc clearly knows since it just added it to the property. Maybe the syntax would be better as reg =<&CSIDX>;, as it's more like a handle than a macro.

I think avoids any problems with the base board dts. It can have one device and then the fragment appends a second to the end of the list, and then later the base is modified to have two devices and now the fragment will append a third to the end without needing to change in any way.

Lists like pinctrl, where you don't need to refer to the index of the added item, are even easier.

So I think it's better to think of this not as only a SPI and cs-gpios problem, but as a more general problem of appending to a list

I agree that that is a valid generalization of the problem discussed in this issue, but there are lots of problems with DTS syntax if you want to go in that direction. It's not just the lack of support for appending; modifying individual cells is a problem too. See e.g. the problem of modifying an interrupt priority in zephyr on a cortex-M. You have to manually copy the IRQ number since that's cell 0, then put your modified priority. Fun.

So in Yocto layers, sometimes one has this problem too, wanted to change an existing item. Usually the method is to think of it as a remove + append. Say we want to move a device to an alternate set of pins or change drive strength:

/remove-property/ pinctrl-0 = <&pinctrl_dev_1a>;  // delete the matching item in the property
/append-property/ pinctrl-0 = <&pinctrl_dev_1b>;  // append new item

This requires that the original order of the items doesn't matter and we want to replace an item as identified by its value rather than its index. Which I think is conceptually correct for this pinctrl example.

I'm not familiar with the interrupt priority dts problem. Do you have a link?

I also think it's worth pointing out that any solution that involves new syntax for spi bindings can't possibly fix cortex-m interrupt assignment either!

petejohanson commented 3 months ago

@JordanYates Did you intend to close this? I don't see any recent PRs merged that would seem to change this?

JordanYates commented 3 months ago

I have no idea why Github decided to automatically close this, I just added a link to the issue in a downstream PR I merged :/ image