zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.89k stars 6.63k forks source link

The history of the multi API / MFD discussions 2022 July - Sep #50621

Closed bjarki-andreasen closed 2 years ago

bjarki-andreasen commented 2 years ago

Intruduction

This issue will go through the entire MFD / multi API device model discussions starting in with the modem driver design issue. Please note that this is summarizing multiple PRs and issues with about 100 comments across them, some details will be left out.

The issue with modem driver design July 2022

Modems often contain multiple functions, which are not common between all modems. The BG96 contains a cellular modem, GNSS modem and GPIO controller. Each of these modules support multiple features. A simple graph of features is shown below:

BG96 {
    DFOTA (download firmware over the air)

    Cellular modem {
        network interface
        dtm (direct test mode)
        cellular info (RSSI, IMEI, towers)
        SMS
    }

    GNSS {
        location (lat, long, speed, altitude etc.)
        gnss (tracking mode, satelite count, power modes, NMEA frames)
    }

    QuecLocator (cellular location) {
        location (lat, long, speed, altitude etc.)
    }
}

This is the basis for the need for multiple APIs since the BG96 contains multiple functions, which may contain multiple APIs each. This issue was brought up at the architecture meeting, and no solution to the problem was suggested at that time.

The MFD devicetree parent/child solution July 2022 https://github.com/zephyrproject-rtos/zephyr/pull/47655

This solution to the problem is simple, just add every function or feature which requires its own API, be it an actual device or just a device struct being used as a handle, to the devicetree. Then instanciate a single driver for the top parent node which instanciates device instances for all features (child devices).

The issue proposed design rules and some macros which would help to keep the design and implementations of MFDs consistent.

BG96 {
    compatible = "quectel,bg96"

    cellular info {
        compatible = "quectel,bg96-cellinfo";
    };

    sms {
        compatible = "quectel,bg96-sms";
    };

    dtm {
        compatible = "quectel,bg96-dtm";
    };

    gnss {
        compatible = "quectel,bg96-gnss";
    };
};

The summed up opposing opinions to this solution at the time was:

The last opposing opinion sparked the research into multiple APIs pr device node.

The runtime API lookup solution Aug 9 2022 https://github.com/zephyrproject-rtos/zephyr/pull/48817

This solution pulled out the API pointer from the struct device, and placed it in an iterable section based on API type along with a pointer to the device struct it belongs to.

Device instantiation before:

struct i3c_driver_api;
DEVICE_DT_DEFINE(node_id, ..., &i3c_api);

Device instantiation after:

struct i2c_driver_api;
struct i3c_driver_api;

DEVICE_DT_DEFINE(node_id, ..., NULL)
I2C_API_DT_DEFINE(node_id, i2c_driver_api)
I3C_API_DT_DEFINE(node_id, i3c_driver_api)

The iterable sections would contain arrays of the following:

struct device_api {
    void *api;
    struct device *dev;
};

and the API wrappers would be rewritten to look through the section containing all devices which support the specific API, comparing the provided struct device with the one in the iterable section, to invoke the correct API implementation for the device.

The opposing responses to this solution at the time:

The multi API device model solution take 1 Aug 2022 https://github.com/zephyrproject-rtos/zephyr/pull/48817

This solution creates a single device struct and names it according to its api type, moving away from the generic DEVICE_DT_DEFINE(node_id, ..., api_ptr) to DEVICE_DT_API_DEFINE(node_id, ..., api_type, api_ptr) and DEVICE_DT_GET(node_id) to DEVICE_DT_API_GET(node_id, api_type).

This allows for multiple device instances pr device node, since a device instance is accessed using both the node and the api_type.

struct device *i2c_dev = DEVICE_DT_API_GET(node_id, i2c);
struct device *i3c_dev = DEVICE_DT_API_GET(node_id, i3c);

Note this solution doesn't work well with the PM subsystem as all devices which belong to the same node share the the same PM sub context. pm_action(i2c_dev, ...); is equal to pm_action(i3c_dev, ...);

All instantiated devices would also be placed in iterable section by api type with this solution.

There where no relevant opposing responses to this solution at the time, just some notes:

The multi API device model solution take 2 Aug 22 2022 https://github.com/zephyrproject-rtos/zephyr/pull/49374

This solution is identical to the first take, however, this was the full implementation of it. It is split into multiple smaller issues which we will go through now.

Where to declare API support and generate macros take 1 Aug 2022 https://github.com/zephyrproject-rtos/zephyr/pull/49374

The first solution was to use create a new top-level key in the bindings files in dts/bindings named api:. The required macros for the multi API device model where then generated by edtlib.py.

This solution also identified that it was not necessary to place devices in iterable sections at compile time (presuming all devices where declared in the devicetree) since we could generate foreach macros instead, greatlyr reducing complexity and allowing for compile time checks for if a device supports a specific API, or if any device supports a specific API, which could be used to auto exclude shells and subsystems which only work if those devices exists, and selecting the appropriate API when there are multiple potential APIs supported for a device, fx sensors often support both I2C and SPI, the macros could be used in place of the bus: property.

The opposing responses to this solution at the time:

Where to declare API support and generate macros take 2 Aug 2022 https://github.com/zephyrproject-rtos/zephyr/pull/49374

The properties files where invented, YAML files that extend the properties which can be defined for a device using the device compatible to link the properties in it with the devicetree nodes and bindings files.

The supporting responses to this solution

The opposing responses to this solution

At this point, the properties files are pulled out into their own PR.

Back to the multi API device model Sep 2022 https://github.com/zephyrproject-rtos/zephyr/pull/49374

The multi API device model using the properties files is discussed at this point. This section will look past the properties files, which have their own section later. As a response to this PR, the following solutions where presented / discussed:

The runtime API lookup solution with added flag for multi API devices

The solution treats single and multiple API devices differently, by using the api pointer inside the struct device if only one API is supported, and using the API lookup if more than one API is supported for said device. This differs from the first runtime lookup solution as this would aim to remove the api ptr from the struct device, which means every device would be treated equally.

This solution solves

This solution requires

Issues with this solution

Using device_api structs instead of device struct clones

The seconds solution was to use the multi API device model proposed in the PR, but changing device instance struct to the following:

struct device_api
{
    void *api;
    struct device *dev;
};

This would save ROM because only two extra pointers are added for each API (and one removed from inside the struct device), including the "primary" one, instead of the entire struct device.

This solution solves

This solution requires

Issues with this solution

Using the originally proposed MFD solution

The original solution was brought up again as a possible solution. Issues and requirements for solution will be repeated here.

This solution solves

This solution requires

Issues with this solution

Using the multi functional device model in the PR

This solution solves

This solution requires

Issues with this solution

Properties files PR Sep 2022 https://github.com/zephyrproject-rtos/zephyr/pull/50441

The supporting responses to the properties files on their own:

The opposing responses to the properties files on their own are:

bjarki-andreasen commented 2 years ago

Proposed ideal solution 1

Proposed solution

Using the api_device structs model, placing the api / device type information in the bindings files, generating macros using gen_defines.py / edtlib.py.

Supporting remarks

Requirements

Further arguments

The api: key in the bindings files is nearly identical to the Linux device_type: property, we could even call it device_type, but it is not as clear as api:. So there is precedent for having that information in the devicetree bindings files. There is even a for_each_node_by_type() function nearly identical to the proposed DEVICE_DT_API_FOREACH(fn, api_type) source: https://elinux.org/Device_Tree_Linux#device_type_property

The update of DEVICE_DEFINEs can be done gradually, and only require changing the api_ptr argument, which allows for an easy transition to this model.

The PM device subsystem is already remarked as being flawed and will be updated in the future in any case. For single API devices, the PM subsystem works as expected, it also works with multi API devices, but it is awkward.

Example

Bindings file

compatible: st,uart
api: uart

Device driver single API

static uart_driver_api api;
DEVICE_DT_INST_DEFINE(inst, ..., DEVICE_API(&api, uart));

Device driver multi API

static i2c_driver_api i2c_api;
static i3c_driver_api i3c_api;
DEVICE_DT_INST_DEFINE(inst, ..., DEVICE_API(&i2c_api, i2c), DEVICE_API(&i3c_api, i3c));

Application

const struct api_device i2c_dev = DEVICE_DT_API_GET(node_id, i2c);
const struct api_device i3c_dev = DEVICE_DT_API_GET(node_id, i3c);
gmarull commented 2 years ago

The MFD devicetree parent/child solution July 2022 #47655

This solution to the problem is simple, just add every function or feature which requires its own API, be it an actual device or just a device struct being used as a handle, to the devicetree. Then instanciate a single driver for the top parent node which instanciates device instances for all features (child devices).

The issue proposed design rules and some macros which would help to keep the design and implementations of MFDs consistent.

BG96 {
    compatible = "quectel,bg96"

    cellular info {
        compatible = "quectel,bg96-cellinfo";
    };

    sms {
        compatible = "quectel,bg96-sms";
    };

    dtm {
        compatible = "quectel,bg96-dtm";
    };

    gnss {
        compatible = "quectel,bg96-gnss";
    };
};

The summed up opposing opinions to this solution at the time was:

  • This solution is already in use in-tree

It is used e.g. by STM32 timers (specific functions of the timer are represented as child nodes: pwm, qdec, etc. Same approach is used in Linux, so we're not deviating here.

  • We should consider if the current device model is sufficient for MFDs moving forwards.
  • This solution deviates from the linux driver design as they support multiple APIs pr node.

The main problem I see in Zephyr really is that we use DT node identifiers (node labels, paths, aliases, etc.) to obtain struct device instances. This forces us to have a 1:1 relationship between DT nodes and struct device objects. This is the main difference with Linux, I guess. It's not really a problem, but a design choice that has important consequences on how we use Devicetree. For example, node labels are used in DT to reference to a node, e.g. in pwms = <&pwm1 0 ...>; but that's it. In my view, the fact we use them to obtain software devices means we're in practice using DT as a compile time source of software configuration.

  • features like sms and dtm are not actual devices, they are purely software, so they should not be in devicetree as nodes

They could also be seen as hardware blocks of the modem (some definitely are, like GPIO), but it's true that some cases may be at the edge. But as I said above, we don't use Devicetree as Linux does, and I think we need to be pragmatic in the end. Is it worth breaking the whole device model? Or is it better to allow for certain exceptions in the case of MFDs (allow for Zephyrisms)?

I solved the GD32 RCU case this way:

https://github.com/zephyrproject-rtos/zephyr/blob/e348fe494a21aad2bfd5744e0e2ce57262a24651/dts/arm/gigadevice/gd32f4xx/gd32f4xx.dtsi#L39-L55

RCU is a HW block that has both reset and clock control registers. I could just describe the RCU in DT (as ST does on Linux with RCC, same/similar IP). But to my understanding, it's not inaccurate to add a couple of sub-nodes to represent the reset and clock control blocks.

The last opposing opinion sparked the research into multiple APIs pr device node.

I get the point, but I think we need to be careful on breaking the device model for what it seems to be an edge case that has, in my view, a viable solution with the existing model.

bjarki-andreasen commented 2 years ago

@gmarull Sorry, but could you copy your comment to this discussion https://github.com/zephyrproject-rtos/zephyr/discussions/50641 ? I have created a copy in the discussions section instead of the issues section. I tried to close this issue but it did not work it seems :)

pdgendt commented 2 years ago

FYI; there is a "Convert to discussion" action for issues.

bjarki-andreasen commented 2 years ago

FYI; there is a "Convert to discussion" action for issues.

Ha, of course there is :)