Open tbursztyka opened 4 years ago
CC @galak @pabigot @nordic-krch @erwango @MaureenHelm @nashif @anangl @henrikbrixandersen @mnkp @mbolivar
One more known (but relatively minor) issue that affects kernel footprint:
The model of placing function pointers in an API struct defeats gc-sections, as the functions will always have at least a reference in that API struct even if not referenced elsewhere. So if a driver is built, all of its API members will be pulled into the binary (and whatever they transitively pull in) even if none of them, or only a subset of them, are invoked at runtime.
At the moment I have no proposals to remedy this, but thought it worth noting.
@tbursztyka: For Power Management you will need more complex dependency tracking, as you have several types of relation between devices. Examples:
@tbursztyka thank you for taking that initiative. Few comments:
I would add following requirements/topics:
No support for a timeout on each and every device API call
I assume that this applies only to synchronous calls.
@andrewboie Indeed, very good point. I wonder if an ld script could check if a device instance is being used, besides its presence in the device init section, and would give a warning about it. Though the variable name of the device pointer would probably be not really user friendly (it could be generated from the ZID, thus an indirect way to relate to DTS and then to the driver it belongs). But that sounds convoluted, there is probably a better way.
@pizi-nordic Most of this has to be solved on the DTS side. The goal is to embed as little information as possible. Note that a child node may have 1+ parents (an SPI master and a GPIO in the case of an SPI device using a GPIO CS, for instance). The hierarchy of nodes would have to be generated by DTS. On the DTS side, a clock source would need to expose whether it can be stopped (or under-clocked etc...); if it's fixed, there is no need to make it a parent. As for clock gates, I'd prefer the driver to embed that info directly, though I have not thought much about that issue. It could be made generic, let's see.
@nordic-krch I still have to take some time on your proposal 1. I am not yet convinced this requires a generic point into device. It would need more use cases, I think. About point 2, yes that's a very good one. On DW IP blocks for instance it may share spi and i2s. It is indeed a common use case. However, we have to come up with a transparent solution, i.e. something that would not require the user to do anything. A partial solution is the device lock mechanism: in the case of 2+ devs sharing the same resource, such a lock will have to be unique for these 2+ devs. But then indeed we need to know: "is dev_x initialized?" and "since I just got the lock from dev_2, how to de-init dev_1 and init dev_2?" and "if I just used dev_2 last time, and reuse it again... I don't need to re-initialize it then?" etc... Point 3 seems valid as well, and it crosses point 2: the way to know if a device has been initialized or not, and thus reacting transparently based on that info.
No support for a timeout on each and every device API call
I assume that this applies only to synchronous calls.
Can this be explained in more detail?
The danger here is that you are pushing device driver specific knowledge up the stack and this becomes unmanageable very quickly, i.e. consider a driver API call that polls on a HW state change. It could poll forever, but it's the responsibility of the driver (and not the core) to manage this timeout (and return -ETIMEOUT or whatever up the stack).
@lgirdwood There is nothing the core has to do in this case. The idea is to bring a common object into struct device that drivers can then use in a normalized way, instead of letting each and every device driver reinvent the wheel on its own.
Adding a couple of ideas:
Expanding from the line of thought from @pizi-nordic, it'd be nice to have a concept of power domains in DTS. On TI SoCs and likely other ones out there, devices are grouped into power domains, and only when all devices in the domain are not needed then the entire domain can be shut off.
Another use for "device uninit" is to employ it as a way for the application to say it doesn't intend to use a device for a long time, so the device can be shut off. With several device drivers currently the driver can only assume a device is not needed whenever there is no on-going data transfer, in which case it'd just turn on and off the device around the "transfer" function for example. But if the application wants to do back-to-back transfers, this can result in an unnecessary large overhead. If there were a "device uninit" then the driver would have the option to turn on the device during init and shut it off when "uninit" is called.
@vanti the 1 is indeed a good point, however let's put it aside for the moment as it is a DTS/PM-only issue (there are a lot of issues with PM besides the dependency thing I mentioned. To put it mildly: it's a mess.). I think once we get the base fixed - issue 6 and 1 - we will be able to go further and fix PM completely. About 2, such a use-case sounds just like suspend/resume to me. So I still don't see a proper use-case for "exposing" an uninit function. Note that "un-initializing" a device won't necessarily mean it will be shut down, from a power consumption perspective. And again, un-initializing a device which would be the parent of other devices... It's the issue 1.
@tbursztyka, updated PR #10885 in case somebody wants to play around with code generation.
@b0661 Sure, I am evaluating it along with other solutions.
deleted two duplicate comments, for the curious
So I get how iteratively enabling parents would be nice but iteratively disabling children seems sort of overkill. Wouldn't listing parents but only ref counting children be enough for pm?
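To make the question concrete, here is a minimal sketch of "list parents, ref-count children". It assumes a single parent per device (the thread notes a node may have more than one) and a flat device table; `struct pm_dev`, `pm_get()` and `pm_put()` are hypothetical names for illustration, not existing Zephyr API.

```c
#include <stdbool.h>
#include <stdint.h>

#define PM_NO_PARENT UINT8_MAX

/* Hypothetical per-device PM record: one parent link, one counter. */
struct pm_dev {
	uint8_t parent;   /* index of the parent in the table, or PM_NO_PARENT */
	uint8_t users;    /* active users: the application plus active children */
	bool powered;
};

/* Take a reference; the first user powers the parent chain, then the device. */
static void pm_get(struct pm_dev *tbl, uint8_t idx)
{
	struct pm_dev *dev = &tbl[idx];

	if (dev->users++ == 0) {
		if (dev->parent != PM_NO_PARENT) {
			pm_get(tbl, dev->parent);   /* parent first */
		}
		dev->powered = true;
	}
}

/* Drop a reference; the last user powers the device off and releases the parent. */
static void pm_put(struct pm_dev *tbl, uint8_t idx)
{
	struct pm_dev *dev = &tbl[idx];

	if (dev->users > 0 && --dev->users == 0) {
		dev->powered = false;
		if (dev->parent != PM_NO_PARENT) {
			pm_put(tbl, dev->parent);
		}
	}
}
```

With this shape a parent never needs to know who its children are: it stays powered for as long as its counter is non-zero, and no child list has to be walked on suspend.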
Another point regarding deferred init etc:
Perhaps these shortcomings are not strictly a matter of the API lacking features. It could be argued that it's more a matter of what's considered idiomatic. If drivers were in general structured like this:
int foodev_init(struct device *dev)
{
	// init mutexes, reserve memory, other setup-type tasks
#ifndef CONFIG_DEVICE_PM
	// this is NOT done in init() if CONFIG_DEVICE_PM=y
	foodev_do_poweron(dev);
#endif /* CONFIG_DEVICE_PM */
	return 0;
}

int foodev_pm_control(struct device *dev, u32_t ctrl_command,
		      void *context, device_pm_cb cb, void *arg)
{
	// ...rather, it's done here:
	if (ctrl_command == DEVICE_PM_SET_POWER_STATE) {
		foodev_do_poweron(dev);
	}
	return 0;
}
then all "real" inits would be deferred until the user actually requested the device to be powered on. But NO (upstream) drivers are written like this to my knowledge.
Ofc, supporting this idiom would be a lot easier if the device could programmatically access its direct device dependencies...
Providing device dependencies is not difficult: we already have them, we just haven't emitted them yet, perhaps because we haven't identified the canonical representation of a device handle. (The address of the struct device would work, but that's a lot of space when we could use a u8_t or u16_t handle value to identify dependent and depending nodes in any reasonable Zephyr application.)
While we can provide the dependency information for device tree nodes, that doesn't help with coordinating dependencies between devices and other initializers (#23407), whether or not those are also updated to be based on dependencies rather than hard-coded level plus configurable priority values (which I'm concerned can produce hard-to-identify failures if the priorities are mis-assigned).
I have some ideas about how to do this but it involves either grotesque linker script hackery; a special link that produces data that is processed to produce a dependency tree that gets embedded into the application; or both. I don't think it's a viable path.
Aside: Maybe we should fork off a separate area to discuss specifically how to deal with initialization ordering, as there are probably other things related to this overview issue that will be obscured if we discuss too much here.
- Each device is not constant (same for their config part): It was meant to be constant, but due to a hack on device's API setting (in case of init failure it is set to NULL at runtime) it is never constant, thus ends into RAM instead of staying into the ROM. See #24873
Based on experience from #25208 I don't believe it will be possible to switch to using const struct device * everywhere: too many APIs pass or store device pointers in void *, which requires introducing casts that remove the const qualifier. Comments to focus on:
- const removal
@pabigot if the dependencies are only about: 1. getting the last level's parent(s) of a given struct device, which will be used in device runtime pm where the number of referencing children will be counted at runtime; 2. constructing an ordered list where each parent is in front of its children (devices will be suspended following this list and resumed in the converse order for the case of devices following system states), would that simplify the implementation?
would that simplify the implementation?
@wentongwu That's about what I'd assumed we'd do, though dependencies are static (I don't see why they would be dependent on power state). The blockers are representation of device handles, and how to manage dependencies between devices and non-device functions. The latter may need to be done manually in the driver source.
@pabigot Thanks, but what's the meaning of "the representation of device handles"? I have also seen your comment https://github.com/zephyrproject-rtos/zephyr/pull/24514#issuecomment-637208934. A new version of the pm subsystem is expected in the 2.4 cycle, but first we should provide a solution for the device dependencies. Above I just gave the two possible use cases for device dependencies from the pm perspective, to see whether supporting only these two use cases, instead of the whole dependency graph/tree, can simplify the implementation.
Not sure if this has been discussed yet, but one thing I've run into is what to do about MMIO addresses, specifically if the MMIO address can only be known at runtime in certain configurations. So far this can happen in two scenarios:
I think we need a generic solution to store these MMIO ranges in RAM if they can only be known at runtime (i.e. in driver_data) , but otherwise keep them in ROM (in config_info) so we don't waste RAM.
I think a good solution for this would minimize the amount of ifdefs needed in code, and provide some generic interface for obtaining a named MMIO base address given a driver object, without having to know whether the value lives in driver_data or config_info.
This is just a rough sketch, but maybe something like a macro DEVICE_GET_MMIO(device, member_name) where it would return either device->config_info.member_name or device->driver_data.member_name appropriately, and macros to declare these members, one of which will always compile to nothing:
struct foo_device_config {
	DEVICE_MMIO_ROM(uint8_t *, member_name);
	...
};
struct foo_dev_data {
	DEVICE_MMIO_RAM(uint8_t *, member_name);
	...
};
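One way the rough sketch above could be fleshed out is shown below. Everything beyond the three macro names is an assumption: the simplified `struct device`, the `MMIO_AT_RUNTIME` switch (standing in for a Kconfig option), the extra type arguments to `DEVICE_GET_MMIO()` and the `regs` member. Unlike the sketch above, the declaration macros carry their own semicolon, so that the one compiling to nothing does not leave a stray `;` inside the struct.

```c
#include <stdint.h>

/* Simplified stand-in for Zephyr's struct device (assumption). */
struct device {
	const void *config_info;   /* constant part, can live in ROM */
	void *driver_data;         /* mutable part, lives in RAM */
};

/* Hypothetical switch: define MMIO_AT_RUNTIME when the base address is
 * only known at run time. */
#ifdef MMIO_AT_RUNTIME
#define DEVICE_MMIO_ROM(type, name)                /* compiles to nothing */
#define DEVICE_MMIO_RAM(type, name) type name;
#define DEVICE_GET_MMIO(dev, data_t, cfg_t, name) \
	(((data_t *)(dev)->driver_data)->name)
#else
#define DEVICE_MMIO_ROM(type, name) type name;
#define DEVICE_MMIO_RAM(type, name)                /* compiles to nothing */
#define DEVICE_GET_MMIO(dev, data_t, cfg_t, name) \
	(((const cfg_t *)(dev)->config_info)->name)
#endif

struct foo_device_config {
	DEVICE_MMIO_ROM(uint8_t *, regs)
	uint32_t clock_hz;
};

struct foo_dev_data {
	DEVICE_MMIO_RAM(uint8_t *, regs)
	uint32_t irq_count;
};

/* Driver code is identical in both configurations. */
static uint8_t *foo_regs(const struct device *dev)
{
	return DEVICE_GET_MMIO(dev, struct foo_dev_data,
			       struct foo_device_config, regs);
}
```

The driver body contains no ifdefs; only the macro definitions change with the configuration.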
API meeting 16th June 2020
Two options proposed:
1. [...] struct device * parameters to const struct device * instead
2. [...] struct device non-const
The issues with 1.:
- All calls to device_get_binding() would now return a const pointer, requiring a change in the implementation of many applications
- Device drivers that use struct device * to pass it as a context to threads or HAL functions would have to down-cast from const struct device * to void * or store a pre-cast pointer to a non-const struct device *
struct dev {
	struct device *d;
};
or
void *device_wrap(const struct device *dev)
{
	return (void *) dev;
	/* or: return (void *) dev->meta.non_const_dev_ptr; */
}
const struct device *device_unwrap(void *p)
{
	return (const struct device *) p;
}
with an instance of that somewhere.
@tbursztyka suggests using const struct device *dev everywhere but storing a pointer to the struct device in the driver_data structure for drivers that need it.
For dynamic allocation, knowing if the instance is between the _start and _end of the driver linker section would be enough, in case the device driver actually needs to know whether it's in RAM or ROM.
Device drivers should not need to know whether the struct dev instance is in RAM or flash, ever.
@mnkp There's a difference in performance/code size for both solutions. 2. requires an additional dereference for all operations in device drivers that include looking up anything non-mutable in the device structure.
https://github.com/zephyrproject-rtos/zephyr/issues/26072#issuecomment-645006182 has a summary of the impact of #25208 versus #26127 and a requested comparison (with limitations discussed there).
Some precision from my side:
The issues with 1.:
- All calls to device_get_binding() calls would now return a const pointer, requiring a change in the implementation of many applications
I don't think we nacked all the huge API reworks (gpio, adc, spi, ... and on going) we have got so far, on the fact it would require changes in the implementations of many applications. So to me, this is a false argument. Moreover in this case, it's really a small change.
- Device drivers that use struct device * to pass it as a context to threads or HAL functions
Again that's not a big change. Only a fraction of drivers would require that, not even mentioning threads. The same answer as above applies here as well.
About the numbers given by @pabigot, the number of files being touched does not matter as soon as one solution is clearly better than the other one (in this case, const dev is really the one to pick). It would indeed matter if they were fully comparable. The gain in RAM/ROM might not look tremendous, but it is already a relevant one. It certainly does not play any role on boards with 1Mb and more, but it does on all the ones that have less.
Last but not least, part of the gain made by such a rework will be useful to improve the features provided to drivers (such as #24511 for instance, which is not going to eat up all that gain since many drivers currently implement such a solution on their own; it's just a matter of normalizing the behaviors under a central API).
I've rebased #25208 on the same master commit as #26127 to improve comparability. Because #25208 is closed I can't push an update, and I don't want to submit another one, so if you want to reproduce the results you'll need to pull specific commits from my repository. The links and specific commits used are documented in the results. (Also the rebase, unlike the original, does not attempt to fix every piece of code that produces warnings.)
See https://gist.github.com/pabigot/d63ddfa24a30e7a95e902fe986cbbd89 for the script and latest results.
- zephyr-v2.3.0-309-g999c59c1ec627a6fff7db23b: base
- zephyr-v2.3.0-319-geb7937b69d7c096b3be2b153: const ref, https://github.com/pabigot/zephyr/commits/nordic/issue/22941a
- zephyr-v2.3.0-322-g046ceb3a1405336827bc7200: sub-object, https://github.com/pabigot/zephyr/commits/nordic/20200611a
While I do not dispute that transition to const struct device * provides the maximum savings in RAM, my concern remains the impact of this change on existing in-tree and out-of-tree code. This is not a small change, as you can see if you look at what had to change in the samples to eliminate compiler warnings.
Another potential resolution proposed in the 2020-05-14 dev-review is to put all the immutable data directly into struct device, but rely on the linker scripts to put the .device_PRIO section into non-volatile memory.
This would mean all existing API using struct device * would be unchanged, but we'd not use RAM to store the referenced objects. Best of both worlds.
But it would also eliminate any chance of language-level detection when somebody writes through the pointer.
On many targets such a write would silently do nothing. On some it might cause a bus fault. On others where RODATA is stored in a memory technology that is not write-protected (e.g. FRAM) it could be catastrophic.
It's still an option.
rely on the linker scripts to put the .device_PRIO section into non-volatile memory.
That was always the plan so far, to force the device instances to be in ROM (and no longer in this RAMABLE_REGION, ROMABLE_REGION as it is right now).
But it would also eliminate any chance of language-level detection when somebody writes through the pointer.
That's a bad idea, you really want to know at compile time if you are using the APIs and the objects as they are meant to be used.
Dev Review 18th June:
Original reason for moving the immutable part of the device information to ROM:
Reasons for passing around const struct device * instead of struct device *:
- [...] struct device * are not to modify the structure directly. Instead, a common device driver infrastructure with a defined API would allow device drivers to use the common mechanisms for all, without requiring drivers to provide their own.
Options for a way forward:
@nashif: Should we focus on finding other places in the code that are better candidates to save RAM and ROM instead?
@pabigot: I'd rather stay with option 1 for now given what we've seen
@nashif: We should be very careful about taking a decision like this because we've gone through issues like this before. The future needs to be considered.
@jfischer-phytec-iot: if we do it we should do the subobject, and we should probably do it
@nashif This information needs to be summarized and presented to a wider audience. Why are we doing this, what is the cost (based on which solution) and then what is the gain in the present and for the future. In particular this should be decided in the TSC.
I've updated https://gist.github.com/pabigot/d63ddfa24a30e7a95e902fe986cbbd89 with new results comparing the alternatives. I'd missed the commit that changed the linker scripts to place the device list in ROM instead of RAM. This was obscured because size(1) does not accurately capture the true memory usage. Instead I've switched to using the memory region summary produced by the linker and displayed when things are built.
The overall conclusions do not change, but the numbers are more justifiable.
Maybe a totally different take, and related to #30105: if redoing the device model, it would be great to avoid the function pointer table altogether in many/all cases, perhaps by using _Generic and driver-specific types behind a macro wrapper like i2c_transfer(), which is about as good as we can get in C for static dispatch. It might let the linker GC more function symbols and reduce the size of the ELF, along with other performance benefits which we care about. Additionally, since we know from DTS all the drivers that might use a particular API, the macro wrapper could perhaps be code-generated rather than maintained by hand.
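A minimal sketch of what such a wrapper could look like with C11 `_Generic`. The two driver types and their transfer functions are hypothetical placeholders, and the `i2c_transfer()` macro here is not the existing Zephyr function of that name.

```c
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical driver-specific device types (not Zephyr API). */
struct i2c_foo { uint32_t bytes_sent; };
struct i2c_bar { uint32_t bytes_sent; };

static inline int i2c_foo_transfer(struct i2c_foo *dev,
				   const uint8_t *buf, size_t len)
{
	(void)buf;
	dev->bytes_sent += (uint32_t)len;
	return 0;
}

static inline int i2c_bar_transfer(struct i2c_bar *dev,
				   const uint8_t *buf, size_t len)
{
	(void)buf;
	dev->bytes_sent += 2U * (uint32_t)len;   /* pretend: different framing */
	return 0;
}

/* The callee is selected at compile time from the static type of dev:
 * no API struct and no function pointer are involved. */
#define i2c_transfer(dev, buf, len)                  \
	_Generic((dev),                              \
		struct i2c_foo *: i2c_foo_transfer,  \
		struct i2c_bar *: i2c_bar_transfer)((dev), (buf), (len))
```

A call such as `i2c_transfer(&foo, msg, sizeof(msg))` then compiles to a direct call to `i2c_foo_transfer()`. The cost is that the caller must hold a driver-specific pointer type, so a generic `struct device *` can no longer be passed around for this API.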
One of the features sorely lacking from Zephyr is a controlled, reverse dependency order shutdown of the RTOS. One use case for this is when shutting down a multi-threaded instance of MCUBoot and transitioning to another executable (which could even be a different instance of MCUBoot running from RAM instead of Flash). The hardware state during this handoff from one executable to another needs to be well controlled, and not necessarily inactive (e.g., some GPIO hardware or graphics display buffers may need to stay active, a hardware watchdog may need to stay running). For a scalable, composable solution, this will need to be registration-based, perhaps as an extension to SYS_INIT().
Adding reference to new discussion thread on metadata tracking - https://github.com/zephyrproject-rtos/zephyr/issues/49431
API meeting:
- [...] spi_*(), i2c_*()) (including Power Management) using the same struct device * (or equivalent context) in each call
	struct device *dev = DEVICE_GET_*();
	spi_transceive(dev); i2c_write(dev);
- To be able to list all devices of a specific class at compile-time
I have summarized the discussions and conclusions regarding the last months of discussion regarding MFDs and the device model into an issue. It is a 2-3 minute read approximately. https://github.com/zephyrproject-rtos/zephyr/issues/50621
Hi @galak, @gmarull, @ceolin,
This issue, marked as an Enhancement, was opened a while ago and did not get any traction. It was just assigned to you based on the labels. If you don't consider yourself the right person to address this issue, please re-assign it to the right person.
Please take a moment to review if the issue is still relevant to the project. If it is, please provide feedback and direction on how to move forward. If it is not, has already been addressed, is a duplicate, or is no longer relevant, please close it with a short comment explaining the reason.
@tbursztyka you are also encouraged to help moving this issue forward by providing additional information and confirming this request/issue is still relevant to you.
Thanks!
(This issue is a replacement of #6293 )
A deep revise and upgrade of the device driver model
Throughout its short existence, Zephyr has seen its kernel reworked, stabilized and made reliable with sound and concise APIs and objects. All still being designed within the scope of running on tiny targets, such as arduino_zero with 32k of RAM or bbc_microbit with its 16k of RAM for instance, but also providing enhanced features for fully fledged industrial boards with much more memory and CPU power.
Next to the kernel, the device driver ecosystem has experienced a somewhat different development. The device driver APIs have been improved to cover most of the identified generic use-cases, and are still being reworked, sometimes deeply refined, to address new requirements. Surprisingly however, the core of this ecosystem, the device driver model, has gained little to no improvement at all. That does not make it free of issues, far from it actually. In fact, some of the issues, sometimes known from the beginning, have been a drawback for implementing advanced features. Worse, as time goes on with the device driver ecosystem growing every day, these issues will tend to become more and more tedious to fix. This is not to say there are no attempts to fix some of the known issues; there are. But I believe they are neither generic nor concise enough, though some are definitely interesting test beds.
Hence a list of the known issues and a proposal to address these.
Identified issues
[X] (1) Power Management is broken: The lack of dependency tree support among devices makes it unusable. The burden of managing client devices and their parents within the power management context is pushed to the user which has been proven to be tedious, unreliable and thus made the PM subsystem nearly - if not fully - unused at all. Due to this most device drivers do not implement the PM API.
[ ] (2) No support for cancelling on-going device API call: In PM context or else, it might be relevant to cancel an on-going call on a device. Stopping and/or re-initializing a user thread would require such feature for instance. A caller could also decide to cancel its own calls. See #24511
[ ] (3) No support for a timeout on each and every device API call: Being generic to all device driver APIs, this is one of the important missing features. It may happen, either because the system is overloaded or because the hw is not responding, that a call never completes. This locks up the device on the call, as well as the caller waiting for something to happen. It also prevents the caller from mitigating any error in case the call is unsuccessful. See #24511
[ ] (4) No error storage when device initialization fails or gets into hw fault: In case of initialization failure, the device will just be inaccessible, and there would be no way to know what happened. In case of a run-time, non-blocking, hardware fault, only the user will get an error back. Localized logging cannot be sufficient in this case. See #23589
[ ] (5) No possibility to recover a device in case of initialization failure or hw fault: Such failure could be due to boot timing, temporary hw issue, cosmic rays, gremlins, etc... Currently, a device failing the initialization step will not be usable at all afterwards. One can decide to reboot, but if the boot sequence is the culprit, the device will never be recoverable. It should be possible to call init again. Either to fix such booting issue but also at run-time where recovering a device may require to reinitialize it (after a locking hw fault for instance). See #23589
[X] (6) Device instantiation: A device is uniquely allocated in device driver code through DEVICE_DEFINE() or DEVICE_AND_API_INIT() or DEVICE_INIT(). It requires as many calls of one of these macros as there are devices then. The link with DTS is non-existent unless the developer makes it happen using DTS generated config options, which is the current process.
[X] (7) Each device is not constant (same for their config part): It was meant to be constant, but due to a hack on device's API setting (in case of init failure it is set to NULL at runtime) it is never constant, thus ends into RAM instead of staying into the ROM. See #24873
[ ] (8) The device name: It is only used to get the device bindings at runtime and thus reduce boot performances and takes ROM/RAM for nothing. https://github.com/zephyrproject-rtos/zephyr/pull/49352
[X] (9) Nameless device created by SYS_INIT(): It does not make any use of the name, API, PM API and thus use too much memory for nothing. See #23407
[X] (10) device_pm structure: An obvious case of an under-optimized structure
[X] (11) Device's variable name: It is not generated from the device name (you cannot un-stringify at pre-processor level), thus requiring a hard-coded, unique and unquoted name. That issue is due to the allocation macros (see issue 6).
Other known issues already recorded:
[ ] (12) Device initialization order that respects device tree dependencies #22545 Related prep-work: https://github.com/zephyrproject-rtos/zephyr/pull/49318
[ ] (13) Adding support for multi-functional-devices #48934
Solutions
(Follow the same order as the issue listing)
PM dependency tree
(issue 6 needs to be fixed first)
DTS, as its name tells, knows every device and its parents/children that will exist in the build for a particular board/SoC. Thus it makes full sense to use this information to create the PM dependency tree. For instance, suspending SPI master device x will first suspend all its related SPI devices k, n and m before actually suspending x.
From a technical perspective, it's just a directed graph, so the algorithms for managing it are all well understood. From an implementation point of view, however, it's very important to keep in mind that such a graph could be greedy in memory, depending on how many devices are present and, more importantly, on how many relationships exist among them.
Hence, a proposal here that could help to keep a tiny memory footprint:
Where such child array would be, for the SPI device x:
Note that device_<k/n/m/end> are not pointers, but the result of a simple operation:
and device_end would be:
That one would be used internally to know where the array stops.
All of it would be generated by DTS and the linker would do the rest. It takes half the space compared to full pointers.
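As an illustration only (this is not the code from the original proposal), a child array of 16-bit handles could look like the sketch below. It assumes a handle is simply the device's index in the device array and that `UINT16_MAX` marks the end of a list; `struct dev`, `dev_to_handle()` and `dev_suspend()` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t dev_handle_t;
#define DEV_HANDLE_END UINT16_MAX   /* assumed sentinel ending a child list */

struct dev {
	const char *name;
	const dev_handle_t *children;   /* handles, ended by DEV_HANDLE_END */
	int suspended;
};

/* Assumed conversion: a handle is the device's index in the device array. */
static dev_handle_t dev_to_handle(const struct dev *start, const struct dev *d)
{
	return (dev_handle_t)(d - start);
}

/* Suspend every child (depth first) before the device itself.
 * Returns the number of devices newly suspended. */
static int dev_suspend(struct dev *start, dev_handle_t h)
{
	struct dev *d = &start[h];
	int count = 0;

	if (d->suspended) {
		return 0;
	}
	for (const dev_handle_t *c = d->children;
	     c != NULL && *c != DEV_HANDLE_END; c++) {
		count += dev_suspend(start, *c);
	}
	d->suspended = 1;
	return count + 1;
}
```

On a 32-bit target each child entry costs 2 bytes instead of the 4 bytes a pointer would take, at the price of one index computation per access.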
A completely separate tree could also be created, not in device_pm structure as seen above. It would result in more CPU usage in tree traversal to retrieve if a device being suspended or resumed is in there as a child or a parent. However, note that any accessed device will require anyway to retrieve its potential parent(s). So as a device tree is unlikely going to be thousands of nodes anyway, such traversal could be fast enough. It's worth studying what is the best of all solutions here.
Storing the actual DTS tree itself in the binary is not the right path: it would take much more memory and would anyway require as much CPU power as above, if not more, due to comparisons being made on strings.
I believe the node id size, u16_t here, could be generated by DTS as well depending on the amount of devices, it could even go down to u8_t. But that would mean DTS knows about the boot services too, which are actually made of struct device and thus part of the same device array. It is not the case at the moment and impedes fixing issue 12 as well.
As noted in the introduction, some existing work could be reused to fix this issue. For instance, the on/off API is close to these PM requirements and thus its core could be partly back-ported into PM, but improved to handle various states and not only on or off. Mostly, the logic of waiting for all nodes to switch from on to off and conversely could be reused for PM state switches among all nodes.
Cancelling on-going device API call
Adding a generic device API function such as device_cancel_call(struct device *dev) could do the job. Would there be a necessity to check on caller's thread id, it would require to store such id on each call, perhaps. In any case, it would require device structure to hold a pointer on a device driver specific function to actually properly cancel a call and cleanup the hw (reinitialize it if needed etc...).
Adding a timeout on each and every device API call
Such a timeout would be a generic attribute of all devices, and so would be added to struct device. The non-trivial part is how to populate the timeout and with what value. The timeout value could be a fixed one, provided via Kconfig and/or DTS. But it could also be computed at run-time, for the bus-specific APIs for instance, where it is possible to evaluate rather precisely how much time a transaction would take, based on the frequency, word size, length of the whole transaction etc... For the APIs providing a timeout directly, the value would then be the one given via such API. Or more specifically, the default value would be superseded by such value.
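For the run-time computed case, the arithmetic is straightforward. A minimal sketch, assuming a serial bus that moves one bit per clock cycle; `bus_timeout_us()` and its safety-margin parameter are hypothetical and not part of any existing API:

```c
#include <stdint.h>

/* Hypothetical helper: worst-case duration of a bus transaction in
 * microseconds, rounded up, plus a safety margin given in percent. */
static uint32_t bus_timeout_us(uint32_t freq_hz, uint32_t word_bits,
			       uint32_t word_count, uint32_t margin_pct)
{
	if (freq_hz == 0) {
		return UINT32_MAX;   /* no clock: effectively wait forever */
	}

	uint64_t bits = (uint64_t)word_bits * word_count;
	uint64_t us = (bits * 1000000ULL + freq_hz - 1) / freq_hz;   /* ceil */

	us += (us * margin_pct + 99) / 100;   /* add margin, rounded up */
	return us > UINT32_MAX ? UINT32_MAX : (uint32_t)us;
}
```

For example, 256 words of 8 bits at 1 MHz take 2048 us on the wire; with a 50% margin the timeout becomes 3072 us.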
Perhaps such feature could be made optional, via Kconfig, for tiny targets.
Error storage when device initialization fails or gets into hw fault
Such an initialization failure needs to be stored, and so does a hw fault when one arises. These are separate errors and thus should have their own storage space. One bit for each would be sufficient, so a bit field would do. It might be worth storing the error as well (limited to either 8 or 16 bits?).
It could be worth studying whether the PM busy bit could be stored there as well, unless we generalize a lock on each and every device, reducing many device drivers' internal data storage as well as generalising the lock mechanism for concurrent access. The PM state could also be stored in the same bit-field to simplify things.
Usefulness of such error/state storage would be:
Again, such feature could be made optional, via Kconfig, for tiny targets.
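A minimal sketch of what such a bit-field could look like. The field names, widths and helper functions are assumptions for illustration, not a proposed layout:

```c
#include <stdint.h>

/* Hypothetical packed status word; names and widths are assumptions. */
struct dev_status {
	unsigned int init_failed : 1;   /* init function returned an error */
	unsigned int hw_fault    : 1;   /* run-time hardware fault occurred */
	unsigned int pm_busy     : 1;
	unsigned int pm_state    : 3;
	unsigned int last_error  : 8;   /* positive errno of the last failure */
};

/* Record an initialization failure together with its error code. */
static void dev_status_set_init_error(struct dev_status *st, int err)
{
	st->init_failed = 1;
	st->last_error = (unsigned int)(err < 0 ? -err : err) & 0xFFu;
}

/* A device in either error state is a candidate for calling init again. */
static int dev_needs_reinit(const struct dev_status *st)
{
	return st->init_failed || st->hw_fault;
}
```

The same word then answers both "what went wrong?" (issue 4) and "should init be called again?" (issue 5).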
Recovering a device in case of initialization failure or hw fault
As soon as issue 4 is fixed, it would be trivial to relevantly call device's init function based on the stored error/state.
Device instantiation
This is by far the biggest and most tedious feature. As noted earlier, this would pave the way to fixing most of the other important issues and would finalize the connection between DTS and the code by letting DTS auto-generate everything orbiting around the device concept. Also noted earlier, prototypes have been undertaken already but were too complex from a usage point of view. However, in PR #10885 one can see some examples, and it reaches a point where the developer would have very few things to do on the device driver side.
But it could be made even simpler. Something like:
Note that and could be removed once issue 12 would be undertaken. DTS would know that information, and thus would not require to get it from such device_declare() pseudo-code.
struct device and config part made constant:
That is trivial once issue 4 is fixed, as the api pointer would not be tampered with anymore. A Coccinelle script could help to check the whole code-base for compliance with this rule as well.
Getting rid of the device name:
That one is actually very easy, once issue 6 is fixed. Again, DTS could generate a header file such as _generateddevices.h which would enumerate each and every device instance such as:
And the (infamous) device_get_binding() would just be a macro, turning the "call" to device_get_binding(DT_LABEL_DEVICE_K) into device_k. It's in fact very trivial. A previous q&d patch I did back in the day was able to show off this feature, though based on Kconfig rather than on DTS (which had barely been introduced at that time). And it worked very well: the gain in ROM/RAM was interesting and the boot time was better as well, since device_get_binding() was no longer retrieving a device by its name.
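A minimal sketch of that macro approach. It assumes the generated label becomes a plain token instead of a string; the simplified `struct device`, the instance names and the macro definition are illustrative assumptions, not the content of an actual generated header.

```c
/* Simplified stand-in for Zephyr's struct device (assumption). */
struct device {
	const char *driver;
};

/* --- what a generated header might contain (hypothetical) --- */
extern const struct device device_k;
extern const struct device device_n;
#define DT_LABEL_DEVICE_K device_k   /* a token, no longer a string */
#define DT_LABEL_DEVICE_N device_n

/* Resolved by the preprocessor and the linker: no name table to store,
 * no string comparison at run time. */
#define device_get_binding(label) (&(label))

/* --- instances, normally emitted by the instantiation code --- */
const struct device device_k = { "spi" };
const struct device device_n = { "i2c" };
```

Application code keeps writing `device_get_binding(DT_LABEL_DEVICE_K)`, but a reference to a device that does not exist now fails at build time instead of returning NULL at run time.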
Nameless device created by SYS_INIT():
A simple struct device, and its components, refactoring would be sufficient. Again, that was already prototyped back in the days for the issue 8.
Optimizing device_pm structure:
Device's variable name
Once issue 6 is fixed, that would also directly fix this issue. See PR #20253, which attempts to generate such a unique id "ZID", though the PR goal is wrong as there is no need to make device init and device_get DT aware per se. Strongly connecting DTS to devices (variable name, instances etc...) would anyway make the whole device driver model DT aware. (See issue 6)
Device initialization order that respects devicetree dependencies #22545
That one would require also first to expose the software services to DTS. That's trivial imo. Then indeed sorting out the priority and levels out of the device tree would require some work on DTS side. Let's sort the details out directly in relevant issue.
Note that getting issue 6 fixed would greatly help that, in a sense no one would have to go all over the existing device drivers to fix the priority/labels there.
Final thoughts
I believe there is no proper way to fix most of these issues without deeply connecting DTS to the model. And as such, we should focus on fixing issue 6 (aka "instantiating devices via DTS code generation")