Open bjarki-andreasen opened 1 year ago
- This approach is limited to a 1-1 link between compatible and device driver, which is not compatible with drivers which support multiple devices.
This is not true, we can easily describe 1 driver to multiple dts compatibles in Kconfig, we do it today (see drivers/adc/Kconfig.ads1x1x
):
config ADC_ADS1X1X
bool "ADS1X1X driver"
default y
depends on DT_HAS_TI_ADS1013_ENABLED || DT_HAS_TI_ADS1014_ENABLED || \
DT_HAS_TI_ADS1015_ENABLED || DT_HAS_TI_ADS1113_ENABLED || \
DT_HAS_TI_ADS1114_ENABLED || DT_HAS_TI_ADS1115_ENABLED || \
DT_HAS_TI_ADS1119_ENABLED
One downside to the YAML file approach is we have to keep information in-sync information between the Kconfig and capabilities YAML file. If we shift this information to just the capabilities YAML than we introduce yet another "configuration" system.
- This approach is limited to a 1-1 link between compatible and device driver, which is not compatible with drivers which support multiple devices.
This is not true, we can easily describe 1 driver to multiple dts compatibles in Kconfig, we do it today (see
drivers/adc/Kconfig.ads1x1x
):config ADC_ADS1X1X bool "ADS1X1X driver" default y depends on DT_HAS_TI_ADS1013_ENABLED || DT_HAS_TI_ADS1014_ENABLED || \ DT_HAS_TI_ADS1015_ENABLED || DT_HAS_TI_ADS1113_ENABLED || \ DT_HAS_TI_ADS1114_ENABLED || DT_HAS_TI_ADS1115_ENABLED || \ DT_HAS_TI_ADS1119_ENABLED
Corrected in proposal
One downside to the YAML file approach is we have to keep information in-sync information between the Kconfig and capabilities YAML file. If we shift this information to just the capabilities YAML than we introduce yet another "configuration" system.
The YAML files do not configure anything. They declare that a device driver exists, which Kconfig symbol will include it, and what it will offer if it is included. If the device driver is changed to a degree where the declaration is no longer consistent with what the device driver offers, the declaration must be updated.
The only information from Kconfig which must be in-sync with the YAML files is the symbol which selects the device driver. This is a primary function of the YAML file, pointing out which specific Kconfig symbol will include the device driver.
There is a clear separation of responsibilities between YAML files and Kconfig. The YAML files declare device drivers and their capabilties. The capabilities of a device driver is determined by the source code. The source code is configured by Kconfig.
We already have a solution and pattern to the problem of having a driver enabled w/o devicetree node being enabled. That is what the depends on DT_HAS_TI_ADS1013_ENABLED
in Kconfig accomplishes. I'm sure there are some cases that still need to be added in Kconfig.
In general we should not be having multiple drivers for the same device. If this is really needed, Kconfig can describe the mutual exclusion between the drivers if needed.
We already have a solution and pattern to the problem of having a driver enabled w/o devicetree node being enabled. That is what the
depends on DT_HAS_TI_ADS1013_ENABLED
in Kconfig accomplishes. I'm sure there are some cases that still need to be added in Kconfig.
As mentioned in the issue, under Concerns and Unresolved Questions, this is expanded upon. The solution proposed with this issue also does the reverse check, is there a node which is enabled, for which any driver exists, which has not been enabled, and if so, notify the user of the existence of these drivers. The error reporting proposed in this solution is more helpful than causing a Kconfig dependency error as well.
In general we should not be having multiple drivers for the same device. If this is really needed, Kconfig can describe the mutual exclusion between the drivers if needed.
Having multiple drivers is beneficial for out-of-tree drivers, where you want to overwrite an existing driver with your own implementation. It is also useful for devices which can be interacted with in different ways, like a generic cover all driver, like MODEM_GSM_PPP, and a tailored driver, like MODEM_QUECTEL_BG9X. Turning it around, we should not disallow or prevent having multiple drivers which support the same devices either, it is a valid usecase.
In general we should not be having multiple drivers for the same device. If this is really needed, Kconfig can describe the mutual exclusion between the drivers if needed.
Hmm. I would not be surprised to see a chip vendor provide a licensed module with an enhanced device driver. Providing a configuration and build system that supports this seems to be consistent with the Project mission and vision.
select DEVICE_DRIVER -> Declares that this symbol enables a device driver
AFAICT, selecting select DEVICE_DRIVER only tells the config system that at least one device driver exists in the system. Won't that always be true outside of test cases? How is the DEVICE_DRIVER setting bound to this symbol?
select DEVICE_DRIVER -> Declares that this symbol enables a device driver
AFAICT, selecting select DEVICE_DRIVER only tells the config system that at least one device driver exists in the system. Won't that always be true outside of test cases? How is the DEVICE_DRIVER setting bound to this symbol?
In the current Kconfig output .config
, that is correct. This proposal adds a step to Kconfig.py (which has access to all individual symbols and their contexts) which creates a new output: selects.pickle
which is a dictionary containing all symbols which select any other symbol. For example:
This Kconfig symbol
config MODEM_QUECTEL_BG9X
bool "Enable quectel modem device driver"
select DEVICE_DRIVER
select DT_COMPAT_QUECTEL_BG95
select DT_COMPAT_QUECTEL_BG96
select DEVICE_DRIVER_API_NET_IF
becomes
selects = {
"MODEM_QUECTEL_BG9X": ["DEVICE_DRIVER", "DT_COMPAT_QUECTEL_BG95", "DT_COMPAT_QUECTEL_BG96", DEVICE_DRIVER_API_NET_IF]
}
in the new selects.pickle
output (pickle is just a object serialization format like JSON, but binary).
@bjarki-trackunit It seems to me that this proposal is changing the semantics of the 'select' keyword. If true, then @carlescufi's response in this #build-system message is appropriate here:
Ulf, the original author of Kconfiglib, was always very careful to match 1:1 the Kconfig language from upstream Linux. If we start deviating from it, we need a very good reason, because then we won't be using "Kconfig" anymore, but our own dialect.
I strongly prefer any Kconfig language deviations to be with new keywords (introducing easily recognizable extensions) rather than changing semantics (introducing inconsistencies).
@bjarki-trackunit It seems to me that this proposal is changing the semantics of the 'select' keyword. If true, then @carlescufi's response in this #build-system message is appropriate here:
Ulf, the original author of Kconfiglib, was always very careful to match 1:1 the Kconfig language from upstream Linux. If we start deviating from it, we need a very good reason, because then we won't be using "Kconfig" anymore, but our own dialect.
I strongly prefer any Kconfig language deviations to be with new keywords (introducing easily recognizable extensions) rather than changing semantics (introducing inconsistencies).
The semantics are not changed, select still selects a reverse dependency, we simply deduce meaning from the names of the selected symbols. Kconfig does not care if a selected symbol is named DEVICE_DRIVER, or ABCD, we can however reserve some symbol names and decide that they indicate some property.
We already do this with symbols like SERIAL_HAS_DRIVER and SOC_STM32U585XX, (just discovered TAINTEDBLOBS* as well). This solution essentially goes all the way with this approach by agreeing on a symbol naming convention that can be recognized by the build system.
The only change to Kconfig.py is just having it produce an extra file containing the symbol selections on a pr symbol basis.
@bjarki-trackunit do you have some code that implements at least a portion of this proposal?
P.S. The cryptic error messages are easily solved if you know that you can just search for the ord in generated headers (build/zephyr/include/generated/device_extern.h) and from there find the offending dts node.
@bjarki-trackunit do you have some code that implements at least a portion of this proposal?
So far I have simply made a quick and dirty addition to kconfig.py to export the symbols and their selections locally as a POC, if we agree on the overarching concept, using kconfig symbol names as metadata, we can go ahead with actual implementation in future PRs.
P.S. The cryptic error messages are easily solved if you know that you can just search for the ord in generated headers (build/zephyr/include/generated/device_extern.h) and from there find the offending dts node.
I would not say the error message is solved, you are describing a way of diagnosing the obscure error. The obscurity of the error is rooted in the "undefined reference to", and the fact that it is a compilation error.
The build system should ensure that the user has included all required drivers before compilation starts, and it should be able to help the user find an appropriate driver if it detects a device missing one.
This proposal provides a way of determining metadata from Kconfig, with that information, we can solve a host of issues with one system, instead of having many different bespoke solutions. The less rules, obscure errors and manual steps the user must recognize and perform to use the project, the better adoption will be.
I think a good check to implement would be to find a way to guarantee that all device tree nodes that are 'status = "okay"' are instantiated (have a corresponding driver included). I don't have an off the cuff suggestion yet on how to do this elegantly but one possible way is to simply have a pre-build check that goes over the generated zephyr.dts (final device tree) and over the full list of files included in the build and cross references device tree compatible strings to all DT_DRV_COMPAT defines found in the list of built files (Although I have a hunch that there could be a better way by using variables generated by device instantiation macros).
I would like to hear more input from others on an elegant way to implement this cross reference check.
I think a good check to implement would be to find a way to guarantee that all device tree nodes that are 'status = "okay"' are instantiated (have a corresponding driver included).
What problem are you trying to solve here? Having devictree nodes with status = "okay"
but no Zephyr driver instantiated is a perfectly valid configuration.
I think a good check to implement would be to find a way to guarantee that all device tree nodes that are 'status = "okay"' are instantiated (have a corresponding driver included). I don't have an off the cuff suggestion yet on how to do this elegantly but one possible way is to simply have a pre-build check that goes over the generated zephyr.dts (final device tree) and over the full list of files included in the build and cross references device tree compatible strings to all DT_DRV_COMPAT defines found in the list of built files (Although I have a hunch that there could be a better way by using variables generated by device instantiation macros).
I would like to hear more input from others on an elegant way to implement this cross reference check.
Crawling through code is not exactly an elegant solution IMO. DT_DRV_COMPAT is only used for drivers which only support one device compatible, some drivers support multiple compatibles. We also have many different macros which instantiate devices, NET_IF_DEFINE, DEVICE_DEFINE, DEVICE_DT_DEFINE etc. and even "worse" for this solution, we have macros, which contain DEVICE_DT_DEFINE(), meaning we need to run a preprocessor first. It is rather complex and will be hard to maintain.
The core of the solution in this proposal is to tell the build system about the drivers, in a way which keeps that information in one place without duplication. We can then use this information to implement the behavior we want.
Your example with status="okay" we can handle more elegantly if the build systems knows which drivers exists for that compatible, if any. A device can have the status okay without any driver, but if a driver exists, we can inform the user that he may want to enable one of the compatible drivers.
The solution should be helpful, without blocking the user, or requiring manual steps which could be automated, and the meta information should be provided by the driver developers, to help the end users (developer and end user are often the same people, but still :))
I think a good check to implement would be to find a way to guarantee that all device tree nodes that are 'status = "okay"' are instantiated (have a corresponding driver included).
What problem are you trying to solve here? Having devictree nodes with
status = "okay"
but no Zephyr driver instantiated is a perfectly valid configuration.
What is the logical sense of that? If one sets something to "status = 'okay'" then one almost expects the driver to automatically be enabled and included so that the node is in fact "okay". The fact that we need to enable the driver as well is almost a bit of a nuisance. But of course we can have multiple drivers that implement a compatible so I understand the necessity of the extra step of picking a driver - however, I would expect that having status = okay and no driver included in build should in fact generate an error.
@gregshue at this time I'm just brainstorming ideas and it's good to map out possible solutions and issues with each.
What about making the mappings from device tree bindings to device drivers? One could for instance link from "dts/bindings/watchdog/nxp,lpc-wwdt.yaml" to "drivers/watchdog/wdt_mcux_wwdt.c" and always in the same source tree. So for example, if one wants to add a custom driver, one would extend the existing binding in the directory tree of the custom driver and then the linking will search directories first in the extended yaml location, then in the location of any base bindings.
What are issues of linking dts bindings to one or more device driver implementations?
I think a good check to implement would be to find a way to guarantee that all device tree nodes that are 'status = "okay"' are instantiated (have a corresponding driver included).
What problem are you trying to solve here? Having devictree nodes with
status = "okay"
but no Zephyr driver instantiated is a perfectly valid configuration.What is the logical sense of that? If one sets something to "status = 'okay'" then one almost expects the driver to automatically be enabled and included so that the node is in fact "okay". The fact that we need to enable the driver as well is almost a bit of a nuisance. But of course we can have multiple drivers that implement a compatible so I understand the necessity of the extra step of picking a driver - however, I would expect that having status = okay and no driver included in build should in fact generate an error.
Not all nodes have drivers, even when the node is enabled, and some nodes are supported by multiple drivers, so we need the user to enable the one they want to use.
for instance link from "dts/bindings/watchdog/nxp,lpc-wwdt.yaml" to "drivers/watchdog/wdt_mcux_wwdt.c" ...
Consider that the link needs to still be meaningful when the driver implementation and/or filename changes. (E.g., when a file is refactored into separate files, when a driver supports some of the versions of a HW block.)
... and always in the same source tree.
For end users that are limited to reuse an upstream repository (no additional commits) fixing any mapping errors must be done through configuration settings and mappings in a different repository. If both repositories have parallel directory structures, is it the same source tree? 🤔
What are issues of linking dts bindings to one or more device driver implementations?
In addition to those above, consider:
@nashif @MaureenHelm The considerations I mentioned above are other examples of User Needs that need to be part of the conversation. Should I create issues for these as well?
@gregshue do you see any reason to ever have a block in device tree enabled (status = "okay") and not have a driver for it compiled in? I'm just curious what your take on this is. Here I'm talking only about nodes that have a compatible string (ie expect a driver to handle them).
🤔 ... Different users, different needs ... Is this good enough?
As a contracted designer of bitstreams for FPGA with embedded CPU cores, I need to generate and validate device tree files describing the memory-mapped blocks in the bitstream, so that my client's own firmware engineers have DT files ready for whenever they choose to implement drivers.
Appended NOTE: The bitstream itself can be verified outside of Zephyr, so the contractor may have no need to create drivers in Zephyr.
Introduction
Zephyr is missing documentation and build time validation of device drivers. This results in avoidable, obscure errors during development, and having to look through source code, Kconfig files and devicetree bindings files to determine the existence and capabilities of device drivers. This proposal adds a build step which uses predefined Kconfig symbols to determine device driver capabilities, which are used to validate device drivers early, providing helpful, descriptive errors, and for generating device driver documentation.
Problem description
An example devicetree fragment shows the modem
bg95
which needs a device driver which is compatible with"quectel,bg95"
Obscure errors during development
In this example, we have two drivers which are compatible with
"quectel,bg95"
,MODEM_GSM_PPP
andMODEM_QUECTEL_BG9X
.If we build an application without selecting either compatible driver, we get the obscure
error. If we select both, we get
These errors are obscure and they occur at the last stage of the build, costing time both in diagnosing the issue and processing the build itself.
Missing device driver documentation
If we search for the bg95 on the Zephyr documentations page, it will return only the bindings file, as it matches "quectel,bg95". Neither of the compatible device driver's Kconfig symbols MODEM_GSM_PPP and MODEM_QUECTEL_BG9X will be returned.
To consistently discover all compatible device drivers, knowing only the devices compatible string, we must go to the source code, search for all files which include the sanitized compatible "quectel_bg95", look for the CMakeLists.txt, files which include source code files, to determine which Kconfig symbol selects the individual device drivers.
To then determine which API the device drivers support, we must find the source file which includes the relevant DEVICE_DT_DEFINE(), and look to the type of the API struct passed to its API parameter. We can now search for the API on the Zephyr documentation page.
Proposed change
Use Kconfig select feature to select predefined symbols which are mapped to device driver properties by symbol name.
An example symbol for the MODEM_QUECTEL_BG9X driver follows:
Using the information declared using predefined mapped device driver properties, the obscure linker errors
and
will be replaced with the following errors which will show up early in the build, before compilation,
and
The solution saves time by catching device driver errors in an early stage of the build, and by returning helpful concrete errors. It also enables generating documentation for all existing device drivers, ensuring that searching for bg95 on the Zephyr documentations page will return links to all supported device driver's documentation pages, which internally link to relevant devicetree bindings, Kconfig symbols and API references.
Detailed RFC
The following steps are required to implement this proposed solution:
depends on DT_HAS_compat_ENABLED
to raise dependency errors, since this is now handled by device driver validation module.All predefined symbols connected with the devicetree with the prefix
DT_
are generated by dts which runs before Kconfig. Generic symbols likeDEVICE_DRIVER
will be declared in zephyr/drivers/Kconfig, Pattern matching symbols, likeDEVICE_DRIVER_API_<api>
may be declared where appropriate. This allows for out-of-tree device driver symbol definition.Proposed change (Detailed)
The proposal updates kconfig.py to additionally export all symbols which select other symbols to a dict named
selects.pickle
. It will convert the following symbolto
selects.pickle
is then used with the output from dtskconfig.dts
, and the predefined symbol naming convention, to identify device drivers, which are then exported as a dict nameddevice_drivers.pickle
.device_drivers.pickle
is then used with the output from dtsedt.pickle
and Kconfig.config
to validate device drivers as described in this proposal.device_drivers.pickle
is also used to generate device driver documentation.From the build systems perspective, it will go from this:
to this:
How to implement change step by step:
selects.pickle
selects.pickle
andkconfig.dts
and generatesdevice_drivers.pickle
device_drivers.pickle
,edt.pickle
and.config.
, and uses them to perform the validation described in this proposal.At this point, validation of device drivers is fully supported. Adding predefined symbols to existing device driver Kconfigs should begin here. A device driver validation error will only occur if a compatible device driver, required by an enabled devicetree node, has been declared using predefined symbols. The more device driver Kconfigs are updated, the better the validation support will be.
device_drivers.pickle
.Dependencies
The proposal requires updating Kconfig files for device drivers to use the new predefined symbols. It also strongly encourages not using
depends on DT_HAS_compat_ENABLED
which will be replaced withselect DT_COMPAT_compat
.Concerns and Unresolved Questions
Overlap between current Kconfig dependency error and device driver validation error
The devicetree currently generates symbols which are used in kconfig to determine if a compatible exists,
DT_HAS_<compat>_ENABLED
, which some Kconfig symbols depend on to determine if they should be selected. This causes a kconfig dependency error if a driver is selected with no supported devices in the devicetree.There is overlap between that feature and the validation offered by this solution. Kconfig runs before this solution, resulting in the error which would have been provided by this solution being silenced, as the build fails when Kconfig raises the dependency error.
Since the error raised by this solution is more descriptive and helpful, using Kconfig to invoke a dependency error in this case should be discouraged.
Alternatives
Using YAML files to declare device driver properties, instead of predefined Kconfig symbols. Pros using YAML files:
Cons