zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Global Namespace Management Issue Scope #46152

Open gregshue opened 2 years ago

gregshue commented 2 years ago

Introduction

Users downstream of Zephyr have faced name conflicts between existing sources that they are not allowed to change and sources from the Zephyr ecosystem that they don't want to modify. This RFC analyzes the scope of the problem and identifies existing dependencies/derivations between different "global" namespaces. Alignment here will help identify the constraints on possible solutions.

Problem description

Users downstream of Zephyr face name conflicts between existing sources that they are not allowed to change and sources from the Zephyr ecosystem that they don't want to modify.

Proposed change

Once the scope of the problem seems understood, changes will be proposed in separate RFCs.

Detailed RFC

Namespace management is a pervasive concern across all branches of all forks of all repositories that may be integrated with or into the Zephyr Project ecosystem. This warrants a comprehensive, thorough capture of:

  1. The scope to be considered by the Zephyr Project
  2. The potential solution space to be addressed by the Zephyr Project
  3. The existing strategies and patterns that exist in the Zephyr Project repositories
  4. The existing range of identifiers in each of the global namespaces

System Definition

The Zephyr Project encompasses two distinct "systems":

  1. The composable, extensible embedded software system (content) provided to downstream users, e.g.,
     a. Zephyr configuration/build/test runner/documentation framework, and specification of the extensibility mechanisms for integration with downstream code.
     b. Requirements, designs, implementations, configuration settings, verification test suites for the functionality in each repository.
  2. The (multi-)project management system containing the infrastructure, processes, and pipelines to support contributions to and periodic/on-demand publication of the embedded software system (above).

This RFC is focused on the composable, extensible embedded software system which must be reusable (same SHA) by the downstream product developers. The project management system is outside the scope of this discussion because reuse of it is not appropriate (as some settings must be changed).

Zephyr System Boundaries

The Zephyr Project provides both the IoT device software and at least one client application that interfaces with it (twister). It also expects other widely available applications to be used for verifications (e.g. mobile phone Bluetooth/BLE discovery service). This means that even from the perspective of the Zephyr Project, the "System" needs to include:

but would not include software provided outside the Zephyr ecosystem (e.g., mobile phone BT device discovery service, NTP servers, CURL, minicom)

NOTE: This definition of "System" also describes that created by many downstream users that locally reuse-and-extend the Zephyr ecosystem to produce their products and the artifacts needed to certify them.

These boundaries will ultimately be described by the System Requirements, fulfilled according to the Architecture Definition, and meet all the Stakeholder Requirements. The boundary descriptions must be independent of design decisions (e.g., specific devices or UX).

Global Namespaces

Within the Zephyr ecosystem there are two distinct types of namespace domains. Each may be considered "global" because symbols will be added from sources outside the control of, and without the awareness of, the Zephyr Project.

Open Domains: those shared with and used by other applications and ecosystems:

Zephyr Project Domains: those used exclusively by the Zephyr ecosystem:

Relevant Use Cases

The following user needs are captured here for consideration during analysis:

Analysis

Most things in the Open Domains are completely beyond the control of the Zephyr Project. The best that can be achieved within Zephyr Project repositories is to minimize the risk of a collision and to simplify the process of perpetually resolving collisions as updates are merged in.

Downstream repositories extended to support the Zephyr RTOS already have a requirement to perpetually avoid global namespace collisions with repositories provided by the Zephyr Project and with existing 3rd party and proprietary repositories that may be pulled in. In this case the best that can be achieved by Zephyr Project repositories is to stay within a predictable portion of each global namespace that will be used by the Zephyr Project, and to provide a process for contributors and downstream users to adjust their code accordingly.

Solving each of these namespace problems introduces a change breaking backwards-compatibility at some level. Minimizing the number of breaking change events requires recognizing couplings that already exist between different namespaces:

1) Module names, CMake variable naming
   a) A ZEPHYR_${MODULE_NAME_UPPER}_MODULE_DIR CMake variable is generated for every module.
   b) A Zephyr module name is controlled by the name: field in the module.yml file, and defaults to the module directory name if not set.
   c) The module directory name is always overridable by the importing west manifest file, so the default is never knowable to other modules.
2) Kconfig settings, CPP symbol names
   a) A CONFIG_* CPP symbol is generated for every Kconfig setting.
3) Header pathnames, CPP symbol names
   a) The Zephyr header file multiple-inclusion guards (usually) encode the repository-relative include path (e.g., #ifndef ZEPHYR_INCLUDE_ARCH_SPARC_THREAD_H_).
4) Ecosystem name, module names, header pathnames
   a) Many Zephyr header files are now located within a zephyr/ directory layer. Unfortunately, it is unspecified whether this path prefix represents the Zephyr Project ecosystem or the module within it (e.g., “zephyr”).
5) Subsystem/driver names, logging identifiers, setting IDs, shell commands(?)
   a) It is common for the names of subsystems and drivers to be used for logging identifiers. The name (or mapped equivalent) of a subsystem or driver is also used in shell command identifiers.
6) Subsystem names, documentation headers/labels/tags
7) Subsystem pathnames, testcase IDs
8) Component names, iterable linker section IDs
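As a rough illustration of couplings 1 and 3, the sketch below derives a CMake variable name from a module name and an include guard from a repository-relative header path. All helper names (`to_symbol`, `module_dir_var`, `header_guard`, `module_vars_collide`) are hypothetical. The sanitization rule (upper-case, every non-alphanumeric character mapped to `_`) is an assumption made for the example, not a statement of what the Zephyr build system actually does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst, upper-casing letters and mapping every
 * non-alphanumeric character to '_'.
 * ASSUMPTION: illustrative sanitization rule only.
 */
static void to_symbol(const char *src, char *dst, size_t dst_len)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dst_len > 0);
    while (*src != '\0' && n + 1 < dst_len) {
        unsigned char c = (unsigned char)*src++;
        dst[n++] = isalnum(c) ? (char)toupper(c) : '_';
    }
    dst[n] = '\0';
}

/* Coupling 1: CMake variable name derived from a module name. */
void module_dir_var(const char *module_name, char *dst, size_t dst_len)
{
    char sym[128];

    to_symbol(module_name, sym, sizeof(sym));
    snprintf(dst, dst_len, "ZEPHYR_%s_MODULE_DIR", sym);
}

/* Coupling 3: include guard derived from a header path. */
void header_guard(const char *rel_path, char *dst, size_t dst_len)
{
    char sym[128];

    to_symbol(rel_path, sym, sizeof(sym));
    snprintf(dst, dst_len, "ZEPHYR_%s_", sym);
}

/* Non-zero when two different module names yield the same variable name. */
int module_vars_collide(const char *module_a, const char *module_b)
{
    char var_a[160];
    char var_b[160];

    module_dir_var(module_a, var_a, sizeof(var_a));
    module_dir_var(module_b, var_b, sizeof(var_b));
    return strcmp(module_a, module_b) != 0 && strcmp(var_a, var_b) == 0;
}
```

Under these assumptions, `header_guard("include/arch/sparc/thread.h", ...)` yields the guard quoted in item 3, and two distinct module names such as `my-module` and `my_module` (made-up names) would map to the same derived variable. That is the kind of cross-namespace effect the couplings can produce.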

NOTE: These couplings mean that resolving the issue in one namespace may affect symbols in another. This possibly could result in "redundant" information due to one set of names being generated from other names.

Separately, the Zephyr Project has an issue of not specifying the maximum number of significant characters the code will use within each of the namespaces (or, conversely, the minimum number of significant characters a toolchain must support in each of the namespaces). By not specifying and exercising a specific limit, the Zephyr Project is implying that code will stay within the minimums required by the standards. (For C99, this limit is 31 significant characters in globally-scoped symbols and 63 significant characters in file-scoped symbols. Though the standard is not freely available, the limits are referenced in the freely available document http://www.open-std.org/jtc1/sc22/wg14/www/docs/C99RationaleV5.10.pdf).

NOTE: There exists code in the zephyr/ v3.0.0 repository that already does not meet this criterion (e.g., in $workspace/zephyr/include/bluetooth/audio/audio.h, the globally scoped functions bt_audio_broadcast_sink_scan_start() and bt_audio_broadcast_sink_scan_stop() are identical until the 32nd character).
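A minimal sketch of how such pairs could be detected, assuming the 31-character C99 minimum for external identifiers as the limit. The function names `first_difference` and `may_alias` are hypothetical and not part of any Zephyr tooling.

```c
#include <assert.h>
#include <stddef.h>

/* C99 minimum number of significant characters in an external identifier. */
#define C99_EXTERNAL_SIGNIFICANT_CHARS 31

/*
 * Return the 1-based position of the first character at which the two
 * names differ, or 0 if the names are identical.
 */
size_t first_difference(const char *a, const char *b)
{
    size_t i = 0;

    assert(a != NULL && b != NULL);
    while (a[i] != '\0' && a[i] == b[i]) {
        i++;
    }
    return (a[i] == b[i]) ? 0 : i + 1;
}

/*
 * Non-zero when two distinct names share their first `limit` characters,
 * i.e. a toolchain honoring only `limit` significant characters could
 * treat them as the same symbol.
 */
int may_alias(const char *a, const char *b, size_t limit)
{
    size_t pos = first_difference(a, b);

    return pos != 0 && pos > limit;
}
```

For the two function names in the NOTE, `first_difference()` returns 32, so `may_alias(..., C99_EXTERNAL_SIGNIFICANT_CHARS)` reports a potential alias. A check like this could be run over a symbol list, for whichever limit the project decides to specify.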

Proposed change (Detailed)

Once the scope of the problem seems understood, changes will be proposed in separate RFCs.

Dependencies

Once the scope of the problem seems understood, the dependencies will be captured within each separate proposal RFC.

Concerns and Unresolved Questions

Does this completely describe the scope of the problem space?

Alternatives

None expected to be identified in this scope analysis.

cfriedt commented 2 years ago

#43987 is a WIP, but relevant.

ceolin commented 1 year ago

https://github.com/zephyrproject-rtos/zephyr/pull/48963 is relevant as well.

I started some cleanup of macros that are being redefined all over the place in Zephyr internal code. The problem is that moving those macros to a common header started to conflict with different code bases (HAL, modules, ...), since these macros are not namespaced and use pretty common names.

Looking into it, I identified very dangerous patterns in different repositories: macros being undefined, or checked for an existing definition before being defined, e.g.:
https://github.com/thesofproject/sof/blob/main/src/include/sof/bit.h#L16
https://github.com/zephyrproject-rtos/hal_gigadevice/blob/main/gd32e50x/cmsis/gd/gd32e50x/include/gd32e50x.h#L473

The problem is that, depending on the inclusion order, you may end up with a different macro implementation. For simple things like BIT() this may not be a problem, but for macros like GET_BITS() it is, because the parameter order changes.
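To make the hazard concrete, here is a small sketch with two made-up "headers" collapsed into one file. Both GET_BITS() signatures are hypothetical and are not copied from the linked repositories; they only differ in parameter order, and each is guarded with #ifndef.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical header A, included first: GET_BITS(value, shift, mask). */
#ifndef GET_BITS
#define GET_BITS(value, shift, mask) (((value) >> (shift)) & (mask))
#endif

/*
 * Hypothetical header B, included second: GET_BITS(mask, shift, value).
 * The guard silently skips this definition because A already defined it.
 */
#ifndef GET_BITS
#define GET_BITS(mask, shift, value) (((value) >> (shift)) & (mask))
#endif

/* Code written against header B's parameter order. */
uint32_t field_via_macro(uint32_t reg)
{
    return GET_BITS(0xFu, 4, reg); /* meant as (reg >> 4) & 0xF */
}

/* What the author of that code expected to get. */
uint32_t field_intended(uint32_t reg)
{
    return (reg >> 4) & 0xFu;
}

/* Shows the mismatch: no warning, no error, just a wrong value. */
void demo_inclusion_order(void)
{
    assert(field_intended(0xABu) == 0xAu);
    assert(field_via_macro(0xABu) == 0u); /* expands as (0xF >> 4) & reg */
}
```

Swapping the two "includes" would make field_via_macro() return the intended value again, which is why this kind of bug can appear or disappear when an unrelated header is added.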