This RFC was closed in favor of a full summary in https://github.com/zephyrproject-rtos/zephyr/issues/76335

Based on discussion in #17162 the following describes a proposed API extension that is enabled by #17155 given a few minor API changes.

Proposed Zephyr Clocks

I would like the end goal to be support for the following clocks:

K_CLOCK_HARDWARE corresponds to the Zephyr hardware clock. It is provided by k_cycle_get(). This is a monotonic non-decreasing non-wrapping 64-bit source with zero at the instant the system hardware clock is initialized. A duration of 1 corresponds to a Zephyr cycle. The rate Z_HZ_cyc is a constant number of nominal cycles per second, but the actual rate may vary, and may not be a compile-time constant.
K_CLOCK_SYSTEM corresponds to the system clock. This is a monotonic non-decreasing non-wrapping 64-bit source with a zero at the instant of the zero of K_CLOCK_HARDWARE. A duration of 1 corresponds to a Zephyr tick. The rate Z_HZ_ticks is a compile-time constant, but the criteria for advancing the clock depend on CONFIG_TICKLESS_KERNEL. For a tickless kernel this source is a linear transformation scaling K_CLOCK_HARDWARE by Z_HZ_ticks / Z_HZ_cyc.
K_CLOCK_MONOTONIC is the documented clock used for the timeout API defined in #17155. It should be an alias for K_CLOCK_SYSTEM.
(TBD) K_CLOCK_UPTIME corresponds to k_uptime_get(). This is a linear transformation using K_CLOCK_MONOTONIC nominal tick rate to convert to a time scale where a duration of 1 corresponds to one millisecond. This probably isn't necessary, though it might be convenient.
(TBD) K_CLOCK_REALTIME would correspond to something that tracks civil time a la NTP. This is very TBD.
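The tickless relationship between K_CLOCK_HARDWARE and K_CLOCK_SYSTEM described above (scaling by Z_HZ_ticks / Z_HZ_cyc) could look roughly like the sketch below. The rates are made-up constants for illustration, and `cyc_to_ticks()` is a hypothetical helper, not an existing Zephyr function.

```c
#include <stdint.h>

/* Hypothetical rates, for illustration only; not taken from any board. */
#define Z_HZ_CYC   32768u  /* nominal hardware cycles per second */
#define Z_HZ_TICKS 100u    /* system ticks per second */

/*
 * Scale a 64-bit cycle count to ticks, rounding down.
 * Splitting into whole seconds and a remainder keeps the intermediate
 * product small, so it does not overflow for large cycle counts.
 */
static inline uint64_t cyc_to_ticks(uint64_t cyc)
{
    uint64_t secs = cyc / Z_HZ_CYC;
    uint64_t rem = cyc % Z_HZ_CYC;

    return secs * Z_HZ_TICKS + (rem * Z_HZ_TICKS) / Z_HZ_CYC;
}
```

Because the whole-second part contributes an exact multiple of Z_HZ_TICKS, the result equals the floor of cyc * Z_HZ_ticks / Z_HZ_cyc over the full 64-bit range.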

Let k_clock_id_t denote the type of the above K_CLOCK_FOO constants. Additional clock sources are possible by a Kconfig option to enable them for specific devices. This would extend the device API with functions that provide frequency, gettime, and alarm capabilities for the provider. For example:

k_clock_id_t hires = counter_get_clock(dev);

would provide a clock id for timeouts that use that counter to control their duration, assuming the counter identified by dev was configured to support use as a clock source. Other non-counter providers may exist as well. Any clock id could be used in the following generic API:

u64_t k_clock_get(k_clock_id_t id);
u32_t k_clock_frequency(k_clock_id_t id);
k_timeout_t k_clock_convert(k_timeout_t t, k_clock_id_t id);

We then have:

k_clock_get(K_CLOCK_HARDWARE) == k_cycle_get()
k_clock_get(K_CLOCK_UPTIME) == k_uptime_get()
k_clock_get(K_CLOCK_SYSTEM) returns the tick counter (curr_tick)
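As a rough illustration of how the generic API could dispatch to per-clock providers, here is a minimal host-side sketch. The enum values, the `clock_provider` struct and the `fake_cycles` stand-in are assumptions made for this example; only the `k_clock_get()` / `k_clock_frequency()` signatures come from the proposal (with standard `uint64_t`/`uint32_t` instead of `u64_t`/`u32_t`).

```c
#include <stdint.h>

/* Hypothetical clock ids; the RFC does not fix the representation. */
typedef enum {
    K_CLOCK_HARDWARE,
    K_CLOCK_SYSTEM,
    K_CLOCK_UPTIME,
    K_CLOCK_COUNT_,
} k_clock_id_t;

/* Per-clock provider: current value and nominal rate. */
struct clock_provider {
    uint64_t (*get)(void);
    uint32_t frequency_hz;
};

/* Stand-ins for k_cycle_get(), the tick counter and k_uptime_get(). */
static uint64_t fake_cycles = 65536; /* two seconds at 32768 Hz */

static uint64_t hw_get(void) { return fake_cycles; }
static uint64_t sys_get(void) { return fake_cycles * 100u / 32768u; }
static uint64_t uptime_get(void) { return fake_cycles * 1000u / 32768u; }

static const struct clock_provider providers[K_CLOCK_COUNT_] = {
    [K_CLOCK_HARDWARE] = { hw_get, 32768u },
    [K_CLOCK_SYSTEM] = { sys_get, 100u },
    [K_CLOCK_UPTIME] = { uptime_get, 1000u },
};

uint64_t k_clock_get(k_clock_id_t id)
{
    return providers[id].get();
}

uint32_t k_clock_frequency(k_clock_id_t id)
{
    return providers[id].frequency_hz;
}
```

A device-provided clock (like the `counter_get_clock(dev)` example) would need a dynamic registry rather than a fixed table; that part is left out here.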

Currently the opaque k_ticks_t value encodes a count in ticks that is either relative or absolute. In the future it should also encode the id of the clock to which it applies, so we should be able to use z_add_timeout and have that record on the queue for the corresponding clock.
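One conceivable way to carry the clock id inside the opaque timeout value is to reserve a few high bits. The layout below (1 flag bit, 7 id bits, 56 tick bits) and all names are purely hypothetical, just to make the idea concrete.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: [63] absolute flag, [62:56] clock id, [55:0] ticks. */
typedef struct {
    uint64_t raw;
} demo_timeout_t;

#define TIMEOUT_TICKS_BITS 56
#define TIMEOUT_TICKS_MASK ((UINT64_C(1) << TIMEOUT_TICKS_BITS) - 1)
#define TIMEOUT_ABS_FLAG   (UINT64_C(1) << 63)

static inline demo_timeout_t timeout_encode(uint8_t clock_id, bool absolute,
                                            uint64_t ticks)
{
    demo_timeout_t t = {
        .raw = (absolute ? TIMEOUT_ABS_FLAG : 0) |
               ((uint64_t)(clock_id & 0x7f) << TIMEOUT_TICKS_BITS) |
               (ticks & TIMEOUT_TICKS_MASK),
    };

    return t;
}

static inline uint8_t timeout_clock_id(demo_timeout_t t)
{
    return (uint8_t)((t.raw >> TIMEOUT_TICKS_BITS) & 0x7f);
}

static inline bool timeout_is_absolute(demo_timeout_t t)
{
    return (t.raw & TIMEOUT_ABS_FLAG) != 0;
}

static inline uint64_t timeout_ticks(demo_timeout_t t)
{
    return t.raw & TIMEOUT_TICKS_MASK;
}
```

With such an encoding, a function like z_add_timeout could read the id from the value and pick the queue for the corresponding clock.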

We will also need API extensions:

Every K_TIMEOUT_suffix(args...) will have a corresponding K_CLOCK_TIMEOUT_suffix(id, args...) which implements it for clocks other than K_CLOCK_MONOTONIC.
All k_?s_to_ticks_suffix(args...) will have a corresponding k_?s_to_clock_ticks_suffix(id, args...) which implements the conversion between time and clock values/durations using the rate for a specific clock.
A new k_delay(k_timeout_t t) that extends the capability of k_sleep() to other clock sources.
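To illustrate the per-clock conversion family, here is a small sketch of a floor and a ceil variant that use the rate of a specific clock. The function names, the integer clock ids and the rate table are assumptions for this example; in the proposal the rate would come from `k_clock_frequency(id)`.

```c
#include <stdint.h>

/* Hypothetical lookup standing in for k_clock_frequency(id). */
static uint32_t clock_hz(unsigned int id)
{
    static const uint32_t hz[] = { 32768u, 100u, 1000000u };

    return hz[id];
}

/* Round down. Note: ms * hz can overflow for very large ms values. */
static inline uint64_t ms_to_clock_ticks_floor(unsigned int id, uint64_t ms)
{
    return ms * clock_hz(id) / 1000u;
}

/* Round up, so that a timeout never expires early. */
static inline uint64_t ms_to_clock_ticks_ceil(unsigned int id, uint64_t ms)
{
    return (ms * clock_hz(id) + 999u) / 1000u;
}
```

For example, 1 ms on the 32768 Hz clock is 32 ticks rounded down and 33 ticks rounded up, while 10 ms on the 1 MHz clock is exactly 10000 ticks in both variants.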

For gPTP (generalized Precision Time Protocol) enabled systems, there is a high accuracy clock source, with nanosecond accuracy, provided by the Ethernet controller. For example, the frdm-k64f (eth_mcux driver) and sam-e70-xplained (eth_sam_gmac driver) boards have support for gPTP. This clock is currently accessed only by gPTP code, which uses the PTP driver for access (API at include/ptp_clock.h). I am not sure if this clock source would need to be accessed by systems other than gPTP, so just fyi atm.

If that clock provides high-accuracy realtime (e.g. UTC or TAI) it could certainly be useful.

Some questions/comments:

Are we going to have alarms set on a given clock source (including K_CLOCK_HARDWARE) and custom clock sources? If so, who will be responsible for dealing with low-level constraints (example: I cannot set an alarm earlier than X clock cycles from now)?
What should we do if a given clock is not available (for example when the system is going into a low power state) but there is an alarm set on that clock?
Are we going to inform the application about "clock lost" / "clock restored" events? Or should the application rather "lock" a given clock source to ensure that it will be ticking all the time?
Have you considered https://github.com/zephyrproject-rtos/zephyr/issues/14306 in the design? Ideally we should be able to 'move' wake-up event from one clock source to another.
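To make the low-level constraint from the first question more tangible, here is a sketch of a check that some layer (driver or clock subsystem) would have to perform before programming an absolute alarm on a wrapping 32-bit counter. The names and the two parameters (`min_lead`, `guard`) are assumptions for this example; it is only loosely inspired by the guard period idea and does not reproduce any existing driver API.

```c
#include <stdint.h>

enum alarm_check {
    ALARM_OK,
    ALARM_TOO_LATE,
};

/*
 * Decide whether an absolute alarm value can still be programmed.
 * Targets less than min_lead counts ahead of now, or up to guard counts
 * in the past, are reported as too late. Anything else is treated as a
 * valid future value (possibly after a counter wrap).
 */
static enum alarm_check alarm_check_target(uint32_t now, uint32_t target,
                                           uint32_t min_lead, uint32_t guard)
{
    uint32_t ahead = target - now;  /* modulo-2^32 distance to target */
    uint32_t behind = now - target; /* modulo-2^32 distance since target */

    if (behind != 0 && behind <= guard) {
        return ALARM_TOO_LATE;
    }
    if (ahead < min_lead) {
        return ALARM_TOO_LATE;
    }
    return ALARM_OK;
}
```

Whether such a check belongs in the driver, the clock layer or the application is exactly the open question; the sketch only shows that the constraint itself is cheap to evaluate.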

I have not read through all of the related issues but one thing appears to be absent from this discussion, which is power consumption. I've been working with the SAMD21 and Zephyr recently and have discovered that the clock domain used for the system timers is the same clock used to clock the CPU. That is a problem for a low power application. To achieve low power on the SAMD21, the SAMD21 RTC needs to be clocked by the 32 kHz lower-power clock source. This implies another clock domain. Has anyone considered the implications of this with respect to this work?

This implies another clock domain. Has anyone considered the implications of this with respect to this work?

Yes. Initial thoughts, at least. It depends on the second TBD in #19282 relating to a domain model for clocks, which considers fixed rates, monotonicity, and the effect of discontinuities when mapping between domains.

@nordic-krch @nashif @cfriedt Guess this should get an RFC tag? @nashif tagging you as you maintain an RFC backlog...

For gPTP (generalized Precision Time Protocol) enabled system, there is a high accuracy clock source [...] not sure if this clock source would need to be accessed by other system than gPTP, so just fyi atm.

I'd vote in favor. This is what Linux does under the notion of "dynamic clocks", see clock_getres(2)

@nordic-krch Some of the ideas in your Enhancement/RFC already seem to be implemented by now, zephyr/lib/posix/clock.c allows for access to (non-settable) CLOCK_MONOTONIC and (settable) CLOCK_REALTIME, currently both based on uptime_ticks(), ie. scaled in a POSIX compliant way similarly to what you propose as K_CLOCK_UPTIME/K_CLOCK_REALTIME. In principle I'd say these clocks could also be adjusted/compensated w/o discontinuities similarly to adjtime(3).

Would it make sense to try and map constants to those defined in POSIX and/or Linux, see clock_getres(2), e.g.

K_CLOCK_UPTIME -> K_CLOCK_MONOTONIC;
K_CLOCK_SYSTEM -> K_CLOCK_MONOTONIC_RAW;
K_CLOCK_HARDWARE -> K_CLOCK_MONOTONIC_COARSE?

(respecting the definitions of those clocks of course which is not 100% what you've defined)

This would simplify adding those to Zephyr's clock_gettime() implementation in a readable and obvious way.

~~@cfriedt Maybe it makes sense to close this one and continue discussion in #40099 (or the other way round). It's hard to discuss HF clock extension w/o also discussing multiple clocks.~~

UPDATE: Rejected for good reasons. :-)

@nordic-krch Some of the ideas in your Enhancement/RFC already seem to be implemented by now,

Ah, I see - this one should probably be assigned to me. Is it ok if I take this one, @nordic-krch ? It was originally created by Peter.

zephyr/lib/posix/clock.c allows for access to (non-settable) CLOCK_MONOTONIC and (settable) CLOCK_REALTIME, currently both based on uptime_ticks(), ie. scaled in a POSIX compliant way similarly to what you propose as K_CLOCK_UPTIME/K_CLOCK_REALTIME. In principle I'd say these clocks could also be adjusted/compensated w/o discontinuities similarly to adjtime(3).

This is very much a WIP area, as I'm literally tying in the pieces for CLOCK_REALTIME as I write this comment.

The monotonic clock should likely not be adjusted, but the realtime clock, certainly. For something like adjtime(3) to work, we would need a system-wide service.

Would it make sense to try and map constants to those defined in POSIX and/or Linux, see clock_getres(2), e.g.

K_CLOCK_UPTIME -> K_CLOCK_MONOTONIC;

K_CLOCK_SYSTEM -> K_CLOCK_MONOTONIC_RAW;

K_CLOCK_HARDWARE -> K_CLOCK_MONOTONIC_COARSE?

For now, I would suggest we stick to monotonic and real-time clocks. Advanced features will be added to the agenda as well.

This would simplify adding those to Zephyr's clock_gettime() implementation in a readable and obvious way.

Sort of - likely what is needed is a separate but parallel Zephyr clock subsystem / library / whatever, and then the posix layer should be a very thin layer that wraps around it. However, there are some potentially non-obvious things, like

when there are multiple clock sources, how do we choose which is monotonic, real, etc
- my take on this is to use DT chosen nodes
what if the system designer does not want to use the default system clock for the monotonic clock?
- my take on this is to use DT chosen nodes
how do we make a queryable interface for clock resolution?
- probably need to specify resolution via DT again

@cfriedt Maybe it makes sense to close this one and continue discussion in #40099 (or the other way round). It's hard to discuss HF clock extension w/o also discussing multiple clocks.

I'd prefer to leave this open as well, but thanks for commenting on this ticket again, because it's directly relevant.

The monotonic clock should likely not be adjusted,

@cfriedt: Then it should probably be K_CLOCK_MONOTONIC_RAW for compatibility with Linux? CLOCK_MONOTONIC is adjustable (but not settable) by default in POSIX if I'm not mistaken. See clock_getres(2): "The CLOCK_MONOTONIC clock is [...] affected by the incremental adjustments performed by adjtime(3) and NTP."

the posix layer should be a very thin layer that wraps around it

Yes exactly. That was what I had in mind.

my take on this is to use DT chosen nodes

Sounds reasonable. Would be great then to base this around some existing API - either counter.h or rtc.h (if tuned to support sub-second, tickless and low-latency) or both, so that existing drivers can be re-used?

I'd prefer to leave this open as well

Sure!

The monotonic clock should likely not be adjusted,

@cfriedt: Then it should probably be K_CLOCK_MONOTONIC_RAW for compatibility with Linux? CLOCK_MONOTONIC is adjustable (but not settable) by default in POSIX if I'm not mistaken. See clock_getres(2): "The CLOCK_MONOTONIC clock is [...] affected by the incremental adjustments performed by adjtime(3) and NTP."

My guess is that you're looking at Linux man pages.

Would be better to stick with POSIX via

https://pubs.opengroup.org/onlinepubs/9699919799/

Not 100% sure ATM if adjusting monotonic is part of POSIX. We could make it a kconfig option for a deviation though, and support it anyway.

If we were to support adjustments on monotonic on the Zephyr side, it would very likely need to be done outside of the existing k_uptime_get() machinery, but at least there is no spec to restrict that.

No longer maintained, see https://github.com/zephyrproject-rtos/zephyr/issues/76335 instead

My guess is that you're looking at Linux man pages

@cfriedt You're right. :-)

If we were to support adjustments on monotonic on the Zephyr side, it would very likely need to be done outside of the existing k_uptime_get() machinery

Exactly. Any type of drift compensation, calendar/leap second logic, algorithms to adjust clocks without discontinuity, etc. can be kept in the abstract timescale/clock layers (layers 3/4 below) as syntonization data + epoch offset in the same way that z_impl_clock_gettime() already does for the realtime offset.
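A minimal sketch of what "syntonization data + epoch offset" on top of an unadjusted uptime could look like: the timescale value is uptime plus an offset, and corrections are slewed into the offset at a bounded rate so that the result stays continuous. The struct, the field names and the ppm limit are assumptions for this example; it is similar in spirit to adjtime(3) but not a description of any existing Zephyr code.

```c
#include <stdint.h>

struct timescale {
    int64_t epoch_offset_ns; /* timescale value at uptime zero */
    int64_t pending_adj_ns;  /* correction not yet applied */
    int64_t last_uptime_ns;  /* uptime at the previous read */
    int64_t max_slew_ppm;    /* maximum correction rate, must be < 1000000 */
};

/*
 * Read the timescale for a given (monotonic) uptime. At most
 * max_slew_ppm of the elapsed uptime is applied as correction per call,
 * so the returned value never jumps and never runs backwards.
 */
static int64_t timescale_now(struct timescale *ts, int64_t uptime_ns)
{
    int64_t elapsed = uptime_ns - ts->last_uptime_ns;
    int64_t budget = elapsed * ts->max_slew_ppm / 1000000;
    int64_t step = ts->pending_adj_ns;

    if (step > budget) {
        step = budget;
    } else if (step < -budget) {
        step = -budget;
    }

    ts->epoch_offset_ns += step;
    ts->pending_adj_ns -= step;
    ts->last_uptime_ns = uptime_ns;

    return uptime_ns + ts->epoch_offset_ns;
}
```

The uptime input itself is never modified, which matches the point that adjustments would live outside the existing k_uptime_get() machinery.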

Proposed Abstract Counter Syntonization, Timescale and Clocks Architecture

[Diagram: zephyr-abstract-clock-subsys]

Note: The model is a conceptual overview and wip. Attribute-level API details, references to existing SoCs, peripherals, driver APIs and protocols are preliminary. Black color marks existing APIs, red color proposed new APIs.

All references to ktime_t correspond to the current concept of net_time_t which has been introduced into the net subsystem as a precursor of a generic nanosecond time representation throughout Zephyr.

Some comments on the proposed layering:

1. uptime counter layer: overflow protected uptime counters, optionally hybrid low-power implementation based on separate wake and sleep counter peripherals as thin wrappers around e.g.:
- sysclock: timer.c/timeout.c - default OS uptime ticks or cycles with additional overflow protection in case of 32-bit implementations (as sleep/awake uptime counter source, optionally as low-level timer/alarm source)
- any counter.h-compatible driver with additional overflow protection (as arbitrary precision sleep or awake uptime counter source independent from the kernel system clock, optionally as low-level alarm source)
- any rtc.h-compatible driver, usually tick of 1sec (as sleep uptime counter source)
- legacy RTC drivers or arbitrary hardware counter peripherals directly (see drivers/timer/nrf_rtc_timer.h, drivers/rtc/*, as sleep uptime counter source), access to CPU instruction counters
- not to be confused with Linux clock sources which are neither overflow protected nor hybrid counters optimized for low-power applications
- similar to the combination of a suspend and wake time clock source in Linux clocksource.c or the combination of the TSC/ART timers in Intel architectures. We do NOT bind to a single system clock, though, nor do we bind to a specific vendor's hardware.
2. syntonization layer: monotonic and continuous syntonized uptime references:
- This layer consists of two separate components:
  - hardware-assisted timestamping services that assign (sub-)ns-precision timestamps to specific ticks of a (class of) specific layer-1 uptime counters based on timing event sources (e.g. ETH or transceiver timestamps, PPS, PTP, NTP, RTC, GPS, ...) in regular intervals, triggered by inbound timing events (pulses, net packets) or on demand.
  - syntonization algorithms that discipline (drift compensate) local uptime counters and interpolate between subsequent timestamped counter ticks to maintain monotonicity and continuity of uptime references.
- Remarks:
  - One (class of) uptime counters can be timestamped by any number of compatible timing event services. As timestamping services require direct access to counter peripherals for precision timestamping, a single timing event service cannot be combined with several (classes of) uptime counters, though (i.e. this is a one-to-many relationship).
  - The monotonicity and/or continuity requirements can be relieved for certain application areas to simplify implementation of uptime references. Sometimes it may be enough to just reset the reference on every timing event when continuity is not required.
  - some counter peripherals implement syntonization features in hardware (e.g. PTP-aware ethernet peripherals). In these cases, the uptime reference might be strongly coupled to a specific uptime counter.
  - separate system and network uptime references may be defined and chosen (via DT and/or Kconfig)
  - an optional low-latency timer multiplexing framework based on syntonized uptime (timer multiplexing similar to hrtimer) may be derived as a basis for high-level POSIX timers (see layers 3 and 4)
  - The system timeout.c/timer.c infrastructure could be made instantiable to derive multiplexed ns-precision uptime reference timers from uptime counters, see one attempt in #60400
- Examples:
  - The "syntonization part" of phc2sys from PTP
  - the Time Utilities API gives an example of a generic hardware-agnostic syntonization algorithm
  - Syntonization algorithms can be arbitrarily complex and must be pluggable as there is no "one best" syntonization algorithm, see D. Mills' "Kernel Model for Precision Timekeeping" (RFC 1589) as implemented in several *nix systems or a recent gPTP PR trying to introduce a PI synchronization controller
  - the Timing API could benefit from syntonized clocks for precise (e.g. PTP based) timings.
  - possible reference clocks in Zephyr are network reference clocks (gPTP, drift-compensated peripheral BLE ticker, TSCH TDMA, CSL) - similar to dynamic clock sources in Linux - or local PPS-type peripherals (rtc.h or a future pps.h for GPS, serial lines or even miniature local atomic clock peripherals)
  - synchronization of several counters will be required in the (g)PTP subsystem, when multiple PTP ports (different media, different peripherals, ...) of the same PTP instance need to share a common local clock with high precision, e.g. to implement PTP Relay Instances (IEEE 802.1AS-2020) or Boundary Clocks/Transparent Clocks (IEEE 1588).
3. abstract timescale/clock layer: timescale wrappers for the layer 2 system uptime reference source implementing a common timescale API:
- A timescale (or clock in the sense of this RFC) is a syntonized uptime reference additionally offset by a well-defined epoch, thereby defining a common "zero point" for time.
- Offsetting and adjusting the chosen layer 2 syntonized system uptime reference to appropriate timescales (MONOTONIC, REAL/UTC, TAI, ...) may be implemented as a collection of stateless and stateful utility algorithms. This is called "synchronization". While syntonization only guarantees that clocks tick at the same frequency, synchronization ensures that the common epoch (time offset) is kept the same across different representations of the same clock, i.e. they "show the same date and time".
- Offsetting is usually not persistent across power cycles.
- Power-on offsets may come from an external network source (PTP, NTP, ...) or from a local battery driven clock.
- adjustment and leap second "smearing" algorithms may be arbitrarily complex, examples: NTP, the "initial offsetting" part of phc2sys from PTP, clock_adjtime(2)
4. additional POSIX / clib / calendar / timezone clients that use timescaled clocks exposed by the clock subsystem to provide standard APIs for POSIX/libc clock access or implement high-level calendar and timezone support. The clock subsystem must provide a sufficiently capable default system uptime reference and a minimal choice of default timescales based on the kernel system clock that is guaranteed to be present on all Zephyr systems.
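As an example of the "additional overflow protection" mentioned for the uptime counter layer, a wrapping 32-bit counter can be extended to a non-wrapping 64-bit uptime by accumulating modulo-2^32 deltas. The struct and function names are hypothetical; the sketch assumes the update function is called at least once per wrap period of the raw counter (e.g. from an overflow interrupt or a periodic read).

```c
#include <stdint.h>

struct uptime_counter {
    uint32_t last_raw; /* raw counter value seen at the previous update */
    uint64_t total;    /* accumulated, non-wrapping count */
};

/*
 * Fold a new raw reading into the 64-bit total.
 * The unsigned subtraction yields the correct delta across a single
 * wrap of the 32-bit counter.
 */
static uint64_t uptime_counter_update(struct uptime_counter *uc, uint32_t raw)
{
    uc->total += (uint32_t)(raw - uc->last_raw);
    uc->last_raw = raw;

    return uc->total;
}
```

The same pattern works for narrower counters (e.g. 24-bit RTCs) if the delta is masked to the counter width.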

These architectural layers can be implemented one by one from low-level to high-level. Each layer will immediately provide value for specific applications without the higher levels being present. It is sufficient to provide basic algorithms for each layer initially as long as we ensure that the architecture is extensible enough to cater for more complex algorithms if needed later on.
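To give an idea of what a "basic algorithm" for the syntonization layer might be, the sketch below derives a nanosecond uptime reference from the two most recent timestamped counter ticks by linear interpolation/extrapolation. The `sync_point` struct and `syntonized_ns()` are hypothetical names; real algorithms (PI controllers, filters) would be considerably more elaborate.

```c
#include <stdint.h>

/* A counter tick together with the timestamp assigned to it. */
struct sync_point {
    uint64_t tick;
    int64_t ns;
};

/*
 * Map a counter tick to reference time, using the rate observed between
 * two timing events a and b (b.tick must be greater than a.tick).
 * Note: the intermediate product can overflow for very large distances
 * from b; a production implementation would need to guard against that.
 */
static int64_t syntonized_ns(struct sync_point a, struct sync_point b,
                             uint64_t tick)
{
    int64_t dticks = (int64_t)(b.tick - a.tick);
    int64_t dns = b.ns - a.ns;
    int64_t off = (int64_t)(tick - b.tick);

    return b.ns + off * dns / dticks;
}
```

With a nominal 32768 Hz counter that was observed to take 1.00003 s for 32768 ticks, a tick half a second (nominally) after the last sync point maps to 1.500045 s rather than 1.5 s, i.e. the drift is compensated.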

Specifics of an "embedded" clock subsystem for Zephyr

Clock Diversity and Decentralization

Linux typically centers its notion of time around a single system clock (with only one or two underlying clock sources). Other notions of time are mostly derivatives that inherit basic properties of the system clock source (resolution, precision, energy consumption, continuity during low-power modes, etc). While independent ("dynamic") clocks and alternative synchronization approaches (PPS, PTP) may exist, they are rather hard to combine and synchronize unless referred to the common system clock.

Such a "centralized" approach does not seem right for an embedded real time OS where several clock stacks with diverse properties and distinct trade offs typically need to co-exist on an equal basis:

resolution/precision vs. low-power,
"hard" realtime vs. "soft" scheduling
different levels of clock interrupt priority
clock peripherals distributed over distinct power domains or across network interfaces
arbitrary collections of on-soc, on-board and remote clock peripherals need to be kept in sync or isolated
precise distributed clocks independent from the OS clock are at the core of embedded time sensitive / low-power networking and control applications

We propose an architecture where any number of independent clock stacks can be assembled from basic building blocks and co-exist on an equal basis. Any of these can be chosen to provide features of a traditional OS system clock. But the clock behind POSIX/libc APIs should no longer necessarily be the same as the clock behind kernel scheduling. Both should be configurable independently of each other at build time.

All configured clock stacks remain independently accessible, configurable and "synchronizable". There can be any number of concurrent clocks for subsystems like slotted automation control networks, BLE, 5G, PTP and TSCH. Whether and which of these clocks are synchronized and which should remain isolated should be constrained by configuration, not by the architecture.

Optimized for precision, constrained power, computation and memory resources

On an embedded system, offloading of timing and scheduling to dedicated peripherals is important:

Traditional OS clocks require CPU intervention on timing critical paths. Nanosecond precision clock syntonization and scheduling can only be achieved through specialized hardware, drivers and fixed-latency ISRs.
CPU and OS involvement must be minimized to save power, memory/stack and computing resources
Easy and frequent switching between different power modes must be supported
Combining configurability with modularity allows us to create highly diverse and specialized minimal firmware runtime bundles. Offloading of configuration and compute tasks to the build process keeps the firmware small and fast.

Modularization into basic building blocks is key

The basic idea of the architecture is to specify re-usable building blocks that can be combined among each other to form independent clock stacks. "Pluggability" exists at all levels: uptime counters, timing event services, syntonization strategies, derived clocks and user or library clients:

partial clock stacks can be configured and used if the application does not require a full clock stack (applications may require only low-power, overflow protected uptime counters, others need syntonization but no synchronization, many will not need POSIX/libc)
the same clock source can be re-used in several uptime counters (e.g. a single always-on clock can be used to maintain sleeptime continuity of any number of independent high resolution network interface timers)
re-usable uptime counters can be based on any number and combination of pre-existing drivers (rtc, counter, system clock, etc.) or directly on hardware peripherals
low-power continuity strategies (based on low-level peripherals or pps-style drivers) can be re-used across uptime counters
a single physical uptime counter can be logically syntonized/synchronized to different timing event sources
syntonization algorithms can be re-used across counters
any syntonized clock can be used as a basis for any number of higher-level timescales
the same timescale can be computed on distinct syntonized clocks
Zephyr-specific configuration, power management and debugging approaches are natively supported by all components.
We prune from the build what is not required by the application. No common infrastructure is needed.
Several variants of the same application can be built with different clock stacks and timing/hardware profiles based on build-time configuration (*.prj, devicetree, Kconfig, run-time assembled stacks).

Need for direct access to counter values underlying clock sources

For precise timing when dealing with low-energy peripherals, synchronized wireless protocols or synchronized real-time actors in distributed systems, it is often not acceptable to let the system CPU interfere with alarms or do the scheduling. That excludes ISR-based alarm/timer callbacks even if they are constant-latency. Pre-programming triggers with specific counter values in advance is usually required.

For the same reason it is usually not acceptable to work with approximate time values for hard realtime requirements. Time representations from all layers must be deterministically convertible to low-level counter values. All conversions must be implemented inside the clock subsystem to protect applications from conversion error or unintended abuse of "pseudo-precise" nanosecond timestamps.

The following use cases must be supported in the proposed architecture:

Applications requiring hard realtime or very high resolution timing must be able to deterministically pre-calculate precise (opaque) low-level counter values based on the well-defined nanosecond representation of time in syntonized or timescaled clocks and inject them into hardware in a driver-specific way for scheduled RX/TX. The radio timer (RAT) for example can then be programmed w/o IEEE 802.15.4/BLE L2 having to care about the details of conversion between external time sources, local low-energy counter and fast radio counter.
A cross-counter nano- or microsecond precision and overflow protected uptime abstraction above counter peripherals and convertible w/o loss to/from timescaled representations is required to convert between counters or relate low-level counter values to the reference clock w/o loss of precision (i.e. syntonization error). Nanosecond values encoded as int64_t (aka net_time_t) on level 2 (scalar syntonized uptime) or 64-bit struct timespec on level 3 (timescaled values) are adequate for this purpose within the clock subsystem to avoid dependency on higher level concepts (e.g. POSIX timeval).
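The requirement of deterministic, lossless conversion can be illustrated with a pair of conversion helpers whose rounding directions are chosen so that a counter value survives the round trip through nanoseconds. This is a sketch under the assumption that the counter frequency does not exceed 1 GHz; the function names are made up for this example.

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* ns -> counter ticks, rounding down. */
static uint64_t ns_to_counter_ticks(uint64_t ns, uint32_t hz)
{
    uint64_t secs = ns / NSEC_PER_SEC;
    uint64_t rem = ns % NSEC_PER_SEC;

    return secs * hz + rem * hz / NSEC_PER_SEC;
}

/*
 * Counter ticks -> ns, rounding up. Together with the round-down
 * conversion above, ticks -> ns -> ticks returns the original tick
 * as long as one tick lasts at least one nanosecond.
 */
static uint64_t counter_ticks_to_ns(uint64_t ticks, uint32_t hz)
{
    uint64_t secs = ticks / hz;
    uint64_t rem = ticks % hz;

    return secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC + hz - 1) / hz;
}
```

For a 32768 Hz counter, tick 1 maps to 30518 ns and back to tick 1, while 30517 ns still belongs to tick 0. Keeping such rounding rules inside the clock subsystem is what protects callers from off-by-one-tick errors.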

Examples of where such hardware support is required are timed RX/TX as in CSL (as used in the Thread protocol) or #50336 (see IEEE 802.15.4-2020, sections 6.12.2 and 6.5.4), plus other IEEE 802.15.4 features like synchronized PANs, RIT, and so on.

Case Study: TSCH TDMA protocol operation

TSCH is a TDMA protocol that defines cyclic timeslots at a fixed frequency (e.g. 10ms / 100Hz). Inside a timeslot high resolution timing is required to schedule TX packets and reception windows including ACKs at precise moments in time.
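As a small worked example of the timing involved, the sketch below computes the start of a timeslot and a TX instant inside it from a slot number. The 10 ms slot length is the example value from the text; the intra-slot TX offset and all function names are illustrative assumptions, and `epoch_ns` stands for the (syntonized) time at which slot 0 started.

```c
#include <stdint.h>

#define TIMESLOT_NS  INT64_C(10000000) /* 10 ms timeslot, as in the text */
#define TX_OFFSET_NS INT64_C(2120000)  /* illustrative intra-slot TX offset */

/* Start of timeslot number asn, given the start time of slot 0. */
static int64_t timeslot_start_ns(uint64_t asn, int64_t epoch_ns)
{
    return epoch_ns + (int64_t)asn * TIMESLOT_NS;
}

/* Instant at which a TX in timeslot asn should begin. */
static int64_t timeslot_tx_ns(uint64_t asn, int64_t epoch_ns)
{
    return timeslot_start_ns(asn, epoch_ns) + TX_OFFSET_NS;
}

/* Number of the timeslot that contains now_ns (now_ns >= epoch_ns). */
static uint64_t timeslot_of(int64_t now_ns, int64_t epoch_ns)
{
    return (uint64_t)((now_ns - epoch_ns) / TIMESLOT_NS);
}
```

The slot grid only needs a low-resolution clock, whereas the TX instant inside the slot needs the high-resolution one; that split is what motivates the two counters described next.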

TSCH uses aspects of the slotted and time based architectures mentioned above.

The requirements are more specifically:

timeslot synchronization uptime counter:
- local clock syntonized to the global, distributed TSCH clock via one or more remote time synchronization neighbors
- (optionally) distinct from the system uptime clock to allow for concurrent NTP/PTP syntonization
- low frequency/low resolution
- always on
- low power,
- must not wrap
- requires a guard period to protect against late alarms similar to counter.h
high-resolution intra-timeslot radio counter:
- PPS-style syntonization with timeslot uptime at the beginning of each timeslot
- autonomous/non-adjusted local high-frequency oscillator (typically >= 1MHz)
- high frequency/high resolution
- may be stopped (sleep) and started (awake)
- may wrap, therefore requires active overflow protection,
- requires a guard period to protect against late alarms similar to counter.h
- provides pre-calculated counter values for hardware assisted scheduling of timed RX/TX
hybrid network subsystem uptime counter: The fast radio timer can be switched off to save power, so it must be resynchronized to the slower low-power clock when restarted. This requires hardware support: The timestamp of the fast radio timer must be captured at well defined edges of the low-power clock relating the two clocks with high precision and very low, deterministic jitter (see Linux PPS or BeagleBone GPIO PPS generator for comparison). The PPS pattern hides hardware-specific implementation details of synchronization, is vendor-agnostic and can therefore be re-used directly from L2 across L1 radio driver implementations. To target low-power devices, periodic PPS ticks are inadequate. A tickless (fetch-only) PPS is proposed for low-power systems. The two clocks together with access to a common PPS driver are the building blocks from which a high resolution, high precision, overflow protected hybrid radio uptime counter can be constructed.
syntonized TSCH uptime reference as network subsystem clock: The TSCH time synchronization protocol reports phase deviations of the hybrid radio counter from timesource neighbours. Precise hardware assisted timestamping of incoming and outgoing radio packets is required for syntonization. As these timestamps will be captured relative to the radio uptime counter, they may be used to discipline (syntonize) the radio uptime reference based on the radio uptime counter. Syntonization is based on algorithms that calculate some kind of statistical approximation based on each incoming/outgoing packet or keepalive message.
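The hybrid radio uptime counter described above can be sketched as follows: at a well-defined edge of the low-power clock, the value of the fast radio timer is captured, and from then on uptime is the slow counter value at that edge plus the fast timer delta since the capture. All names and the example rates are assumptions; the capture itself would be done by hardware (the PPS-style mechanism) and is just represented by two stored values here.

```c
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

struct hybrid_counter {
    uint32_t slow_hz;      /* always-on low-power counter rate */
    uint32_t fast_hz;      /* radio timer rate */
    uint64_t slow_at_edge; /* overflow-protected slow count at the edge */
    uint32_t fast_at_edge; /* fast timer value captured at that edge */
};

/*
 * Combine both counters into one nanosecond uptime. The unsigned
 * subtraction makes the fast timer delta correct across one wrap.
 */
static uint64_t hybrid_uptime_ns(const struct hybrid_counter *hc,
                                 uint32_t fast_now)
{
    uint64_t secs = hc->slow_at_edge / hc->slow_hz;
    uint64_t rem = hc->slow_at_edge % hc->slow_hz;
    uint64_t base_ns = secs * NSEC_PER_SEC + rem * NSEC_PER_SEC / hc->slow_hz;
    uint32_t fast_delta = fast_now - hc->fast_at_edge;

    return base_ns + (uint64_t)fast_delta * NSEC_PER_SEC / hc->fast_hz;
}
```

After each sleep period a new edge is captured, so the fast timer never has to run (or be trusted) across the sleep; only the slow counter must stay on.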

This same pattern of syntonized hybrid uptime counter and reference can be re-used with any timed token/cycled/slotted system like Profibus/Profinet or Sercos. These are often used in industrial real time environments (e.g. automation, robotics/motion control, etc.).

Potential for a system-wide "wall clock" in the presence of TDMA (slotted, cycled) protocols

Based on the proposed architecture the TSCH uptime reference (or Bluetooth uptime reference or that of any other TDMA/slotted protocol) could be made available as a "dynamic" distributed real-time reference w/o the need for higher level protocols (IP, NTP, PTP) or additional hardware (GPS modules or special ethernet cards).

The TDMA uptime reference can then be configured as the system-wide uptime reference with well defined precision and accuracy. To provide a distributed timescale an epoch plus other timescale parameters need to be agreed via an additional out-of-band channel (e.g. by exchanging a few proprietary network messages) between time neighbors.

The error (offset/jitter) of a TDMA uptime reference should be comparable to that of an NTP clock, AFAICS, certainly worse than PTP/GPS, but still good enough for many use cases. Some devices (especially those with ToF ranging capability) might be able to syntonize with much higher precision, maybe even in the range of (g)PTP which would enable PTP-style reference clock propagation across such wireless networks. In the TSCH case the clock syntonization hierarchy is similar to that of NTP strata. The closer a node is to the PAN coordinator the "better" its accuracy (lower stratum).

Comparison with BLE air interface timing requirements

The concept of BLE's active clock (Vol 6, part B, section 4.2.1) is very similar to the requirements of intra-timeslot timing for TSCH.

The same similarity exists between BLE's sleep clock (ibid, section 4.2.2) and TSCH's timeslot synchronization clock.

Initial synchronization to a TSCH PAN is very similar to BLE's synchronization state and procedures (ibid, section 4.4.5).

The infrastructure to be developed SHOULD be fully compatible with Zephyr's existing BLE split controller's counter HAL and ticker.c including timer multiplexing and slot reservation, so that IEEE 802.15.4, gPTP and BLE subsystems can hopefully re-use the same basic precision timing framework for their respective scheduling purposes.

If a more detailed comparison is required then someone from Zephyr's BLE team would have to chime into this discussion.

@cvinayak any thoughts on the above comment from @fgrandel (https://github.com/zephyrproject-rtos/zephyr/issues/19030#issuecomment-1597226731)?

@cfriedt @bjarki-trackunit I quickly discussed the following questions with Bjarki over Discord but I think it makes sense to discuss them together here:

What was the rationale behind choosing an alarm (rather than timer) nomenclature and API that differs structurally from that of the system clock API? The problem I'm facing is that I'd love to use rtc.h and the emulated RTC as its default implementation in a much broader context (see my proposal above), but for that to work I have to make it tickless and sub-second.

Updates/Alarms could have been implemented using the system timer's model of relative and absolute times plus periods (just replacing ticks by structured timestamps):

Would reduce the learning curve as everyone could refer to their existing knowledge about the kernel system clock based timers. Choosing "alarm" (low level) rather than "timer" (based on alarms internally) is a bit confusing if comparing that to the kernel timer nomenclature.
Would make the zephyr,rtc-emul implementation a trivial and simpler wrapper around the system clock API which it uses internally anyway. Would also get rid of the unnecessary indirection via work queue (consuming less resources).
Would reduce the complexity of the API as the distinction between alarm and update (including different callbacks) would disappear naturally.
Would allow for sub-second and multi-second intervals while fully supporting the original "external one-tick-per-second" RTC use case.
Would allow for tickless timer callbacks in all cases. Waking the CPU up every second makes the current "update" interface rather problematic on battery powered devices - it defeats the purpose of a "low energy/always on RTC" in the first place.
Timers could be allocated in caller space which would be more flexible, easier to configure and we'd get rid of the pre-allocation via devicetree altogether.
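
To make the list above more concrete, here is a minimal sketch of what such a timer model could look like. It mirrors the shape of k_timer_start(timer, duration, period) but takes structured timestamps instead of ticks. All names (struct rtc_ts, struct rtc_timer, rtc_timer_start(), rtc_timer_process()) are hypothetical and not part of rtc.h or any existing Zephyr API; the clock is simulated by passing "now" in explicitly so that the snippet builds as plain C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000

/* Hypothetical structured timestamp that replaces kernel ticks. */
struct rtc_ts {
	int64_t sec;
	int32_t nsec; /* 0 .. 999999999 */
};

struct rtc_timer;
typedef void (*rtc_timer_cb_t)(struct rtc_timer *timer);

/* Hypothetical timer object, allocated by the caller (no devicetree slots). */
struct rtc_timer {
	struct rtc_ts expiry; /* absolute time of the next expiry */
	struct rtc_ts period; /* all-zero means one-shot */
	rtc_timer_cb_t cb;    /* may be NULL */
	int active;
	unsigned int fired;   /* number of expiries so far */
};

static struct rtc_ts ts_add(struct rtc_ts a, struct rtc_ts b)
{
	struct rtc_ts r = { a.sec + b.sec, a.nsec + b.nsec };

	assert(a.nsec >= 0 && a.nsec < SKETCH_NSEC_PER_SEC);
	assert(b.nsec >= 0 && b.nsec < SKETCH_NSEC_PER_SEC);
	if (r.nsec >= SKETCH_NSEC_PER_SEC) {
		r.sec += 1;
		r.nsec -= SKETCH_NSEC_PER_SEC;
	}
	return r;
}

static int ts_le(struct rtc_ts a, struct rtc_ts b)
{
	return a.sec < b.sec || (a.sec == b.sec && a.nsec <= b.nsec);
}

static int ts_is_zero(struct rtc_ts a)
{
	return a.sec == 0 && a.nsec == 0;
}

/* Relative initial delay plus optional period, like the kernel timer model. */
static void rtc_timer_start(struct rtc_timer *timer, struct rtc_ts now,
			    struct rtc_ts delay, struct rtc_ts period,
			    rtc_timer_cb_t cb)
{
	timer->expiry = ts_add(now, delay);
	timer->period = period;
	timer->cb = cb;
	timer->active = 1;
	timer->fired = 0;
}

/* Would run from the alarm interrupt: handles every expiry up to "now". */
static void rtc_timer_process(struct rtc_timer *timer, struct rtc_ts now)
{
	while (timer->active && ts_le(timer->expiry, now)) {
		timer->fired++;
		if (timer->cb != NULL) {
			timer->cb(timer);
		}
		if (ts_is_zero(timer->period)) {
			timer->active = 0; /* one-shot: done */
		} else {
			timer->expiry = ts_add(timer->expiry, timer->period);
		}
	}
}
```

With a start time of 100 s, a delay of 0.5 s and a period of 0.25 s, processing at 101 s handles three expiries (100.5, 100.75 and 101.0) and leaves the next expiry at 101.25 s. Nothing in the sketch needs a wake-up per second; the only time the CPU has to run is at timer->expiry, which is the point of the tickless argument above.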

It is a bit of a pity that the API was not introduced with the experimental tag first. Is there any way we can add this tag after the fact until we're sure the API is really mature?

I can provide a PR with an evolved API for discussion if you think that I'm not totally off.

@fgrandel

It is important to note that the RTC device driver API is a low level interface for the capabilities of the hardware. Real-time clocks (RTCs) have alarms, not timers or counters, which is why the RTC API doesn't implement a timer or counter API. The reason the RTC API, which matches the capabilities of an RTC quite well, does not fit your use case is that an RTC fundamentally does not fit your use case.

The RTC API is not meant to abstract access to system clock sources. The functionality you are proposing here is at least one layer above device driver APIs like the RTC and Counter APIs.

Although I remain skeptical of the merit of your concerns, you are welcome to create a PR proposing the changes you would like to implement, the RTC API is experimental after all (see API overview).

The RTC API is inspired by the Linux RTC API, which is quite mature, and is quite different from the Linux clock_gettime, which more closely represents what you are proposing here IMO.

@fgrandel - This API is mainly for hardware real-time clocks, and less for e.g. clock_gettime(CLOCK_REALTIME).

.. ah, I see Bjarki got to it already.

It is important to note that the RTC device driver API is a low level interface for the capabilities of the hardware.

@bjarki-trackunit @cfriedt Oh, I see your point and my fundamental misunderstanding. Thanks for clearing that up.

TL;DR: rtc.h was moved to the category of "low level counter and alarm APIs" in the layered concept. rtc.h contains a number of concepts from higher abstraction layers which should be extracted in a layered architecture. rtc_emul is a fake/mock driver for testing only and should therefore be moved to the test tree. The resource impact of rtc_emul in the CI/CD pipeline can be improved a bit.

Proposed action items to fix the above points:

~~update layered concept with the additional information provided by @bjarki-trackunit (@fgrandel, DONE)~~
~~mark CONFIG_RTC with EXPERIMENTAL (@fgrandel, DONE)~~
~~improve rtc-emul with spinlocks, make it tickless, only update rtc_time when requested/required (@fgrandel, DONE - except for making it tickless which is not worth the effort while it's only a mock)~~
discuss the architecture of the upper layer timer, clock and timescale abstractions (@cfriedt @fgrandel, WIP - see above)
once we agree upon an architecture, focus rtc.h on its role as an RTC HW driver and move the rest to higher-layer or hardware-agnostic abstractions (@fgrandel, TODO)
if rtc_emul turns out to only be useful as a testing tool, then move it as a fake driver to the test tree, since in-tree it provides no value over a generic full-featured clock subsystem instance based directly on kernel primitives (@fgrandel, OBSOLETE)

Issues:

#59900
#59901

PR:

#59902

Supporting arguments follow:

The RTC API is inspired by the Linux RTC API, which is quite mature

What is meant by "focusing rtc.h" on its role as HW driver:

rtc(4) is a POSIX ioctl()/poll() userland API, not a HW driver API.
Linux provides a simple HW driver API, a reduced version of which we could use almost unchanged as rtc.h without loss of functionality, except for what I believe does not belong in such a HW driver (see below). This would be easier to implement and more familiar to people acquainted with Linux.
What is now called calibration might need to support more elaborate PLL tuning features in the future?
On top of a simplified driver API we may then provide the original POSIX API iff POSIX is enabled.

In a layered architecture this reduces the long-term effort for RTC driver development and maintenance by removing higher-level and unrelated concepts from rtc.h:

Subsecond support, except for intervals, is the domain of high-resolution timers, which require a different API for more precision. For the same reason rtc(4) does not have it either; it uses hrtimer internally for subsecond periodic interrupts, as an optional per-instance feature iff supported by hardware.
On Linux, rtc(4) does not support multiple alarms - probably for historical reasons - but multiple alarms might contribute value iff HW supports them natively.

Otherwise there is probably more value in a higher-level, re-usable, real-time (optionally high-precision) multiplexed timer API as proposed above, with several HW driver APIs (clock sources) as backends, of which rtc.h is only one. Higher-level timers with an RTC hardware driver backend can make use of HW-supported alarms iff available and fall back to a software implementation otherwise. They can also build upon higher-level concepts like compensated clocks and timescales, which rtc.h hardware knows nothing about. A full set of clock/timescale/timer features will then be available in userland with less effort for driver maintainers, even if not natively supported by RTC hardware or with no RTC hardware present at all.

Also related to multiplexed timers: If alarms are only available as supported in hardware then alarm resources can be allocated from the stack and tracked in a list of timers inside the timer abstraction with timer resources statically allocated in caller space to reduce complexity and resource usage.
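
A rough sketch of what I mean by multiplexing, assuming a single alarm slot per clock source: timers live in caller space, are tracked in a list sorted by deadline, and only the earliest deadline is programmed into the backend. If the backend has no alarm (or programming it fails), the abstraction flags that it has to fall back to software. Every name here (struct clk_backend, struct timer_mux, mux_add(), mux_expire(), the fake_hw helpers) is made up for illustration and is not an existing Zephyr API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock-source backend; set_alarm is NULL if HW has no alarm. */
struct clk_backend {
	uint64_t (*now)(void *ctx);
	int (*set_alarm)(void *ctx, uint64_t deadline);
	void *ctx;
};

/* Hypothetical timer, allocated by the caller and linked into the mux. */
struct mux_timer {
	uint64_t deadline;
	void (*cb)(struct mux_timer *timer); /* may be NULL */
	struct mux_timer *next;
};

struct timer_mux {
	const struct clk_backend *be;
	struct mux_timer *head; /* sorted by ascending deadline */
	int sw_fallback;        /* 1 if software has to drive expiry */
};

/* Program the earliest deadline, or note that software must take over. */
static void mux_rearm(struct timer_mux *mux)
{
	mux->sw_fallback = 0;
	if (mux->head == NULL) {
		return;
	}
	if (mux->be->set_alarm == NULL ||
	    mux->be->set_alarm(mux->be->ctx, mux->head->deadline) != 0) {
		mux->sw_fallback = 1;
	}
}

/* Insert a timer, keeping the list sorted by deadline. */
static void mux_add(struct timer_mux *mux, struct mux_timer *timer)
{
	struct mux_timer **pp = &mux->head;

	assert(timer != NULL);
	while (*pp != NULL && (*pp)->deadline <= timer->deadline) {
		pp = &(*pp)->next;
	}
	timer->next = *pp;
	*pp = timer;
	mux_rearm(mux);
}

/* Called from the alarm ISR or a software poll; returns expired count. */
static int mux_expire(struct timer_mux *mux)
{
	uint64_t now = mux->be->now(mux->be->ctx);
	int expired = 0;

	while (mux->head != NULL && mux->head->deadline <= now) {
		struct mux_timer *timer = mux->head;

		mux->head = timer->next;
		timer->next = NULL;
		if (timer->cb != NULL) {
			timer->cb(timer);
		}
		expired++;
	}
	mux_rearm(mux);
	return expired;
}

/* Simulated hardware with one alarm register, for illustration only. */
struct fake_hw {
	uint64_t now;
	uint64_t alarm;
};

static uint64_t fake_now(void *ctx)
{
	return ((struct fake_hw *)ctx)->now;
}

static int fake_set_alarm(void *ctx, uint64_t deadline)
{
	((struct fake_hw *)ctx)->alarm = deadline;
	return 0;
}
```

A real implementation would need locking around the list and a policy for what the software fallback actually does (e.g. a kernel timer); both are deliberately left out here.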

the RTC API is experimental after all

RTC is documented as experimental, but it lacks practical enforcement via the EXPERIMENTAL flag in Kconfig. We should add and backport that.

@fgrandel

I would suggest you focus this issue on what you actually want to achieve, instead of writing all your grievances in lists in comments. It's hard to follow and respond to concisely, leading to repetitions of answers and lack of focus.

Put simply, the RTC and Counter APIs are the lowest level APIs for the hardware (as already stated). For this reason, there is an inherent contradiction and possible lack of understanding when you state:

strip down rtc.h to a pure RTC HW driver and move the rest to higher-layer or hardware-agnostic abstractions

This is already the case. There is nothing to strip back or move here.

Higher level abstractions are just that, higher level, built upon the lower levels.

Please read the API documentation (API Status and Guidelines), specifically the API Lifetime section, then look at a few of the existing RTC device drivers (RTC Drivers) to gain an understanding of how the API is used and what the hardware capabilities are. Lastly, look into what the rtc-emul driver is actually used for (Hint).

This should address most of your points, and help focus this issue.

I would suggest you focus this issue on what you actually want to achieve

@bjarki-trackunit You are right, the rtc.h discussion belongs in separate issues (I opened some as you've seen). Unfortunately it landed here because of my initial misunderstanding about the role of rtc.h. With hindsight the error is obvious.

UPDATE: rtc.h-related discussion entries which turned out to be irrelevant wrt the clock subsystem architecture were hidden from this RFC. If you wish, you can hide yours as well; then the focus of this RFC will be completely restored.

Closed in favor of #76335 which summarizes results from this and prior discussions for easier reference and visibility. Feel free to re-open if you think that this RFC still has merit on its own.

[RFC] Abstract Clock Subsystem Architecture #19030