Add support to transport IEEE 802.15.4 PPDU + Padding between drivers instances and L2 - MAC

Is your enhancement proposal related to a problem? Please describe.

The Zephyr IEEE 802.15.4 stack (L2 - MAC) implementation don't transport all information defined by the IEEE 802.15.4 PSDU (PHY Service Data Unit) packet. As identified and discussed on #49311 the missing fields obligate majority (if not all) drivers to do special treatment on RX/TX operations. This intents to collect and discuss requirements to solve in a proper way this issue.

Describe the solution you'd like

Add two characteristics: 1 - add capability to transport bigger buffers between IEEE 802.15.4 drivers and L2 - MAC layer; 2 - add a valid IEEE 802.15.4 MAC frame size. The buffer must be capable to accommodate the IEEE 802.15.4 PPDU (PHY Protocol Data Unit) plus any additional octet required by the driver. This provide a way to tune the driver buffers allowing best performance. The driver should expose a mechanism to inform where the PHY PSDU octet starts and what is the aMaxPhyPacketSize value (PSDU max length). All this information should be defined by each driver instance. With that, any above layer should be capable to adjust itself inside PSDU size with best data positioning. These consults between driver and the above layer are required to enable multiple drivers instances which supports different PHY standards. A mechanism could be implemented to use constant values for optimal performance on L2 and/or driver only if when using one driver instance or the instances share the same PSDU length.

This means that a new API is necessary to expose, but not limited to:

phy_max_buf_size It is PPDU size plus padding. The default number of octets is 128 represented by SHR(0) + PHR(1) + PSDU(127) + padding(0).
phy_psdu_position 1 octet SHR(0) + PHR(1).
phy_max_psdu_size It is the PSDU size itself, 127 octets PSDU(127). The PSDU is 2047 for the following PHYs: SUN, TVWS, RCC, LECIM FSK, and MSK with a 2000 kb/s data rate. For LECIM DSSS PHY, this is not a constant; refer to phyLecimDsssPsduSize [which are 16, 24, 32]. 127 for all other PHYs.

The driver is responsible to pass correct value to upper layer once all variables may differ their values. This means that depending on the combination of PHY x Channel x Page the SHR, PHR, PSDU, FCS and padding may be different and the MAC should operate independent of those values. See the different SHR sizes based on the tables at additional content for more details.

Pros

Remove drivers workarounds on received frames since OpenThread and Zigbee expects MFR and Zephyr 802.15.4 not
Remove drivers workaround on transmission since majority expects PHR with value of MAC frame including MFR
Easy to manipulate SHR, PHR or any other driver frame requirement
Allow introduce padding to easy use DMA
Easy to integrate with a custom upper layers
Help to introduce IEEE 802.15.4-2015 PHYs like SUN

Cons

Introduce new API
May increase RAM usage depending on driver needs.

Additional context

The below IEEE 802.15.4 tables may help to understand and visualize the IEEE 802.15.4 generic frame and how data is positioned inside it.

IEEE 802.15.4-2020 PSDU format

IEEE 802.15.4-2020 PPDU format

IEEE 802.15.4-2006 SHR Sizes (Preamble + SFD)

Reference

IEEE 802.15.4-2020

I understand we need to straighten the situation with FCS field, as it's part of PSDU. But do I understand correctly that you also propose that SHR and PHR should be a part of the frame (net_pkt) exchanged between L2 (MAC) and driver? Shouldn't it be driver responsibility to fill out those fields (and trim them accordingly on the RX side), just as it'd done right now in the drivers? @fgrandel Do you know if PHY headers (SHR and PHR ) are expected to be also part of the frame received over raw 802.15.4 socket?

I wonder if it wouldn't be easier to allow the drivers that support different PHY modes to simply be configured into specific mode with the existing configure radio API (https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/net/ieee802154_radio.h#L338), and let the drivers deal with the PHY details.

CC @hubertmis in case you have any input.

If I understand correctly this proposal is about adding placeholders in the buffer transferred between L2 and a driver: before and after actual PSDU to accommodate SHR, PHR, MFR (actually being a part of PSDU). There is no doubt MHR and MSDU should be transferred between these layers, but the question is about other parts of the frame.

In a typical network stack we would use a structure like Zephyr's net_pkt. When a packet to send is created, the app layer allocates a buffer and fills it with payload. The app layer passes the buffer to a lower layer. The lower layer prepends this buffer with a header, and passes to a lower layer... The prepending might cause fragmentation in net_pkt structure so that the frame to be transmitted is scattered across a network buffer. I think similar approach should be applied for L2 -> ieee802154 driver communication: L2 passes a net_pkt consisting fragments of MHR+MSDU to be transmitted, and if the device requires other headers or footers, it is the driver's responsibility to prepend the frame with SHR, PHR, add MFR, etc. When all necessary parts of the frame are present in the memory, the radio's DMA should run scatter-gather across all fragments of the net_pkt while transmitting the frame.

However, many 802.15.4 radios do not support scatter-gather DMA, so the rule described above is not applicable.

That's why I think we have two options:

If given 802.15.4 radio does not support scatter-gather DMA, scatter-gather algorithm should be implemented in the driver code (performance and memory overhead) - it looks this is the approach we have in the Zephyr drivers now.
We need to ensure in the "L2 - driver" API that the fragment to be transmitted has enough space before and after MHR+MSDU to accommodate the radio's needs.

I understand this proposal is about the 2nd option.

It looks the proposed solution should increase performance with non-scatter-gather radios, because it would allow to introduce zero-copy model in the drivers for such radios. On the other hand it creates an overhead in all L2s using ieee802154 radio, what is unnecessary for radios supporting scatter-gather.

Taking all this into account I tend to agree with @rlubos to clearly define frame content between L2 and the driver (is MFR included or not?) and to leave to the particular driver dealing with SHR, PHR, MFR if it needs to do so.

Anyway, I think MTU getter (phy_max_psdu_size) would be valuable for L2s to run fragmentation algorithm correctly for given radio.

@fgrandel Do you know if PHY headers (SHR and PHR ) are expected to be also part of the frame received over raw 802.15.4 socket?

@rlubos : I'm taking Linux's IEEE 802.15.4 socket gluecode as an example (which we probably should to not cause confusion to people coming from Linux userspace).

PHY SHR/PHR

I can assert that IEEE 802.15.4 PHY headers are never sent to or from user space. Not even the kernel has access to PHY headers so much less the socket API. Most radio hardware will not even support access to these headers including all of the currently provided Linux low-level device drivers.

What must be supported once we enter 802.15.4-2015 land (or later) is configuration of the radio driver to support different PHY header formats (as well as many other aspects of the new PHYs).

MAC MFR

As far as MFR is concerned, it MUST NOT be added on transmission but it will ALWAYS be included on reception of RAW packets, see here for userspace examples: https://github.com/linux-wpan/wpan-tools/blob/master/examples/af_packet_tx.c https://github.com/linux-wpan/wpan-tools/blob/master/examples/af_packet_rx.c

This can also be confirmed from the current kernel source: https://elixir.bootlin.com/linux/latest/source/net/mac802154/tx.c#L57 - FCS added iff the radio hw does not add it itself https://elixir.bootlin.com/linux/latest/source/net/mac802154/rx.c#L262 - FCS added iff the radio hw omits it

Linux currently does not support 4-byte MFRs as required by some (but not all) IEEE 802.15.4-2015 PHYs, FCS length is hard coded to 2 bytes: see https://elixir.bootlin.com/linux/v5.19.6/source/include/linux/ieee802154.h#L21

Reservation of headroom for PHY and/or MAC headers

Here the Linux kernel seems to vote for @hubertmis ' option 2) wrt headroom reservation combined with generic scatter/gather support via the usual msghdr (option 1), which drivers will have to collect themselves if their hardware only supports linear memory access (which is often the case as @hubertmis mentioned):

hlen = LL_RESERVED_SPACE(dev);
...
skb_reserve(skb, hlen);

https://elixir.bootlin.com/linux/v5.19.6/source/net/ieee802154/socket.c#L275

As you can see, the MAC layer doesn't have to know much about what is actually needed as it is flexibly hidden behind a constant, which resolves to:

#define LL_RESERVED_SPACE(dev) ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
...
#define IEEE802154_ACK_PSDU_LEN 5
#define IEEE802154_FCS_LEN      2
#define IEEE802154_MAX_HEADER_LEN   (2 + 1 + 20 + 14)
#define IEEE802154_MIN_HEADER_LEN   (IEEE802154_ACK_PSDU_LEN - IEEE802154_FCS_LEN)
...
hw_driver->extra_tx_headroom = ... /* anything the hw driver requires - potentially including PHY header */
net_device->hard_header_len = IEEE802154_MIN_HEADER_LEN - 1;
net_device->needed_headroom = hw_driver->extra_tx_headroom + IEEE802154_MAX_HEADER_LEN;

The relevant sources are: https://elixir.bootlin.com/linux/v5.19.6/source/include/linux/ieee802154.h#L35 https://elixir.bootlin.com/linux/v5.19.6/source/include/linux/netdevice.h#L283 https://elixir.bootlin.com/linux/v5.19.6/source/net/mac802154/iface.c#L533 https://elixir.bootlin.com/linux/v5.19.6/source/net/mac802154/iface.c#L629 https://elixir.bootlin.com/linux/v5.19.6/source/net/mac802154/iface.c#L537

It even seems like the Linux kernel has a (benign) bug here as headroom = ... + IEEE802154_MIN_HEADER_LEN + IEEE802154_MAX_HEADER_LEN + ... doesn't really look like a sensible definition. ;-)

L2 "MTU" (aka HW MTU, see #4934)

While doing the research for this response I discovered that Linux uses the MTU constant differently from what I defined in my recently merged PR. :-( This is unfortunate and should probably be fixed before people start using the new constant - to keep Linux and Zephyr in sync wrt headers:

#define IEEE802154_MTU          127
#define IEEE802154_FCS_LEN      2

See https://elixir.bootlin.com/linux/v5.19.6/source/include/linux/ieee802154.h#L21.

If you agree on that point I'll provide a fix quickly, so that we don't cause confusion to people coming from Linux.

BTW: Calling something in L2 "MTU" is still a misnomer IMHO, but that's the name of the constant in Linux's L2 header as well as in Zephyr. Calling this HW MTU as opposed to L3 MTU makes the distinction sufficiently clear, I think.

In Linux HW MTU is exposed to upper layers as:

dev->mtu    = IEEE802154_MTU - IEEE802154_FCS_LEN - dev->hard_header_len;

The components of which have been mentioned and referenced above already.

See https://elixir.bootlin.com/linux/v5.19.6/source/net/mac802154/iface.c#L548

Application to Zephyr

While Linux is certainly not the holy grail, I vote for sticking as closely as possible to the Linux de-facto standard following the principle of least surprise while also not introducing any complexity that we do not need right now.

Following these principles, my vote is to:

Stick as much as possible to the socket API (RAW and DGRAM) that Linux exposes - except for Zephyr-specific differences that have been introduced earlier and cannot be undone w/o a breaking change in user land.
Not expose PHY header content anywhere as this breaks encapsulation and will not be supported by most of the radio hardware anyway.
Expose FCS content inside L2 (if supported by radio hardware) on transmission and reception and outside L2 on reception only - adding a synthetic FCS in L2 on packet reception for drivers that don't support passing the actual FCS to L2. This implies that Zephyr drivers will have to provide a standard API to inform Zephyr's L2 about their FCS handling on tx and rx, respectively.
I vote against supporting 4-byte FCS and PSDU > 127 bytes right now. IMO this requires a lot of unnecessary complexity and should only be introduced when actually needed by some Zephyr driver. This doesn't mean that we shouldn't keep such an extension in mind but only as much as is needed to keep the option available for the future.
We may introduce support for headroom reservation and scatter/gather (which is currently not supported by Zephyr's IEEE 802.15.4 L2) gradually along the lines of Linux if needed (and only then). I do not have a strong opinion on whether this makes sense now. IMHO it smells a little bit like premature optimization unless someone actually requires it - which I don't.
If (and only if) headroom reservation should be done then I strongly support Linux's approach to expose drivers' headroom requirements via generic radio hw API in order for L2 to remain completely unaware of any driver-specific implementation details.
It would be great if zephyr's driver API exposed some generic way to set and read PHY PIB attributes (e.g. to configure SHM and PHR formats) but also for many other variables and constants that become more and more important once we start entering 802.15.2-2015 territory (or later). Ad hoc approaches away from the spec will quickly get out of hand IMHO as there is lots of configuration coming up on PHY level. This includes all the data required by @nandojve 's proposal. This is in line with @rlubos 's proposal to write PHY PIB attributes via the existing configure() plus adding an attr_get() method to the radio API, see the detailed proposal in https://github.com/zephyrproject-rtos/zephyr/issues/50336#issuecomment-1251122582.

Just stumbled upon the discussion in #4934 which seems to be relevant to the usage of the term "MTU" in different contexts in Zephyr. Just to make sure we take into account what's already been discussed/decided.

After reading your comments and looking inside stack I think it is possible achieve the goals without penalize current implementations. Probably I expressed myself not well on few points but there is no intention to have any PHY treatment inside L2. The goal is to have space reservation,

I prefer not look on the direction of there is no one using right now. Instead, I'm looking to do preparations to enable anyone to implement missing features by fixing and improving the IEEE 802.15.4 stack itself. Since this are fixing and improvements it should not penalize current implementations.

I don't think it is possible to have all the improvements only with one PR. It will be necessary a few PRs where we can talk about specific topics like MTU.

I decided to close this since I don't have all Linux Network expertise and I don't have the necessary time to study it right now.

@nandojve Could you please re-open and assign to me? Not all but most of this will be implemented within #50336 which I'll take care of.

@nandojve I now started adding FCS on RX everywhere and removing the corresponding workarounds in the drivers. It seems, though, that some drivers did never properly implement adding FCS - not even when serving OpenThread/Raw. Another problem seems to be that some drivers do not produce FCS on ACK packages. I marked some places I found with TODO tags. This change will be extracted as one of the PRs that precede TSCH introduction. It will therefore take some time until you can review the PR. You can have a preview, though, by looking at my WIP branch, see the driver changes there.

Why not having a runtime configuration boolean, set to false by default, that let whatever L2 to request FCS inclusion in rx pkt or not (add a new element to enum ieee802154_config_type)? That will remove the current defines for OT & raw, as well as keeping the FCS away from the native L2.

Why not having a runtime configuration boolean, set to false by default, that let whatever L2 to request FCS inclusion in rx pkt or not (add a new element to enum ieee802154_config_type)? That will remove the current defines for OT & raw, as well as keeping the FCS away from the native L2.

Yes, another good solution (which I'd agree to also implement iff it becomes the preferred solution).

But IMO even better: The driver always allocates space for the FCS in the buffer but communicates via HW capability whether it delivers an actual FCS on RX or not.

Rationale:

No matter what, space for the FCS must be reserved by all drivers when allocating packets. Space is allocated statically for the forseeable future so this is the case for @tbursztyka 's approach as well. No regression or difference between the proposals here.

I consider the HW capability a boolean in the opposite direction but it is static rather than dynamic. The FCS should be included in RX frames whenever drivers have the FCS in the buffer anyway (otherwise they can just include free space for it in the buffer). In these cases the FCS is there for free (wrt driver complexity, memory and CPU). As the requirement to include FCS can change from packet to packet (e.g. posix raw packets vs. IP packets) dynamic configuration is not even possible. If (and only if) L2 needs an FCS it must synthetically generate one if its not included by the driver - this is just proper separation of concerns.

IMO we do not have a requirement to choose at runtime whether to get FCS from the HW or not. This is mostly fixed by the chipset anyway. If the (improbable) case comes up in the future that some HW driver wants to allow FCS configuration itself then I propose a driver-specific KConfig or devicetree option that determines the driver's behavior. If it is devicetree, this can be part of a cross-driver base template - which makes sense anyway (see @cfriedt 's comment on Discord).

This solution has the same advantages as the one proposed by @tbursztyka :

We do not waste CPU cycles unless it is really needed for a specific use case.
The radio API clearly communicates that different treatments of FCS exist and drivers are expected to document their behavior.

But it also has additional advantages:

It caters for per-packet requirements.
It maintains better L2 vs. driver encapsulation (separation of concerns).
It reduces complexity/redundancy/inconsistencies in drivers even more (which is why we started discussing this topic in the first place). The current state of the drivers proves that this is necessary - several of them are broken right now wrt either the native stack or OT due to FCS inconsistencies.
Principle of least surprise: This is the solution implemented by Linux - mostly for the same reasons as outlined here.

@nandojve @rlubos - heads up to you as well, guess you have an opinion on this?

The driver communicates via HW capability whether it delivers an FCS on RX or not.

I think I like this approach better, less fuss in the drivers wrt satisfying L2 needs.

IMHO FCS is always part of L2 because it is the footer of MAC itself, see IEEE 802.15.4-2020 PSDU format picture. Keep it away is the motive to have a lot of of misunderstanding and workarounds which motivate the discussion.

Not having MFR available at L2 creates implementations problems, as we already know. It may create non aligned buffers and it limits implementation when someone wants to extend current standard with a new proposal for instance (for sure this is a corner case but it is a valid one and a good implementation should not limit applications). It limits how stack will handle newer PHYs with different MFR sizes.

Keep in mind that all devices are developed with standard PSDU in mind. The standard was designed to L2 receive MFR info at RX and to generate MFR on TX always. My conclusion is that it is far more ease have those bytes there them try to optimize L2 to not have it (when the implementation is a generic solution like Zephyr IEEE 802.15.4 stack).

It is better have MFR always considered in the interface between driver and L2 because follow the standard IEEE 802.15.4 interface. Having few bytes available does not means that L2 should process that information all the time. It means that stack is comply with the standard interface and avoid problems. Example: when a HW is well designed L2 can be optimized to not process/generate MFR but the size of MFR still considered in the frame. This happens because it is a standard interface. As far I remember, all devices expecting a frame size which includes MFR size on it.

That said, if MFR info is always part of the interface: 1) L2 can decide on RX to process/skip MFR or even generate it because space is already there. This can be defined by Kconfig options which will be a selection of #define methods in the stack. This means very clean implementation. It allows different drivers be used at same time with/without MFR. 2) L2 on TX can skip or generate MFR calculation since there are space. Size will be always passed correct to any driver. Again, a Kconfig option can generate MFR if necessary and it will be a skip on code with a #define method! 3) L2 uses MTU to know how much space is available in the TX buffers. 4) Driver is free to use any headroom/tailroom space to optimize package handler with/without DMA.

In addition:

N1) headroom is a devicetree property. N2) buffer size used between driver and L2 is defined at devicetree. N3) L2 should use a headroom to determine where is the start byte of MHR. N4) L2 buffers MUST NEVER use MTU as part of buffer allocation. This not works on the long term. The correct solution should be determined by [N2]. L2 uses devicetree macros to scan all IEEE 802.15.4 drivers and select worst case value. This happens because stack should allow more the one driver at same time with different PHY needs. Any attempt to have a different solution will create overheated or need to be re-write. N5) tailroom is a value that only concerns to driver. It can be calculated as buffers size - aMaxPhyPacketSize - headroom

I'll not oppose to have different solutions but I think this is the easy to follow code and best approach in the long term.

RX:

@nandojve You have a point here - and I added that as a clarification to my proposal: No matter how the HW driver deals with FCS, it needs to pre-allocate space for the FCS in the RX buffer. Even if we follow @tbursztyka 's approach this remains true as drivers will allocate buffer space statically for the foreseeable future.

All other things remain equal, though. If I understand you correctly, you do not question that we do NOT want to waste real resources (CPU, battery, latency, etc.) to calculate a CRC for purely academic reasons.

So I believe your argument is in line with both, my proposal and that of @tbursztyka .

TX:

The new input only concerned the RX side. So not sure why you're bringing all that TX stuff up again unless you changed your mind and want to implement it yourself now? To me the latest proposal looks like a step back as it contradicts some of the good arguments and ideas that were brought forward by others before. I really don't want to invest more time in a discussion unless someone commits to implementing that stuff.

@fgrandel ,

My argument is valid for both directions. You are reworking FCS and not considering MFR for TX side. In the case of RX side it is clear now that MFR should be present all the time by my understanding.

I'm not bring old discussions again. I'm trying to explain my point that MFR must be present in both directions.

Maybe it is difficult to understand my point because sometimes content have been used to provide meaning. Remember that FCS content is inside MFR (MAC Foot Header) and MFR is part of MAC.

No matter how the HW driver deals with FCS, it needs to pre-allocate space for the FCS in the RX buffer.

I would write this way: No matter how the HW driver or L2 deals with FCS, it must allocate space for the MFR in the RX buffer and the same way L2 must allocate space for MFR on TX buffers. All devices that implement IEEE 802.15.4 follows the standard interface and that solves a lot of headache. If us continue to ignore this fact we will need to deal with this problem in future again because it imply how PSDU size is calculated.

The new input only concerned the RX side. So not sure why you're bringing all that TX stuff up again unless you changed your mind and want to implement it yourself now? To me the latest proposal looks like a step back as it contradicts some of the good arguments and ideas that were brought forward by others before. I really don't want to invest more time in a discussion unless someone commits to implementing that stuff.

I think in this regards you are mixing topics. My point is very clear about the technical debit, which is MFR for both directions. This will imply that L2 buffer allocation should be correctly fixed when introducing head/tailroom and it seems you started to touch it.

For instance the below fragment, if I understood correctly, it is not proper. It shows few misunderstandings:

        if (headroom + buf->len + tailroom + IEEE802154_FCS_LENGTH > IEEE802154_MTU) {
            NET_ERR("Frame too long: %d", buf->len);
            return -EINVAL;
        }

You change the meaning of ll_hdr to headroom. ll_hdr is about MHR and doesn't have nothing to do with headroom on my view. The same will be with tailroom which should have a different meaning.

1) headroom should be available before PSDU region. With that, driver will have space to fill any necessary data, like SHR and PHR, no matter what PHY is in use. It is up to driver defined what they want to do with the headroom. L2 just make sure it will start to write in the correct position of MHR, skipping headroom bytes in the TX buffer. The skipping makes counter already have correct size for the current PSDU. In the current snip a valid IEEE frame with aMaxPhyPacketSize size became invalid or it will not be produced, which is a wrong concept on my view. 2) tailroom should be available after PSDU region. In the same way driver can fill tailroom spaces and not touch in the L2 stack content anyhow. Again, the above check not allow to produce a valid aMaxPhyPacketSize frame if tailroom is required. 3) If checks looks headroom + IEEE802154_MTU + tailroom it will be clear that L2 buffer allocation can not be a function of MTU and FCS when PSDU is equal to aMaxPhyPacketSize. Why? Because when you use a headroom/tailroom you are virtually increasing the HW MTU somehow. I don't want change IEEE802154_MTU meaning here, for me it will be always aMaxPhyPacketSize.

The above just show why in IMHO it is important have an agreement in regards of MFR transport between driver and L2 in both directions and make clear to everyone that L2 buffer allocation should be defined in the devicetree to solve the debit. The way driver and L2 will deal with it is out of scope. Once L2 buffer allocation is in devicetree it will be easy to implement headroom/tailroom.

Changes related to this enhancement request were now decoupled from TSCH (#50336) development as - after further discussion - it turns out that considerable additional architectural changes will be required to support differentiated FCS handling:

The socket API currently does not provide a hook on the RX path for RAW packets to be handled by L2 before delivering them to the application. This is however required to add a synthetic FCS in case the HW driver does not provide one. It is also required to parse LL address for POSIX compliance (see packet(7): "When receiving a packet, the address is still parsed and passed in a standard sockaddr_ll address structure.").
It is unclear how the HW drivers can be unified (which is one of the very reasons this enhancement was proposed) unless OpenThread L2 also is able to deal with drivers that do not provide FCS. As has been pointed out above, the converse, i.e. providing an FCS in all cases is not an option as in almost all use cases it unnecessarily burns a non-negligible amount of CPU cycles.

This enhancement will be developed in isolation in https://github.com/zephyrproject-rtos/zephyr/compare/main...fgrandel:zephyr:feat/unified-fcs-handling from now on.

zephyrproject-rtos / zephyr