Open PetervdPerk-NXP opened 2 years ago
Nice diagram!
PR #6834 is trying to address this; however, consider a system with mixed interfaces that all use the NET stack and IOBs. For example:

| Interface | MTU |
|-----------|------|
| Ethernet | 1518 |
| WIFI | 576 |
| CAN2.0B | 13 |

When adding the DMA descriptor to the IOB as proposed in #6834, each IOB in the case of IMXRT would grow by 29 bytes.
Until we can adjust the IOB buffer size, letting all netdevs share the biggest header isn't a major concern. Also, the header space consumed is small compared to the normal MTU size. Is it worth complicating the design for this (the benefit is ~5% in most cases)?
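To put numbers on that ratio, here is a small sketch that computes the relative cost of a fixed per-IOB overhead for the MTUs listed above. The 29-byte figure is the IMXRT number quoted above; the helper name `iob_overhead_permille` is made up for illustration and is not part of NuttX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: per-buffer overhead relative to the payload size,
 * expressed in tenths of a percent (per mille), rounded down.
 */

static uint32_t iob_overhead_permille(uint32_t overhead, uint32_t mtu)
{
  assert(mtu > 0);
  return (overhead * 1000u) / mtu;
}
```

With 29 bytes of overhead this gives roughly 1.9% for the 1518-byte Ethernet MTU and about 5% for the 576-byte WIFI MTU, but for the 13-byte CAN2.0B MTU the overhead would be more than twice the payload.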
I can think of some solutions but I'm not sure what would be best:
- Make the IOB buffer peripheral specific, hence for the example above we will get 3 separate IOB buffer pools
Yes, that's one solution, but what about layer 2 forwarding (e.g. RNDIS<->Ethernet/WiFi/Modem), which is a feature we plan to add? To achieve zero copy through the whole path, we need to reserve the biggest header size and share the same allocator.
- Create a 2nd DMA-aware IOB `iob_dma` that has metadata of the DMA descriptor and peripheral, where the network stack can invoke callbacks to the peripheral to parse the received metadata (also nice to check for HW offloading and then fall back to the SW method)
The special handling can be done before the netdev passes the IOB to the TCP/IP stack in the receive direction. A similar thing can be done in the transmit direction too, so why do we need a callback here?
- Create IOB buffer pool that supports variable data length, a good example could be https://github.com/pavel-kirienko/o1heap
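Could be a solution, but we need to consider that:
- Since it is impossible to forecast the incoming packet size, we have to allocate the full packet on the receiving side
- We need to consider the fragmentation problem and the housekeeping overhead
- How to accumulate the transmit data to form the full packet without reallocating/copying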
Another solution is to make the IOB buffer smaller than the MTU (e.g. 256 instead of 1518) and utilize the linked-list DMA in hardware.
Another thing I want to address is support for the HW offloading bits. There are Ethernet MACs that have IP offloading features, such as checksum calculation done by the MAC, which simply indicate the CRC result using a bit in the DMA descriptor, or indicate that an automatic ARP response has been sent.
HW checksum is already supported, you can simply enable NET_ARCH_CHKSUM: https://github.com/apache/incubator-nuttx/blob/master/net/utils/Kconfig#L6-L26
The special handling can be done before the netdev passes the IOB to the TCP/IP stack in the receive direction. A similar thing can be done in the transmit direction too, so why do we need a callback here?
To be able to decode the eth_desc_s, which contains extra information such as packet type, timestamp, CRC, MAC filter, VLAN, and error status. Only the MAC driver knows what this means. https://github.com/apache/incubator-nuttx/blob/5d12e350da31324e632ee7da3a3062b05212b74c/arch/arm/src/s32k3xx/hardware/s32k3xx_emac.h#L3076-L3082
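To make the callback idea concrete, here is a minimal sketch of what a DMA-aware IOB with a driver hook might look like. Everything in it is hypothetical: `iob_dma_s`, `iob_dma_ops_s`, `demo_desc_s` and the `demo_*` functions are illustrative stand-ins, not existing NuttX API, and the `rdes1` field is only an assumption about where the status word lives. The one thing taken from this thread is the `EMAC_RDES1_IPCE_MASK` value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EMAC_RDES1_IPCE_MASK (0x00000080u) /* Value quoted in this thread */

struct iob_dma_s;

/* Hypothetical per-driver hooks used to decode descriptor metadata */

struct iob_dma_ops_s
{
  /* Returns true if the MAC already verified the checksum; *ok holds
   * the result.  Returns false if the hardware cannot tell.
   */

  bool (*rx_csum_checked)(const struct iob_dma_s *iob, bool *ok);
};

struct iob_dma_s
{
  const struct iob_dma_ops_s *ops; /* Owning driver's hooks, may be NULL */
  const void *desc;                /* Opaque MAC-specific DMA descriptor */
  const uint8_t *data;             /* Packet data */
  size_t len;                      /* Packet length in bytes */
};

/* Stack side: ask the driver first, fall back to software otherwise */

static bool iob_dma_csum_ok(const struct iob_dma_s *iob,
                            bool (*sw_check)(const uint8_t *data,
                                             size_t len))
{
  bool ok;

  assert(iob != NULL && sw_check != NULL);

  if (iob->ops != NULL && iob->ops->rx_csum_checked != NULL &&
      iob->ops->rx_csum_checked(iob, &ok))
    {
      return ok;
    }

  return sw_check(iob->data, iob->len);
}

/* Driver side: stand-in for eth_desc_s, only the status word is modeled */

struct demo_desc_s
{
  uint32_t rdes1;
};

static bool demo_rx_csum_checked(const struct iob_dma_s *iob, bool *ok)
{
  const struct demo_desc_s *desc = iob->desc;

  *ok = (desc->rdes1 & EMAC_RDES1_IPCE_MASK) == 0;
  return true;
}

static const struct iob_dma_ops_s g_demo_ops =
{
  demo_rx_csum_checked
};

/* Placeholder for the software checksum path; always accepts */

static bool demo_sw_check(const uint8_t *data, size_t len)
{
  (void)data;
  (void)len;
  return true;
}
```

The point of the sketch is only that the stack never interprets `desc` itself. It either gets an answer from the driver that owns the descriptor, or it falls back to the software path when no hook is registered.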
HW checksum is already supported, you can simply enable NET_ARCH_CHKSUM: https://github.com/apache/incubator-nuttx/blob/master/net/utils/Kconfig#L6-L26
That API passes the data pointer to a hardware CRC accelerator. In the case of the S32K3XX this doesn't work, because the MAC itself has a built-in CRC checker that reports the result in the eth_desc_s IPCE mask when the packet has been received.
#define EMAC_RDES1_IPCE_MASK (0x00000080u) /* IP Payload Error bit
* IP payload checksum (that is, the TCP, UDP, or ICMP checksum)
* calculated by the MAC does not match the corresponding checksum
* field in the received segment. */
Could be a solution, but we need to consider that:
- Since it is impossible to forecast the incoming packet size, we have to allocate the full packet on the receiving side
- We need to consider the fragmentation problem and the housekeeping overhead
- How to accumulate the transmit data to form the full packet without reallocating/copying
There really isn't a one-solution-fits-all, but I want to make sure we've got something that can easily be adapted to support all MAC controllers with DMA in NuttX.
So it would help if some authors of Ethernet MAC drivers could share their thoughts on whether there's a compatible solution that can be used in their driver.
@davids5 @gregory-nutt @acassis
The special handling can be done before the netdev passes the IOB to the TCP/IP stack in the receive direction. A similar thing can be done in the transmit direction too, so why do we need a callback here?
To be able to decode the eth_desc_s, which contains extra information such as packet type, timestamp, CRC, MAC filter, VLAN, and error status. Only the MAC driver knows what this means.
Yes, I understand that the MAC layer needs an additional descriptor before the IP packet. But this can be handled directly in either the IRQ handler or the work callback before passing the data to the TCP/IP stack. What I can't understand is why we need a callback from the TCP/IP stack to the netdev.
HW checksum is already supported, you can simply enable NET_ARCH_CHKSUM: https://github.com/apache/incubator-nuttx/blob/master/net/utils/Kconfig#L6-L26
That API passes the data pointer to a hardware CRC accelerator. In the case of the S32K3XX this doesn't work, because the MAC itself has a built-in CRC checker that reports the result in the eth_desc_s IPCE mask when the packet has been received.
#define EMAC_RDES1_IPCE_MASK (0x00000080u) /* IP Payload Error bit
* IP payload checksum (that is, the TCP, UDP, or ICMP checksum)
* calculated by the MAC does not match the corresponding checksum
* field in the received segment. */
So, the hardware can verify the checksum on receive, but can't generate the checksum on send? In this case, we may need to add a new option to disable the checksum for one direction, like CONFIG_NET_UDP_CHECKSUMS.
Could be a solution, but we need to consider that:
- Since it is impossible to forecast the incoming packet size, we have to allocate the full packet on the receiving side
- We need to consider the fragmentation problem and the housekeeping overhead
- How to accumulate the transmit data to form the full packet without reallocating/copying
- The MTU of the specific interface determines the allocation size, i.e. as shown in the table above: Ethernet 1518, WIFI 576, CAN2.0B 13.
- Something like O1Heap solves the fragmentation, admittedly at the expense of overhead, but that's the trade-off.
- See 1: allocate the MTU of the interface, maybe add 4 bytes for a pointer to link them.
There really isn't a one-solution-fits-all, but I want to make sure we've got something that can easily be adapted to support all MAC controllers with DMA in NuttX.
Sure. Another possible solution is to reuse the IOB chain, so we can define a small IOB buffer size (the smallest MTU on the device) and link multiple IOBs for the bigger MTU.
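As a rough sizing aid for the chained approach, the sketch below computes how many fixed-size buffers (and therefore how many chain links or scatter-gather entries) one frame would need. The helper name `iob_chain_count` is hypothetical and only illustrates the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of fixed-size buffers needed to hold one
 * frame of framelen bytes when each buffer holds bufsize bytes.
 */

static size_t iob_chain_count(size_t framelen, size_t bufsize)
{
  assert(bufsize > 0);
  return (framelen + bufsize - 1) / bufsize; /* Ceiling division */
}
```

With 256-byte buffers a 1518-byte Ethernet frame needs 6 linked buffers. If the buffer size were the smallest MTU from the example (13 bytes), the same frame would need 117, so the buffer size would likely have to be chosen with the largest MTU in mind as well.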
So, the hardware can verify the checksum on receive, but can't generate the checksum on send? In this case, we may need to add a new option to disable the checksum for one direction, like CONFIG_NET_UDP_CHECKSUMS.
TX checksum is done in the MAC as well. A solution for that is to enable NET_ARCH_CHKSUM for TX only and provide a dummy function, since the MAC itself will fill the dummy bytes with the checksum.
But this can be handled directly in either the IRQ handler or the work callback before passing the data to the TCP/IP stack. What I can't understand is why we need a callback from the TCP/IP stack to the netdev.
Most of it can be done in the IRQ handler, but the information about whether the TCP/UDP checksum was correct gets lost then, hence the need for a callback. I guess we can also just drop the packet when we see this, but then TCP/UDP wouldn't know that a corrupted packet has been received.
This can be corrected by incrementing the count in g_netstats directly in the IRQ/work handler.
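A minimal sketch of that driver-side approach, assuming the driver can read the RDES1 word of a completed RX descriptor. `demo_stats_s` is a stand-in for the relevant statistics counter (the real g_netstats layout is not reproduced here), and `demo_rx_accept` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EMAC_RDES1_IPCE_MASK (0x00000080u) /* IP Payload Error bit */

/* Stand-in for the checksum error counter in the network statistics */

struct demo_stats_s
{
  uint32_t chkerr;
};

/* Called from the driver's IRQ/work handler with the RDES1 word of a
 * completed RX descriptor.  Returns true if the frame should be passed
 * on to the stack, false if it should be dropped.
 */

static bool demo_rx_accept(uint32_t rdes1, struct demo_stats_s *stats)
{
  assert(stats != NULL);

  if ((rdes1 & EMAC_RDES1_IPCE_MASK) != 0)
    {
      stats->chkerr++; /* Record the error before dropping the frame */
      return false;
    }

  return true;
}
```

This keeps the descriptor knowledge inside the driver: the stack only sees frames that passed the hardware check, and the corrupted ones still show up in the statistics.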
@PetervdPerk-NXP - This is really nice to see. The Diagrams rock!
We should definitely be using scatter-gather DMA. The approach of using many little IOBs will put a high load on the number of statically allocated DMA descriptors (TCD) needed (32 bytes each, and some non-net devices need 4-6 per transaction already). There may be a size for the built-in data IOB that is a good balance. An alternative could be to reference the data and carry a `size`, a `used`, and maybe a `next` reference in the IOB, and not include the data allocation in the struct. These can be allocated at initialization time and be static at run time. Then pools can be formed and used by the devices according to their MTU requirement. It could fail over to allocating a bigger MTU buffer when the smaller pool is starved, and use `size` and `used` to return it to the correct pool.
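A minimal sketch of that idea, assuming pools sorted by ascending buffer size. All names (`iob_ref_s`, `iob_pool_s`, `iob_ref_alloc`, `iob_ref_free`) are hypothetical and not existing NuttX API; locking and the DMA descriptor side are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IOB that only references its data */

struct iob_ref_s
{
  struct iob_ref_s *next; /* Next buffer in a chain or on the free list */
  uint8_t *data;          /* Storage allocated at initialization time */
  uint16_t size;          /* Capacity; also identifies the home pool */
  uint16_t used;          /* Bytes currently in use */
};

struct iob_pool_s
{
  uint16_t size;          /* Capacity of every buffer in this pool */
  struct iob_ref_s *free; /* Free list */
};

/* Allocate from the smallest pool that fits; if that pool is starved,
 * fail over to the next bigger one.  pools[] must be sorted by size.
 */

static struct iob_ref_s *iob_ref_alloc(struct iob_pool_s *pools,
                                       size_t npools, uint16_t need)
{
  size_t i;

  assert(pools != NULL);

  for (i = 0; i < npools; i++)
    {
      if (pools[i].size >= need && pools[i].free != NULL)
        {
          struct iob_ref_s *iob = pools[i].free;

          pools[i].free = iob->next;
          iob->next = NULL;
          iob->used = 0;
          return iob;
        }
    }

  return NULL;
}

/* Return a buffer to the pool whose size matches the buffer's size */

static void iob_ref_free(struct iob_pool_s *pools, size_t npools,
                         struct iob_ref_s *iob)
{
  size_t i;

  assert(pools != NULL && iob != NULL);

  for (i = 0; i < npools; i++)
    {
      if (pools[i].size == iob->size)
        {
          iob->next = pools[i].free;
          pools[i].free = iob;
          return;
        }
    }

  assert(0 && "buffer does not belong to any pool");
}
```

Because each buffer carries its own `size`, a CAN-sized request that had to borrow an Ethernet-sized buffer still ends up back in the Ethernet pool when it is freed.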
Currently almost all Ethernet MACs with DMA work roughly in the following manner
As a result, the same packet data will be in
Ideally we want the MAC DMA Engine to directly copy the packet to the having the following scheme.
The biggest problem, though, is that each MAC has its own representation of the DMA descriptor + buffer:
- IMXRT: https://github.com/apache/incubator-nuttx/blob/5d12e350da31324e632ee7da3a3062b05212b74c/arch/arm/src/imxrt/hardware/imxrt_enet.h#L650-L662
- STM32H7: https://github.com/apache/incubator-nuttx/blob/5d12e350da31324e632ee7da3a3062b05212b74c/arch/arm/src/stm32h7/hardware/stm32_ethernet.h#L662-L670