carlocaione opened 1 year ago
Tagging people involved: @FRASTM @erwango @teburd @lmajewski @NickolasLapp @andyross @nashif @hubertmis @carlescufi @npitre @stephanosio
Thanks for raising this. It is an important missing feature in Zephyr.
I have a comment regarding expanding current DMA API with discussed features:
It looks like the current DMA driver API describes standalone DMA engines capable of copying data around the bus system, based on instructions provided in job descriptors. A standalone DMA like this is just one type of DMA we can find in MCUs. Another example is a DMA built into a peripheral like a UART, I2C controller, or even a PWM. Any DMA (standalone, embedded in a peripheral, or any other) needs proper memory management if the memory system is complex.
That's why I think it is better to keep the current DMA API as it is, responsible for handling standalone DMAs. The proposed subsystem should be separated from the drivers, but there must be a clear dependency path. The proposed subsystem should be usable by any driver of any device with DMA capability (UART, I2C, PWM, standalone DMA engine, or any other). Implementations of device drivers of any type should be able to depend on the new subsystem.
To sum up, I think we need separate modules.
The DMA implementation can depend on DMM, not the other way around. Other device drivers can also depend on DMM.
I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts? DMA isn't the only bus master in some parts that might have requirements like this. Multicore parts with asymmetric cores might also have similar shared-buffer requirements, e.g. NXP's imxrt685.
Do you have a particular DMA that you are running into trouble with otherwise with the existing API?
I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts?
Yes, the DMA case was just the most common one but this is indeed a buffer management API of sorts. We can change the name if that is confusing.
Do you have a particular DMA that you are running into trouble with otherwise with the existing API?
This is secret ;) Secret aside as @hubertmis beautifully explained in https://github.com/zephyrproject-rtos/zephyr/issues/57220#issuecomment-1521306639 this concerns not only DMA engines but also generic DMA-enabled peripherals.
I don't disagree with any of that, but I'd point out that this isn't solely a DMA concern though the dmm naming implies it.
The problem is shared memory among bus masters and meeting the requirements needed for doing so. So arc_buf, shm_buf, mmbuf, whatever you want to call it: it's a region shared among bus masters. CPU cores, DMAs, or, as you say yourself, peripherals that can also act as bus masters reading and writing directly to memory.
I wonder if this could be better changed to an aligned/regioned/cached buffer API of sorts
I think an aligned/regioned/cached buffer API would be a complementary solution, but it would not solve all the issues pointed out by @carlocaione. I think memory management for DMA-capable device drivers is something we should define first; later we could allow optimizations by providing a buffer allocator with alignment/padding/cache-awareness features.
Multicore parts with asymmetric cores might also have similar shared buffer requirements
That's a good point. I think the DMM module proposed by @carlocaione can be extended to be usable not only by DMA-capable device drivers, but also by IPC subsystems. Or a common part can be extracted to be used by both.
I think these two assumptions are not correct:
Anything new about this proposal? I would like to avoid the memcpy done by the SPI driver on every transfer, and statically reserving a DMA-capable buffer wastes space, since the bounce buffer in the SPI driver goes unused. (And there is no API to get the bounce buffer from the SPI driver so that we can write directly into it.)
This is an extension of the issue already introduced in #36471
An implementation of the RFC can be found in #57602
The problem
In memory bus architectures, the CPU running software is one of the masters accessing the bus, while DMA engines are other masters accessing the same bus (or another, connected one).
In basic MCUs there is a simple bus architecture, where CPU and DMA engines can access the whole memory range, no caches are present, and memory addressing is consistent between all the bus masters.
Memory access in these devices is straightforward: any buffer is reachable by every bus master at the same address, and no cache maintenance is needed.
In more complex systems with a more complex memory architecture, the interaction between CPU and DMA, or between multiple CPUs, is complicated by caches that must be maintained, addresses that may differ between bus masters, and memory regions that only some masters can reach.
All of the discussed challenges must be addressed by the software running in a system with a complex memory architecture.
What Zephyr is doing about this
Zephyr has no solution yet for this kind of complex platform; there are only scattered attempts to overcome these limitations.
Proposal
I'm proposing to add a new sub-system called DMM (DMA Memory Management), responsible for managing memory on behalf of DMA-capable devices.
Why this RFC?
Because before starting to write the code I want to gather opinions and discuss whether introducing a new subsystem makes sense versus expanding the current APIs (DMA or MM).