adafruit / Adafruit_ZeroDMA

DMA helper/wrapped for ATSAMD21 such as Arduino Zero & Feather M0
MIT License
59 stars 24 forks source link

Mixing SPI DMA and normal calls not working, no callback function for SPI DMA complete? #24

Open RudolphRiedel opened 3 years ago

RudolphRiedel commented 3 years ago

I am trying to make use of the DMA feature for SPI with a Metro M4, are there any current examples for this?

SPI.transfer(&buffer, 0, 1234, false);

Or to be more precise, what I am actually using is: SPI.transfer( ((uint8_t ) &EVE_dma_buffer[0])+1, 0, (((EVE_dma_buffer_index)4)-1), false );

This works, but only once.

The following normal SPI.transfer() calls for single bytes seem to be working with DMA as well. This is what it should look like, this is from before using DMA transfer: grafik

And this is how it actually looks: grafik

With this my program hangs, there is nothing else on the SPI and the LED is not blinking.

And without callback function it overall looks like this now: grafik grafik

From that I gather from https://github.com/adafruit/ArduinoCore-samd/blob/master/libraries/SPI/SPI.cpp the only option right now would be to call SPI.waitForTransfer()? This makes DMA kind of pointless.

Without DMA it works fine for a while now: grafik

But I would like to avoid blocking the CPU for this 291µs period, even more so as this is only from a very short and basic example of my FT81x/BT81x code library: https://github.com/RudolphRiedel/FT800-FT813

ladyada commented 3 years ago

try these examples https://github.com/adafruit/Adafruit_ZeroDMA/tree/master/examples

RudolphRiedel commented 3 years ago

While the ZeroDMA examples were the first I found I am trying to use this one: https://github.com/adafruit/ArduinoCore-samd/blob/master/libraries/SPI/SPI.cpp

void SPIClass::transfer(const void txbuf, void rxbuf, size_t count, bool block) {

So this really is not a ZeroDMA issue but an issue with the new SAMD SPI class.

ladyada commented 3 years ago

yah we never used it, if you have ideas for how to fix please submit a PR :)

rostamiz commented 3 years ago

@RudolphRiedel I think I also ran into the same issue. Like you said calling SPI.waitForTransfer() before making the next transfer addresses the issue, but makes DMA pointless.

Not sure if this is exactly relevant to your use case, but as a workaround, instead of calling SPI.waitForTransfer() right after starting the transfer, I start a timer (in your case it would be for about 291us) and continue on with with other program logic and then call SPI.waitForTransfer() once the 291us has elapsed, which means SPI.waitForTransfer() should more-or-less return immediately (assuming your transfers all take about 291us).

I think a cleaner solution, if feasible, would be to add a method SPI.isTransferComplete() that returns a status immediately (i.e., whether the transfer is complete), so the client code doesn't need to maintain any timing logic and is also robust to varying transfer times.

RudolphRiedel commented 3 years ago

@ladyada Well okay, this is not what I was hoping for as I am only supporting the use of Arduino but not using it myself. Fair enough, I'll see to have a look. Someone must have an idea how this is supposed to be used though. And please move it back, thank you.

@rostamiz Actually no, the first issue can not be solved with waiting, when I issue the next SPI.transfer(0xnn) calls this is 5ms later. A simple SPI.transfer(0x00) should be using DMA but as there is no delay between the five bytes and as my code is putting CS back to high way before the transfer is done it very much looks like it still is using DMA.

And there is no way to tell how much time the transfer will actually take, like I wrote, this is only my basic demo, actual projects need to send a lot more data, up to 4k. Polling the status would be an option, not a good one but still better than to wait for completion, only right now there is no function for this.

ladyada commented 3 years ago

the SPI library just isn't designed for DMA... there's a benefit to having DMA used even if it blocks, because the inter-byte delays are removed. please use the zerodma library :)

RudolphRiedel commented 3 years ago

Okay, I'll check out ZeroDMA. But what is the point of DMA support in the SPI library and even a release-note for it that claims it has been fixed when it is not really working and not been used? Maybe this should be removed or at least marked as unfinished to avoid further confusion?

ladyada commented 3 years ago

it doesnt hardfault, we dont guarantee it solves all your life problems :D

RudolphRiedel commented 3 years ago

I switched over to using ZeroDMA but this also does not work as expected. First of all, I have the same issue as before, bytes send with SPI.transfer following the DMA are apparently still send with DMA. And the data is all messed up, this is not what my program tries to send: grafik

This is how this exact same sequence looks like when I turn off DMA: grafik

The difference between using DMA and not using DMA is that without DMA a block of 224 bytes is send directly using SPI.transfer(data) and when using DMA this data is written to a buffer first and then the transfer is triggered.

static inline void spi_transmit(uint8_t data) { SPI.transfer(data); }

static inline void spi_transmit_32(uint32_t data) { spi_transmit((uint8_t)(data)); spi_transmit((uint8_t)(data >> 8)); spi_transmit((uint8_t)(data >> 16)); spi_transmit((uint8_t)(data >> 24)); }

/ spi_transmit_burst() is only used for cmd-FIFO commands so it always has to transfer 4 bytes / static inline void spi_transmit_burst(uint32_t data) {

if defined (EVE_DMA)

EVE_dma_buffer[EVE_dma_buffer_index++] = data;

else

spi_transmit_32(data);

endif

}

void EVE_init_dma(void) { myDMA.setTrigger(SERCOM2_DMAC_ID_TX); myDMA.setAction(DMA_TRIGGER_ACTON_BEAT); myDMA.allocate(); myDMA.setCallback(dma_callback); desc = myDMA.addDescriptor( NULL, / from / (void ) &SERCOM2->SPI.DATA.reg, / to / 100, / size / DMA_BEAT_SIZE_BYTE, / beat size -> byte / true, / increment source / false); / increment dest */ }

void EVE_start_dma_transfer(void) { myDMA.changeDescriptor(desc, (void ) (((uint32_t) &EVE_dma_buffer[0])+1), NULL, (EVE_dma_buffer_index4)-1); EVE_cs_set(); EVE_dma_busy = 42; myDMA.startJob(); // SPI.transfer( ((uint8_t ) &EVE_dma_buffer[0])+1, 0, (((EVE_dma_buffer_index)4)-1), false ); }

Everything else stays the same, the non-DMA transfer functions are left untouched when switching between DMA and non-DMA with a define.

I have no idea yet how this is even possible since DMA transfers are triggered by setting the ENABLE bit in the CGCTRLA register for the DMA channel and not by writing to the SPI data register.

And this is definately not an issue with the D51 itself, I implemented DMA support for the E51 back in april for my bare-metal target after I was using DMA with the C21 for some time.

And I found a new issue, the callback is called too early: grafik

This needs to be fixed in the callback function since ZeroDMA is generic and not for SPI only. In this case I went with: void dma_callback(Adafruit_ZeroDMA *dma) { while(SERCOM2->SPI.INTFLAG.bit.TXC == 0); EVE_dma_busy = 0; EVE_cs_clear(); }

Well, the DMAC-0_Handler() in my bare-metal implementation does this as well.

RudolphRiedel commented 3 years ago

After some more fiddling and comparing the code back and forth with my bare metal implementation I got it working. As ZeroDMA is not aware of the interface in use it does not do anything with it either, like with the issue of calling the callback function too early.

I setup a write-only DMA transfer since this is what I need. Unfortunately the ATSAM SERCOM units do not allow ignoring incoming data. And you must read the SPI.DATA register, there is no other way to clear the receive-flag. I am using "(void) EVE_SPI->SPI.DATA.reg;" in my code for this purpose. With normal transfers this means not reading back the input makes the SERCOM unit stop working after the first written byte.

But apparently this is different when using DMA, the whole buffer is still transferred just fine, otherwise I would have caught this sooner. And when going back to normal transfers things behave very strange. Reading does either not work at all anymore or the result is shifted by one byte so that what is actually beeing read is not what you see on the logic-analyzer. Or something way more broken since the pauses between the written bytes are gone and chip-select goes up too early.

I fixed it by switching of the the receiver before transmitting the block and switching it back on in the callback-function.

void dma_callback(Adafruit_ZeroDMA dma) { while(SERCOM2->SPI.INTFLAG.bit.TXC == 0); SERCOM2->SPI.CTRLB.bit.RXEN = 1; / switch receiver on by setting RXEN to 1 which is not enable protected */ EVE_dma_busy = 0; EVE_cs_clear(); }

void EVE_start_dma_transfer(void) { myDMA.changeDescriptor(desc, (void ) (((uint32_t) &EVE_dma_buffer[0])+1), NULL, (EVE_dma_buffer_index4)-1); SERCOM2->SPI.CTRLB.bit.RXEN = 0; / switch receiver off by setting RXEN to 0 which is not enable-protected / EVE_cs_set(); EVE_dma_busy = 42; myDMA.startJob(); }

And this is exactly what is broken in https://github.com/adafruit/ArduinoCore-samd/blob/master/libraries/SPI/SPI.cpp as well. While it does allow a write-only tranfer, it does not switch off the receiver.

It works fine now and the time for a display-refresh went down from 291µs which included the SPI transfer to only 15µs for composing the buffer and sending it over SPI using DMA.

RudolphRiedel commented 3 years ago

So I had a glimpse at SPI.cpp to fix DMA for write-only transfers. But SPI.cpp is not using the SERCOM units directly, it goes thru SERCOM.cpp. And SERCOM.cpp has no method to switch over to write-only.

As a workaround I allow the DMA to overwrite my output buffer with the incoming "data": SPI.transfer( ((uint8_t ) &EVE_dma_buffer[0])+1, ((uint8_t ) &EVE_dma_buffer[0]), (((EVE_dma_buffer_index)*4)-1), false );

Now only the issue of the SPI.waitForTransfer() beeing the only option to determine that DMA is done remains.

RudolphRiedel commented 3 years ago

I put in a pull-request for SPI.cpp with an added method isBusy() to allow for non-blocking polling of the DMA transfer status. I would prefer a callback function but I am not sure where to even start with this.