embassy-rs / embassy

Modern embedded framework, using Rust and async.
https://embassy.dev
Apache License 2.0
5.54k stars 768 forks source link

stm32: DMA can't be used with DAC on some families (g4, others) #2783

Open adamgreig opened 7 months ago

adamgreig commented 7 months ago

The bug

The DAC write() method on an STM32G4 immediately faults, because the DMA is configured for an 8- or 16-bit PSIZE which is not supported by the DAC registers and generates a bus fault.

image

The same note about 32-bit access is present in all the other reference manuals I checked (F0, F1, F4, F7, H7) but it seems on those families no bus error is generated (see below), so there's no bug.

The DAC driver's write() method takes either a &[u8] or a &[u16] and sets up a DMA transfer with that buffer. The DMA driver sets up MSIZE and PSIZE based on the data type to either both be 8-bit or both be 16-bit. As soon as the DMA transfer is started, the DMA generates a bus fault and stops.

Possible fix 1

If we change the DAC driver's write() method to always take a &[u32], the correct DMA transfer is generated and everything works by-the-book on all platforms. However, the user buffer must be 2x or 4x the size, so it's a very wasteful approach.

Possible fix 2

Currently the DMA driver doesn't support PSIZE != MSIZE. If it did, on G4, we could run a transfer where MSIZE=8 or 16 bit, and PSIZE=32 bit, and the BDMA behaviour is to fill the extra bits in with 0s:

image

This works great for G4 and other BDMA platforms: the user passes a &[u8] or &[u16] and each word is zero-extended to 32 bits before being written to the DAC as a 32-bit access. However, there's no API for doing this at the moment. There's one other catch: (non-B) DMA has a completely different behaviour when MSIZE != PSIZE: it packs multiple memory reads into a single peripheral write, so a user-provided &[u16] would have two entries packed into a single u32 which is written to the peripheral, and it uses NDTR as the number of peripheral writes instead of the size of the data in memory:

image

This would be a total non-starter for DAC on DMA platforms. Luckily it seems like on those platforms the DAC doesn't generate a bus fault if you perform a 16-bit write (though I haven't comprehensively confirmed this), so the DMA can be told to do a 16-bit write and moves on with life. There's even a note about this in the BDMA section for AHB buses, which might apply to DMA writing to the DAC on APB or might not...

image

I guess this is what's happening right now with the dac_dma examples in H7, for example.

So, the best answer might be:

  1. Allow configuring non-equal PSIZE/MSIZE in the DMA API, but with some thought as to how this is totally different between BDMA and DMA, and not supported in DMA direct mode or MDMA...
  2. Have the DAC driver use PSIZE=32 on BDMA platforms with MSIZE either 8 or 16 as appropriate, but use PSIZE=MSIZE=16 on DMA platforms and assume the 16-bit write to the DAC registers works despite the reference manual saying not to do that

Other HALs'/platforms' behaviour?

Haven't checked.

DAC driver redux

Honestly the write() method isn't very useful: you can set up a single circular transfer that runs forever, or you can write a single buffer out the DAC and then stop, but there's no way to keep adding samples to a continuous, gap-less stream. It would be nice if the interface could also offer a WriteableRingBuffer you can keep adding samples to which only blocks when the buffer is to full to accept the new batch. I think that's likely a separate problem (it doesn't interfere with fixing this bug), but maybe both could be fixed at the same time.

jrmoulton commented 7 months ago

For my g4 project I've been using a fork of embassy where done option 2 and I've hardcoded the psize to WordSize::FourBytes in the bdma.

On another unrelated note about the write method not being useful: I've found that I need to create an unsafe copy of my data buffer and mutate it while the dma is running. This is more of a separate use case than the ring buffer but it wasn't obvious how to do it and I think it's a common enough use case that there should be some blessed way to do it. Something more than just manually creating an unsafe copy of a slice.