Code compiled with GCC -O0 fails to load DMA B Control Block ~70% of the time

Spud-maker commented 4 years ago

Hi Jayben, I have your code, and my modified version for the MCP3208, working after a long session debugging the DMA B CB load.

I found that when the code is compiled with GCC -O0 (the default), often (~70% of the time) the current control block address, the next control block address, and all the bitfield registers for DMA B were initialized with 0 instead of the correct addresses or values in mmap()'ed uncached memory after start_dma(mp, DMA_CHAN_B, &cbs[3], DMA_PRIORITY(15)) is called. With either -O1 or -O2 optimization the code seems to work perfectly every time. (I am using the default Raspbian distributed by RaspberryPi.org, with its included gcc, as of July 2020.)

I'm not sure what would cause this, but I wanted to point it out in case it helps anyone else. It might be a timing issue in the implementation of the mmap() MAP_SHARED option.
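One way to test the timing hypothesis is to write each control block through a volatile pointer, issue a full memory barrier, and read every word back before the channel is started. The sketch below is only an illustration under stated assumptions: `demo_cb_t` and `demo_cb_store_verified` are hypothetical names that do not come from the repository, the struct merely follows the 8-word BCM2835 control block layout, and `__sync_synchronize()` is the GCC full-barrier builtin.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control block: 8 words, 32-byte aligned, following the
 * BCM2835 layout. Field names are illustrative, not the repository's. */
typedef struct {
    uint32_t ti;        /* transfer information */
    uint32_t srce_ad;   /* source bus address */
    uint32_t dest_ad;   /* destination bus address */
    uint32_t tfr_len;   /* transfer length in bytes */
    uint32_t stride;    /* 2D stride */
    uint32_t next_cb;   /* bus address of next control block */
    uint32_t rsvd[2];   /* reserved words */
} __attribute__((aligned(32))) demo_cb_t;

/* Copy a prepared block into its final (uncached) location word by word,
 * force the stores to complete, then confirm every word reads back.
 * Returns true only if the stored block matches the source. */
static bool demo_cb_store_verified(volatile demo_cb_t *dst, const demo_cb_t *src)
{
    volatile uint32_t *d = (volatile uint32_t *)dst;
    const uint32_t *s = (const uint32_t *)src;
    size_t n = sizeof(*src) / sizeof(uint32_t);

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    __sync_synchronize();   /* full barrier before any DMA start */

    for (size_t i = 0; i < n; i++)
        if (d[i] != s[i])
            return false;
    return true;
}
```

If this check passes at -O0 and the DMA registers still show zeros afterwards, the control block stores themselves are probably not the problem, and the state of the channel before the start call becomes the more likely suspect.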

I did also notice that, when it works, the memory addresses actually loaded are in yet another uncached memory area at 0xDE3F????, rather than at my CB locations in the mmap()'ed area at 0xB6F4????. I suspect this is part of the shared memory implementation, but could find no information on it.
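A possible explanation for the two address ranges: the DMA controller works with bus addresses, while the program sees the same memory through the virtual address returned by mmap(). The value read back from the DMA registers may therefore be the bus address of the control block, not a different copy of it. The sketch below shows the usual offset-based conversion; `demo_mem_t` and both helper functions are hypothetical names introduced for illustration, and it assumes the allocation is contiguous in both address spaces.

```c
#include <stdint.h>

/* Hypothetical record of one uncached allocation. */
typedef struct {
    uintptr_t virt_base;  /* address returned by mmap() in this process */
    uint32_t  bus_base;   /* bus address of the same memory, as used by DMA */
    uint32_t  size;       /* allocation size in bytes */
} demo_mem_t;

/* Virtual address inside the allocation -> bus address for the DMA controller. */
static uint32_t demo_virt_to_bus(const demo_mem_t *m, uintptr_t virt)
{
    return m->bus_base + (uint32_t)(virt - m->virt_base);
}

/* Bus address read back from a DMA register -> virtual address for debugging. */
static uintptr_t demo_bus_to_virt(const demo_mem_t *m, uint32_t bus)
{
    return m->virt_base + (uintptr_t)(bus - m->bus_base);
}
```

Converting the value read from the current control block register back to a virtual address, and comparing it with the address of the expected entry in the control block array, would show whether the two ranges refer to the same block.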

Thanks again for this example code.

jbentham commented 4 years ago

Thanks very much for the feedback; it is really strange, since I did the bulk of my code development with no compiler optimisation, so must have done hundreds of tests in this way. When I've seen unexpected register values, it has sometimes been due to my accidentally running the DMA controller (or leaving it running), so I was seeing the end-result of the DMA cycles, not the beginning.

I'm in the middle of another project, but hopefully will be able to return to my DMA code soon.

(Sent by email in reply to Spud-maker's comment of 18/08/2020 00:03, quoted above.)

ajs410 commented 4 years ago

This may or may not be related.

DMA_CHAN_B is set to use channel 11. Using an RPi 4 and the latest Raspbian, I watched the DMA registers. Channel 11 has the DISDEBUG bit set, which is usually a sign that someone somewhere has used that channel, since it's clear after a reset.
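The check described above can be sketched as a small decoder for a value read from a channel's CS (control and status) register. The bit positions follow the BCM2835 peripherals datasheet (ACTIVE is bit 0, DISDEBUG is bit 29, PRIORITY occupies bits 16 to 19); the `DEMO_` macros and function names are hypothetical, and whether the same layout holds for every channel on an RPi 4 is an assumption here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Selected DMA CS register bits, per the BCM2835 peripherals datasheet. */
#define DEMO_DMA_CS_ACTIVE    (1u << 0)    /* channel is currently running */
#define DEMO_DMA_CS_END       (1u << 1)    /* a transfer has completed */
#define DEMO_DMA_CS_ERROR     (1u << 8)    /* channel has flagged an error */
#define DEMO_DMA_CS_DISDEBUG  (1u << 29)   /* clear after reset */

/* Heuristic: a channel that is active, or that has DISDEBUG set, has
 * probably been configured by some other driver or firmware component. */
static bool demo_channel_looks_claimed(uint32_t cs)
{
    return (cs & (DEMO_DMA_CS_ACTIVE | DEMO_DMA_CS_DISDEBUG)) != 0;
}

/* Extract the 4-bit AXI priority field (bits 16..19). */
static unsigned demo_channel_priority(uint32_t cs)
{
    return (cs >> 16) & 0xFu;
}
```

Running such a check on the CS value of each candidate channel before the first start_dma() call would show whether the channel was already in a non-reset state.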

Furthermore, if you query the mailbox interface for the DMA channels in use, it returns a bitmask that says 11 is in use. For that matter, it also says channel 10 (DMA_CHAN_A) is in use, too - that channel also has the DISDEBUG bit set.

I don't know if the above only applies to an RPi 4. Considering how your problem goes away with optimizations, I can't imagine this is the issue you're having - it should fail with or without optimizations if what I mention is your problem. But it does make me wonder whether the behavior you're seeing is a result of channel 11 possibly being in use.
