STMicroelectronics / STM32CubeH7

STM32Cube MCU Full Package for the STM32H7 series - (HAL + LL Drivers, CMSIS Core, CMSIS Device, MW libraries plus a set of Projects running on all boards provided by ST (Nucleo, Evaluation and Discovery Kits))
https://www.st.com/en/embedded-software/stm32cubeh7.html

SPI example using DMA: Add cache maintenance operation or separate dma_buffer section #153

Closed robamu closed 1 year ago

robamu commented 3 years ago

Hello,

I think this has been an issue for many other people as well, and it would be nice if this were added to the code of all SPI examples, even if the examples work without it (at least that's what I suspect).

Using DMA with SPI can be problematic if cache maintenance operations are missing. This article goes in-depth and also provides the solution that worked for me: https://community.st.com/s/article/FAQ-DMA-is-not-working-on-STM32H7-devices

The DMA example code provided uses a pre-declared buffer containing a string. I have another fairly common use case: I tried to read an L3GD20H gyroscope sensor, which meant that I filled the TX buffer with the corresponding command bytes before initiating a DMA transfer. This was problematic: DMA did not work where polling mode did. The value read from the WHO AM I register was not the expected one. The link above explains why this can happen and also offers a solution in the form of a cache maintenance operation before initiating the DMA transfer:

    /* Clean D-cache */
    /* Make sure the address is 32-byte aligned and add 32 bytes to the length,
       in case the buffer overlaps a cache line boundary */
    // See https://community.st.com/s/article/FAQ-DMA-is-not-working-on-STM32H7-devices
    #if STM_USE_PERIPHERAL_TX_BUFFER_MPU_PROTECTION == 0
    SCB_CleanDCache_by_Addr((uint32_t*)(((uint32_t)txBuffer.data()) & ~(uint32_t)0x1F),
            txBuffer.size()+32);
    #endif

    if(HAL_SPI_TransmitReceive_DMA(spiHandle, txBuffer.data(), rxBuffer.data(), 2) != HAL_OK) {
        // Error handling
    }
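
The address and length arithmetic in the snippet above can also be factored into a small helper. The following is only a sketch: `CacheRange` and `alignToCacheLines` are hypothetical names that do not exist in the HAL or CMSIS, and the 32-byte line size is taken from the comment in the snippet. It computes the smallest cache-line-aligned range that covers a buffer, instead of always adding a full extra line.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// D-cache line size in bytes, as assumed in the snippet above.
constexpr std::uintptr_t kCacheLineSize = 32;

// Hypothetical helper type: a start address and a byte count that both
// fall on cache line boundaries.
struct CacheRange {
    std::uintptr_t start;
    std::size_t size;
};

// Expands [addr, addr + len) to the smallest enclosing range whose start
// and end are multiples of the cache line size.
inline CacheRange alignToCacheLines(std::uintptr_t addr, std::size_t len) {
    const std::uintptr_t mask = kCacheLineSize - 1;
    const std::uintptr_t start = addr & ~mask;              // round down
    const std::uintptr_t end = (addr + len + mask) & ~mask; // round up
    // The aligned range must fully contain the original buffer.
    assert(start <= addr && end >= addr + len);
    return CacheRange{start, static_cast<std::size_t>(end - start)};
}
```

On the target, the resulting `start` and `size` would be passed to `SCB_CleanDCache_by_Addr` in place of the hand-written expression. For a 2-byte buffer that sits inside a single cache line this yields one line (32 bytes), and two lines only if the buffer actually straddles a boundary.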

Adding the operation fixed my issue, and I was able to read the sensor's WHO AM I register without problems. An alternative described in the link is to create a new .dma_buffer section and configure it with the MPU as cacheable, non-bufferable and non-shareable, for example with the following piece of code:

    #if STM_USE_PERIPHERAL_TX_BUFFER_MPU_PROTECTION == 1
    // Protect DMA buffer for other peripherals
    MPU_InitStruct.Enable = MPU_REGION_ENABLE;
    // Start of the D2 domain SRAM, where the .dma_buffer section is placed
    MPU_InitStruct.BaseAddress = 0x30000000;
    MPU_InitStruct.Size = MPU_REGION_SIZE_1KB;
    MPU_InitStruct.AccessPermission = MPU_REGION_FULL_ACCESS;
    // TEX level 0, cacheable, not bufferable: write-through, no write allocate
    MPU_InitStruct.IsBufferable = MPU_ACCESS_NOT_BUFFERABLE;
    MPU_InitStruct.IsCacheable = MPU_ACCESS_CACHEABLE;
    MPU_InitStruct.IsShareable = MPU_ACCESS_NOT_SHAREABLE;
    MPU_InitStruct.Number = MPU_REGION_NUMBER1;
    MPU_InitStruct.TypeExtField = MPU_TEX_LEVEL0;
    MPU_InitStruct.SubRegionDisable = 0x00;
    MPU_InitStruct.DisableExec = MPU_INSTRUCTION_ACCESS_ENABLE;

    HAL_MPU_ConfigRegion(&MPU_InitStruct);
    #endif

and then place the TX buffer into that section via linker script or IDE setting. In any case, I think it would be nice if the SPI examples were updated to include that maintenance operation and/or the DMA buffer section with the MPU protection. I'm probably not the only one who simply copies large segments of the STM32CubeH7 code into their own project, hoping it works immediately for their own specific use case. Simply adding the operation, and maybe referring to the link above in the code documentation, would go a long way towards making DMA work out of the box for more use cases.
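
Placing the TX buffer into the section could look roughly like the sketch below. It assumes that the linker script defines an output section named `.dma_buffer` located in the MPU-configured region; `DMA_BUFFER`, `kDmaAlignment`, `roundUpToCacheLine` and `spiTxBuffer` are hypothetical names introduced only for illustration. The section attribute is applied only for 32-bit ARM GCC/Clang builds so that the sketch also compiles on a host machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: the linker script maps an output section called .dma_buffer
// into the memory region covered by the MPU configuration above.
#if defined(__GNUC__) && defined(__arm__)
#define DMA_BUFFER __attribute__((section(".dma_buffer")))
#else
#define DMA_BUFFER /* no special section when building for a host */
#endif

// Assumed D-cache line size in bytes.
constexpr std::size_t kDmaAlignment = 32;

// Rounds a byte count up to a whole number of cache lines, so that no
// unrelated variable ends up sharing a line with the buffer.
constexpr std::size_t roundUpToCacheLine(std::size_t n) {
    return (n + kDmaAlignment - 1) / kDmaAlignment * kDmaAlignment;
}

// Hypothetical TX buffer for a 2-byte register read, aligned to and padded
// up to one full cache line.
alignas(kDmaAlignment) std::uint8_t spiTxBuffer[roundUpToCacheLine(2)] DMA_BUFFER;
```

Only the TX buffer is shown because the MPU configuration above addresses CPU writes that have not yet reached memory. Data written by DMA into an RX buffer is a separate concern that this sketch does not cover.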

Describe the set-up

Describe the bug

Trying to read an L3GD20H gyroscope sensor with an STM32H743ZI Nucleo using DMA, based on the SPI example provided.

How To Reproduce

  1. Attempt to read the sensor's WHO AM I register without performing a cache maintenance operation before initiating the DMA transfer.

ASELSTM commented 3 years ago

Hi @rmspacefish,

Thank you for your contribution and for this detailed report. To allow a better analysis of the problem, could you please share the whole project you used to reproduce this issue?

With regards,

robamu commented 3 years ago

I compiled and tested a project similar to the SPI projects you provide. It can be found here: https://github.com/spacefisch/stm32h743-cmake-minimal/tree/main/test/l3gd20h. Unfortunately, I was not able to reproduce the issue with this simple case. Reading the WHO AM I register works with and without cache maintenance here.

Just for completeness, you can build the project like this:

    export STM32_TOOLCHAIN_PATH=<toolchainPath>
    mkdir build && cd build
    cmake ..
    cmake --build . -j

The other project, where the problem still occurs without cache maintenance, can be found here: https://egit.irs.uni-stuttgart.de/fsfw/fsfw-hal/src/branch/master/stm32h7/devicetest/GyroL3GD20H.cpp. However, it is part of a larger framework / library. Omitting the cache maintenance on line 278 still causes issues there. This class is also run as part of a more complex application with more than 10 threads running. Weirdly enough, some operations work (e.g. reading sensor values) even without cache maintenance. I tried playing around with the SPI speed and mode, but that did not make any difference.
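
One possible explanation for the inconsistent behavior, which is an assumption and not something confirmed in this thread, is that a dirty cache line sometimes gets written back to memory as a side effect of unrelated memory traffic before the DMA transfer starts. The toy model below illustrates the effect with a single-line write-back cache. `ToyWriteBackCache` is a hypothetical class and does not reflect how the Cortex-M7 cache is actually organized.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model: a small "main memory" plus one write-back cache line.
class ToyWriteBackCache {
public:
    static constexpr std::size_t kLineSize = 32;
    static constexpr std::size_t kMemSize = 128;

    // CPU store: lands in the cache line, main memory is not updated yet.
    void cpuWrite(std::size_t addr, std::uint8_t value) {
        assert(addr < kMemSize);
        load(addr / kLineSize);
        line_[addr % kLineSize] = value;
        dirty_ = true;
    }

    // DMA read: bypasses the cache and only sees main memory.
    std::uint8_t dmaRead(std::size_t addr) const { return memory_[addr]; }

    // Counterpart of a clean operation: write a dirty line back to memory.
    void clean() {
        if (valid_ && dirty_) {
            for (std::size_t i = 0; i < kLineSize; ++i) {
                memory_[tag_ * kLineSize + i] = line_[i];
            }
            dirty_ = false;
        }
    }

private:
    // Brings a line into the cache; the previously cached line is evicted
    // and, if dirty, written back first.
    void load(std::size_t tag) {
        if (valid_ && tag_ == tag) {
            return;
        }
        clean();
        for (std::size_t i = 0; i < kLineSize; ++i) {
            line_[i] = memory_[tag * kLineSize + i];
        }
        tag_ = tag;
        valid_ = true;
        dirty_ = false;
    }

    std::array<std::uint8_t, kMemSize> memory_{};
    std::array<std::uint8_t, kLineSize> line_{};
    std::size_t tag_ = 0;
    bool valid_ = false;
    bool dirty_ = false;
};
```

In this model, a command byte written with `cpuWrite(0, ...)` is invisible to `dmaRead(0)` until either `clean()` is called or a write to a different line (for example `cpuWrite(64, ...)`) evicts it. A small test program with little other memory traffic and a multi-threaded application could therefore behave differently, and whether a given transfer works without explicit maintenance would depend on what else touched the cache in between.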

ASELSTM commented 2 years ago

ST Internal Reference: 120631

ASELSTM commented 1 year ago

Hi @robamu,

The SPI examples are actually running well on the default STM32H743ZI Nucleo board configuration (without an L3GD20H gyroscope sensor). So please feel free to add the required cache maintenance policy if you want to use this example as a starting point for your application.

Please allow me to close this thread. Thank you for your understanding.

With regards,