zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Stress-test of SD-card on STM32 (tested w/ stm32h573i_dk) occasionally gives read error 32 (FIFO full) #76999

Open AndreyDodonov-EH opened 3 months ago

AndreyDodonov-EH commented 3 months ago

Describe the bug:

Occasional errors occur when reading from the SD card back to back, without pauses between reads:

<err> stm32_sdmmc: sd read error 32
<err> fs: file read error (-5)
<err> filesystem: Failed to read -5
<err> fs: file close error (-9)
<err> filesystem: Failed to close file (9).

To Reproduce:

In a single thread, perform the following in a loop:

  1. Open a file
  2. Seek to some location
  3. Read a decent number of bytes (I used 4096)
  4. Close the file

Environment:

Zephyr 3.7.0, tested on the stm32h573i_dk board.

Probable reason:

It seems that HAL_SD_ERROR_RX_OVERRUN is set (hence error 32). Apparently this happens because the SD card clock is too fast for the software to keep up with. After changing clk-div to 2 (the possibility was added in PR 56743), the problem no longer occurs.
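For reference, the workaround described above could look roughly like the following application-level devicetree overlay. This is a sketch, not a tested patch: the `sdmmc1` node label and the overlay file name are assumptions based on the usual STM32 board layout, so check them against the board's devicetree before use. Only the `clk-div = <2>` value comes from this report.

```dts
/* boards/stm32h573i_dk.overlay (file name is an assumption) */

/* Node label sdmmc1 is assumed; use the label of the SDMMC node
 * that the SD card slot is actually attached to on your board.
 */
&sdmmc1 {
	/* Divide the SDMMC clock by 2, the value reported to avoid
	 * the RX overrun (error 32) in this issue.
	 */
	clk-div = <2>;
};
```

A larger divider slows each transfer, so the value is a trade-off between throughput and leaving the software enough time to drain the FIFO.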

I'm not sure what the correct solution would be: setting this on a per-board basis, changing the default (unlikely), or changing the driver (by making it configurable or making it respect the frequency?). Hence an issue and not a PR.

erwango commented 3 months ago

Alternatively, did you try enabling CONFIG_SDMMC_STM32_HWFC?

AndreyDodonov-EH commented 3 months ago

@erwango Yes, then I get:

[00:00:33.313,000] <err> stm32_sdmmc: sd read error 2
[00:00:33.318,000] <err> fs: file read error (-5)
[00:00:33.324,000] <err> filesystem: Failed to read -5
[00:00:33.329,000] <err> fs: file close error (-9)
[00:00:33.335,000] <err> filesystem: Failed to close file (9).

It also runs slower overall and gives errors even more often.
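The numbers in the two logs (32 and 2) look like bits of the STM32 HAL SD error mask, so a small decoder helps when comparing them. The sketch below is only an illustration: the bit values are written from memory of the HAL's `SDMMC_ERROR_*` definitions and should be checked against the HAL version in your tree, and `sd_err_first_name` is a hypothetical helper, not part of Zephyr or the HAL. Under those assumptions, 32 maps to the RX overrun mentioned above and 2 to a data CRC failure.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values as recalled from the STM32 HAL SDMMC error definitions.
 * Treat them as assumptions and verify against your HAL headers.
 */
struct sd_err_name {
	uint32_t bit;
	const char *name;
};

static const struct sd_err_name sd_err_names[] = {
	{ 0x01u, "CMD_CRC_FAIL" },
	{ 0x02u, "DATA_CRC_FAIL" },
	{ 0x04u, "CMD_RSP_TIMEOUT" },
	{ 0x08u, "DATA_TIMEOUT" },
	{ 0x10u, "TX_UNDERRUN" },
	{ 0x20u, "RX_OVERRUN" },
};

/* Hypothetical helper: return the name of the lowest recognized
 * error bit set in code, or "UNKNOWN" if none of them is set.
 */
const char *sd_err_first_name(uint32_t code)
{
	size_t count = sizeof(sd_err_names) / sizeof(sd_err_names[0]);

	for (size_t i = 0; i < count; i++) {
		if (code & sd_err_names[i].bit) {
			return sd_err_names[i].name;
		}
	}
	return "UNKNOWN";
}
```

Calling `sd_err_first_name(32)` gives `"RX_OVERRUN"` and `sd_err_first_name(2)` gives `"DATA_CRC_FAIL"` with the table above. If that reading is right, enabling hardware flow control replaces the overrun with a CRC failure instead of removing the underlying timing problem.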

erwango commented 3 months ago

> Yes, then I get

Ok, thanks. Then my suggestion is to set this at board level. Additionally, the description of the clk-div property in the binding can be updated to state explicitly that it can be used to ensure the SD card clock doesn't run faster than the firmware can handle.