Open biglben opened 1 month ago
We're also seeing this, for example when there's incoming network traffic while mcumgr is erasing internal flash pages (which stalls program execution).
After the "Failed to obtain RX buffer" message we can't communicate on the network any more.
Thanks @biglben, you've saved me a lot of debugging time.
I can confirm that the fix suggested by @biglben works for us too.
I have tested the modification described in the commit and can confirm that it works as expected. Thank you @biglben.
Hi @biglben, don't hesitate to open a PR containing the fix.
Hi @marwaiehm-st, I can open a PR with the fix for STM32H7, but I am not sure which other series have the same issue (I assume H5 too, but cannot test). I am also not sure whether this fix should be included in a HAL update. There are more fixes in the stm32h7xx-hal-driver repo which are not included in the Zephyr fork.
@biglben Sure, you can open a PR with a commit cherry-picked from the STM32H7 HAL. See https://github.com/zephyrproject-rtos/hal_stm32/pull/226 as an example.
Describe the bug
On the STM32H7, high incoming traffic combined with a busy application can cause the Ethernet peripheral to enter a state where it fails to receive data but continues transmitting. The Ethernet receive DMA channel becomes stuck in the suspended state (visible in the RPS0 field of the ETH_DMADSR register).
I identified a fix in the stm32h7xx_hal_driver repository that addresses this issue by correctly setting the tail pointer (commit ceda3ce). With this fix applied, I tested various burst patterns, and the Ethernet functionality remained stable.
I am raising this issue to highlight that the STM32 HAL needs to be updated and to ensure that other STM32 series (likely H7 and H5) receive the same fix. Sharing this information may save others considerable debugging time (it took me about 2 days).
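For anyone checking whether they are hitting the same stuck state, here is a small sketch for decoding a raw ETH_DMADSR value read out with a debugger. It assumes that RPS0 occupies bits 11:8 of the register and that the value 0b0100 means "suspended (Rx descriptor unavailable)". Both the bit position and the state table are my reading of the STM32H7 reference manual rather than something stated in this issue, so please verify them against the manual for your part. The function names are made up for illustration.

```python
# Assumption: RPS0 is the 4-bit field at bits 11:8 of ETH_DMADSR.
RPS0_SHIFT = 8
RPS0_MASK = 0xF

# Assumed state encoding; check against the reference manual before relying on it.
RPS0_STATES = {
    0x0: "stopped",
    0x1: "running (fetching Rx descriptor)",
    0x3: "running (waiting for Rx packet)",
    0x4: "suspended (Rx descriptor unavailable)",
    0x5: "running (closing Rx descriptor)",
    0x6: "timestamp write",
    0x7: "running (transferring Rx data to memory)",
}
RPS0_SUSPENDED = 0x4


def decode_rps0(dmadsr):
    """Return (code, description) for the RPS0 field of a raw ETH_DMADSR value."""
    code = (dmadsr >> RPS0_SHIFT) & RPS0_MASK
    return code, RPS0_STATES.get(code, "reserved")


def rx_dma_suspended(dmadsr):
    """True if the receive DMA channel 0 reports the suspended state."""
    return decode_rps0(dmadsr)[0] == RPS0_SUSPENDED
```

If the field stays at the suspended value while traffic keeps arriving, that matches the behavior described above.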
To Reproduce
I reproduced the bug by applying the following patch to simulate the application performing other tasks or being blocked:
Then I built the sample using:
west build -p -b nucleo_h743zi/stm32h743xx zephyr/samples/net/sockets/echo_server
After the target was ready to receive data, I ran this script using:
python udp_flood.py 192.0.2.1 4242 143 1000 0.1
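The udp_flood.py script is not reproduced here. As a rough idea of what such a script could look like, here is a minimal sketch. It assumes the five positional arguments mean target host, target port, payload size in bytes, packets per burst, and pause between bursts in seconds. That interpretation is a guess and is not taken from the original script. The function names and the default of 10 bursts are likewise hypothetical.

```python
import socket
import sys
import time


def send_burst(sock, addr, payload_size, count):
    """Send `count` UDP datagrams of `payload_size` zero bytes back to back.

    Returns the total number of payload bytes handed to the socket.
    """
    payload = bytes(payload_size)
    sent = 0
    for _ in range(count):
        sent += sock.sendto(payload, addr)
    return sent


def flood(host, port, payload_size, burst_count, pause_s, bursts=10):
    """Send `bursts` bursts to (host, port), sleeping `pause_s` between them.

    `bursts` is a hypothetical parameter; the original script may loop forever.
    """
    total = 0
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for _ in range(bursts):
            total += send_burst(sock, (host, port), payload_size, burst_count)
            time.sleep(pause_s)
    return total


# Assumed argument order: host port payload_size burst_count pause_s
if __name__ == "__main__" and len(sys.argv) > 1:
    host, port, size, count, pause = sys.argv[1:6]
    flood(host, int(port), int(size), int(count), float(pause))
```

The point of sending in bursts is to fill the receive descriptors faster than a blocked application can release them, which is the condition that triggered the stuck receive DMA for me.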
Expected behavior
Under heavy incoming traffic, the STM32 Ethernet should at most drop some packets but continue operating without interruption.
Impact
None, as I have forked the STM32 HAL module and applied commit ceda3ce.
Logs and console output
After this point, nothing is received anymore.
Environment: