STMicroelectronics / STM32CubeH7

STM32Cube MCU Full Package for the STM32H7 series - (HAL + LL Drivers, CMSIS Core, CMSIS Device, MW libraries plus a set of Projects running on all boards provided by ST (Nucleo, Evaluation and Discovery Kits))
https://www.st.com/en/embedded-software/stm32cubeh7.html
Other
489 stars 302 forks source link

NUCLEO-H723ZG, LwIP_HTTP_Server_Socket_RTOS: Ethernet receive fails because DCache is not invalidated in pbuf_free_custom #244

Closed futurezeb closed 5 months ago

futurezeb commented 1 year ago

Describe the set-up

I am transmitting random packets of length=255 byte to the mcu and back to the pc in a endless loop. The packet contains random data + a crc32 checksum of that data at the end. At random times, when calculating that packet checksum on the mcu, the checksums do not match. That means that the receiving of that packet failed.

Describe the bug When the pbuf, that is free'd in the function pbuf_free_custom, is not invalidated afterwards (using SCB_InvalidateDCache_by_Addr), 32 bytes of the old contents of that pbuf randomly appear when receiving a message in a later packet.

How To Reproduce

  1. Explicitly invalidate the contents of the freed pbuf in the function like this

    
    void pbuf_free_custom( struct pbuf* p )
    {
    struct pbuf_custom* custom_pbuf = (struct pbuf_custom*) p;
    
    uint8_t* data = (uint8_t*) ( p->payload );
    if( p->len >= 8 )
    {
    *( data + 0 ) = 0x77;
    *( data + 1 ) = 0x77;
    *( data + 2 ) = 0x77;
    *( data + 3 ) = 0x77;
    
    memset( data + 4, 0x88, p->len - 8 );
    
    *( data + ( p->len - 1 ) ) = 0x55;
    *( data + ( p->len - 2 ) ) = 0x55;
    *( data + ( p->len - 3 ) ) = 0x55;
    *( data + ( p->len - 4 ) ) = 0x55;
    }
    
    LWIP_MEMPOOL_FREE( RX_POOL, custom_pbuf );
    
    ... // remaining function

}

2. Send tcp packets to controller and back to pc in a loop.
3. Check if the packets sent and received are equal.
4. Soon (pretty much instant actually), this check will fail and the received packet will contain 32 bytes of 0x88 (sometimes including the 0x55 at the end).

Screenshot of wireshark (log attached)
![grafik](https://user-images.githubusercontent.com/118378167/202245469-daf4293d-d850-46d1-bd25-227a19db22de.png)

**My solution**
My solution add this to the function `pbuf_free_custom`, right before the line `LWIP_MEMPOOL_FREE( RX_POOL, custom_pbuf );`:

void pbuf_free_custom( struct pbuf* p ) { ... // start of the function

for( pbuf q = p; q != nullptr; q = q->next ) { SCB_InvalidateDCache_by_Addr( (uint32_t) q->payload, q->len ); }

LWIP_MEMPOOL_FREE( RX_POOL, custom_pbuf );

... // remaining function }



After adding these lines, the problem does not appear anymore.

I will validate this solution further in the following days and then commit a PR when I am sure this fixes the bug.

**Wireshark log**
[st-cache-receive.zip](https://github.com/STMicroelectronics/STM32CubeH7/files/10024097/st-cache-receive.zip)
futurezeb commented 1 year ago

I have done some further research, the combination of cache + DMA is known to be tricky. See this forum post: Hint: DMA and cache coherency

When the DMA writes data to the RxBuffers, it ignores the dcache. Any previously cached data is still valid, even after the pbuf memory is freed and overwritten by the DMA.

So I am pretty confident that it is required to clear the dcache after freeing the pbuf memory, otherwise the memory written to by HAL_ETH_ReadData will contain invalid cache entries. A subsequent read of that memory location will read garbage bytes from the cache.

ASELSTM commented 1 year ago

ST Internal Reference: 158119

ASEHSTM commented 5 months ago

Hello @futurezeb,

Thank you for your report. We partially reproduced your scenario:

However, we were unable to reproduce the issue. Please allow me to close this issue. You can reopen it at any time if you have any details to share with us.

You can find an example with Dcache management here.

with regards,