STMicroelectronics / stm32h7xx-hal-driver

Provides the STM32Cube MCU Component "hal_driver" of the STM32H7 series.
BSD 3-Clause "New" or "Revised" License
97 stars 41 forks source link

Ethernet PTP timestamp retrieval is entirely wrong #63

Open chris1seto opened 3 months ago

chris1seto commented 3 months ago

Caution The Issues are strictly limited for the reporting of problem encountered with the software provided in this project. For any other problem related to the STM32 product, the performance, the hardware characteristics and boards, the tools the environment in general, please post a topic in the ST Community/STM32 MCUs forum

Describe the set-up Custom board or nucleo,

Describe the bug In HAL_ETH_ReadData(), the ptp timestamp retrieval is completely incorrect. If timestamping is enabled, is at least one descriptor is given by DMA which is the normal descriptor. Then, the next descriptor following it will be the context descriptor. The way the HAL_ETH_ReadData() function is written, it will exit as soon as it sees a normal descriptor (rxdataready will be set to 1). The next call of the function would then read the context, but one single call to this function should read both the normal packet information AND the timestamp data for that packet.

How To Reproduce NA

  1. How we can reproduce the problem Enable timestamping, notice you need two calls to HAL_ETH_ReadData() to get the packet first, then the timestamp second

Additional context ST needs to pay attention when writing peripheral drivers. The reason for this issue is that the MAC IP block for the Stm32F7 (and similar parts) works differently. The timestamp comes in as an extension to a normal descriptor (An "enhanced" descriptor"). This means that it's not an extra descriptor which is generated, which is unlike the Stm32H7's MAC IP block (which makes an additional "context" descriptor"). This is clearly a copy and paste error. Additionally, this is evidence that ST didn't test the PTP functionality, because they would have immediately noticed this issue.

Screenshots If applicable, add screenshots to help explain your problem.

kartben commented 3 months ago

FWIW the Zephyr version of the HAL seems to have a patch for this https://github.com/zephyrproject-rtos/hal_stm32/pull/158/files - not sure why it never made it in the official HAL

chris1seto commented 3 months ago

@kartben Thanks for noticing that. That changeset is setup very similar to how I implemented this, but I left a comment in the PR with regard to how context descriptors get given back to DMA via OWN bit setting.

ASEHSTM commented 3 months ago

Hello @chris1seto,

Thank you for your report. To assist you more effectively, could you please specify which microcontroller you are using on your custom board or which Nucleo board you are using? Additionally, more details about your current configuration would be very helpful, including the version of the HAL library, the Ethernet DMA configuration, the PTP timestamping settings.

This information will help us better understand your environment and provide more precise assistance. Thank you in advance for your cooperation.

With Regards,

chris1seto commented 3 months ago

Hi @ASEHSTM . This is all H7 series parts. I am using a Nucleo H723, but again, this affects all H7 parts. I have code for my PTP setup here: https://community.st.com/t5/stm32-mcus-products/correct-settings-to-timestamp-only-ptp-packets-stm32h723/m-p/696924/highlight/true#M254979

This issue does not need code to explain, however. It's an obvious issue reading through the ref manual and comparing to the implementation by ST in the hal.

chris1seto commented 3 months ago

@ALABSTM @kartben FYI: I have a full example of PTP I pushed here which demonstrates this issue: https://github.com/chris1seto/Stm32H723NucleoPtpExample/tree/main

KarstenHohmeier commented 1 month ago

I am trying to get PTP to work on STM32 H7 and adopted the Zephyr OS patch into our code base. I noticed that since the patch only peeks ahead the trailing context descriptor is not fully processed like the previous descriptors (first to last), the descriptor counts are not properly incremented and the descriptor is not returned to the DMA. If the descriptor counts are not incremented, the trailing context descriptor does not become part of the recently fixed tail pointer calculation (see ceda3ce). The patch as it is will imho be buggy if applied to the current master.

KarstenHohmeier commented 1 month ago

Our implementation also tries to support PTP over Ethernet/L2 by hooking via LWIP_HOOK_UNKNOWN_ETH_PROTOCOL and LWIP_HOOK_FILENAME. This broke for us because the trailing context descriptor was processed with a bufflength > 0 which caused a wrongly built pbuf chain that looked like a big fragmented packet to our hook function. But that was ultimately our own fault.