STMicroelectronics / stm32h7xx-hal-driver

Provides the STM32Cube MCU Component "hal_driver" of the STM32H7 series.
BSD 3-Clause "New" or "Revised" License
98 stars 43 forks source link

USB HCD wrong calculation of toggle_in flag with DMA enabled #41

Open ilyapas opened 1 year ago

ilyapas commented 1 year ago

Description of the set-up

Description of the bug We are using a STM32H7B3 controller in combination with a USB Ethernet Adapter based on the ASIX AX88772B chip and have DMA enabled for the USB Host Controller.

We were struggling with DTERR (data toggle) errors when receiving packets from the ASIX chip. Using ping packets of varying sizes we could pin down the issue.

Here is the code that was causing the problem.

stm32h7xx_hal_hcd.c, lines 1250ff:

    if (hhcd->Init.dma_enable == 1U)
    {
      if ((((hhcd->hc[chnum].xfer_count + hhcd->hc[chnum].max_packet - 1U) / hhcd->hc[chnum].max_packet) & 1U) != 0U)
      {
        hhcd->hc[chnum].toggle_in ^= 1U;
      }
    }
    else
    {
      hhcd->hc[chnum].toggle_in ^= 1U;
    }

The else-part for disabled DMA works fine, we have no issues when DMA is disabled.

In case DMA is enabled the current implementation only looks at whether the number of USB packets in order to receive xfer_count bytes is even or odd.

This totally ignores the handling of zero-length packets that the sender sends according to the USB spec.

Here are the proposed changes with explanatory comments that have been verified with 3 different USB device types connected to the Host Controller:

    if (hhcd->Init.dma_enable == 1U)
    { 
      if (hhcd->hc[chnum].xfer_len == hhcd->hc[chnum].max_packet)
      {
        /* If xfer_len (the total requested data size) is equal to the max packet size of the USB endpoint
           then XFRC interrupt will be triggered for each incoming USB packet separately, including ZLPs.
           Thus we need to toggle on each XFRC interrupt. */
        hhcd->hc[chnum].toggle_in ^= 1U;
      }
      else if (hhcd->hc[chnum].xfer_count % hhcd->hc[chnum].max_packet == 0)
      {
        /* Else, if xfer_count (received data size) is a multiple of the max packet size of the USB endpoint
           we need to account for ZLPs that are sent by the sender and signalise a completed transmission to
           the host controller. */
        if ((((hhcd->hc[chnum].xfer_count / hhcd->hc[chnum].max_packet) & 1U) == 0U)
           && (hhcd->hc[chnum].xfer_count != hhcd->hc[chnum].xfer_len))
        {
          /* If the received amount of data is an even multiple of the max packet size of the USB endpoint
             then we generally want to toggle because of the additional ZLP that is sent by the sender 
             and that does not generate a separate XFRC interrupt with DMA enabled. 

             The exception from this case is if we received the exact amount of data that has been requested.
             Then the XFRC interrupt will be triggered immediately and the following ZLP will be received
             as a separate transfer, triggering another XFRC interrupt. In this case we trigger only once 
             on the ZLP XFRC interrupt. 

             This branch also includes the ZLP itself (xfer_count = 0). If we get a separate XFRC interrupt for it,
             then we want to toggle.*/
          hhcd->hc[chnum].toggle_in ^= 1U;
        }
      }
      else
      {
        if ((((hhcd->hc[chnum].xfer_count + hhcd->hc[chnum].max_packet - 1U) / hhcd->hc[chnum].max_packet) & 1U) != 0U)
        {
          /* If xfer_count is not a multiple of max_packet, then we just need to toggle if the number of 
             USB packets required for the received data is odd  */
          hhcd->hc[chnum].toggle_in ^= 1U;
        }
      }
    }
    else
    {
      /* With DMA disabled the XFRC interrupt is generated for each USB packet separately,
         thus we need to toggle every time. */
      hhcd->hc[chnum].toggle_in ^= 1U;
    }

Could you please confirm if this is a real issue and not a misconfiguration of our part as well as verify the proposed changes?

This problem (or at least a very similar one) has already been reported once here #28, but has been discarded since the bug report was in Chinese.

Kind regards Ilya