openthread / ot-br-posix

OpenThread Border Router, a Thread border router for POSIX-based platforms.
https://openthread.io/
BSD 3-Clause "New" or "Revised" License
397 stars 220 forks source link

When RCP resets, otPlatRadioEnableSrcMatch() is not set back to TRUE by Host #2261

Closed EricSTMicro closed 3 months ago

EricSTMicro commented 3 months ago

Describe the bug A clear and concise description of what the bug is. When the RCP failure recovering feature (OPENTHREAD_SPINEL_CONFIG_RCP_RESTORATION_MAX_COUNT) is activated, after it resets the RCP due to a spinel/HLDC issue (bad frame received/sent) or reset (using reset button of the board), the Host (RPi) doesn't enable the src addr match table (otPlatRadioEnableSrcMatch). The Host only enables it once when the system starts, when OTBR becomes leader for example.

The result is that the RCP will set the pending bit to 1 in all of the data requests coming from the child it is associated with.

If we check at otPlatRadioEnableSrcMatch definition:

/**
 * Enable/Disable source address match feature.
 *
 * The source address match feature controls how the radio layer decides the "frame pending" bit for acks sent in
 * response to data request commands from children.
 *
 * If disabled, the radio layer must set the "frame pending" on all acks to data request commands.
 *
 * If enabled, the radio layer uses the source address match table to determine whether to set or clear the "frame
 * pending" bit in an ack to a data request command.
 *
 * The source address match table provides the list of children for which there is a pending frame. Either a short
 * address or an extended/long address can be added to the source address match table.
 *
 * @param[in]  aInstance   The OpenThread instance structure.
 * @param[in]  aEnable     Enable/disable source address match feature.
 *
 */
void otPlatRadioEnableSrcMatch(otInstance *aInstance, bool aEnable)

We can see that if the src addr match table is not enabled, we set the pending bit in every data request, so we are behaving normally, but the src addr match table should be enabled.

The intention of this feature is that the Host doesn't have to reset the whole Thread application, but only the RCP, so it is transparent for the other devices in the network that something wrong occured. We want to use this feature as we experience some issues using the spinel/HDLC over UART. We sometimes miss one or several bytes and it's too bad to reset the OTBR application just for that.

To Reproduce Information to reproduce the behavior, including:

  1. Git commit id

  2. IEEE 802.15.4 hardware platform

    • STM32WB55
  3. Build steps

    • Build using the flag -DOT_RCP_RESTORATION_MAX_COUNT set to anything but 0
    • In my case:
      sudo WEB_GUI=1 REFERENCE_DEVICE=1 ./script/bootstrap
      sudo OTBR_OPTIONS="-DOT_RCP_RESTORATION_MAX_COUNT=2" INFRA_IF_NAME=wlan0 WEB_GUI=1 REFERENCE_DEVICE=1 ./script/setup
  4. Network topology

    • An OTBR (RPi + STM32WB) and a SED (STM32WB55) that are associated, simply by setting same networkkey, panid, etc... The SED makes a data request every second.

Expected behavior A clear and concise description of what you expected to happen. Host should enabled the src addr match table on the RCP, so that it doesn't set the frame pending bit to 1, and doesn't send an empty data frame after that.

Workaround A simple workaround is to call otPlatRadioEnableSrcMatch(*InstanceOT, TRUE) in RCP's FW at reset, but I'm not sure about what it implies.

zhanglongxia commented 3 months ago

I think the root cause is that the RCP restore feature missed setting the source match feature. I have pushed a commit PR10055 to resolve this issue. Could you try this commit to check if it could resolve your issue?

EricSTMicro commented 3 months ago

Hello,

I will be able to test it tomorrow only. I will let you know the result.

BR, Eric

EricSTMicro commented 3 months ago

Hello zhanglongxia,

I confirm your commit works fine as expected. Thank you for your reactivity!

BR, Eric