apache / mynewt-nimble

Apache mynewt
https://mynewt.apache.org/
Apache License 2.0
685 stars 393 forks source link

controller: timing error and processing time consideration in `ble_ll_conn_can_send_next_pdu()` #1808

Open sada45 opened 1 month ago

sada45 commented 1 month ago

Hi ~ The function ble_ll_conn_can_send_next_pdu() in ble_ll_conn.c is used by the Central to decide wether to reply the Peripheral so that the Peripheral can send more data packets. Since the Central does not know how many data needs to be transmitted, it always assume the Peripheral to send a maximum length data. I am using a older version of NimBLE and the time of the maximum length is connsm->eff_max_rx_time is 2120 us. Since we do not use the MIC field, we modified it to 2088 us to fully utilize the connection event. It seems that connsm->ota_max_rx_time in the lastest version already remove the MIC field.

However, because of the timing error and the processing time, the Central may miss the reply packet from the Peripheral. Specifically, if the end transmission time of the Peripheral reply is very close to the start time of the connection event, the reply packet will not be processed even it has been received. For example, in our experiment, the ticks in ble_ll_conn.c Line 1698 is 150 (i.e., 4577 us). usec is 2468 us, including an empty PDU, a full PDU without MIC (2088us), IFS, and MSS. The added_usec is 2106 us. Therefore, the next event start is only 3 us after the reply packet transmission. Because of the timing error and the processing time, the reply can not be fully processed before the start of the next event. The Peripheral has to retransmit the reply in the next connection event.

The most straigeforward way is adding a redundancy time considering the timing error and processing time. If we still set the connsm->eff_max_rx_time to 2120 us, the Centeral choose not to transmit an extra empty packet since the Peripheral does not uses the MIC field, it acutall brings 32 us redundancy time.

andrzej-kaczmarek commented 1 month ago

there's BLE_LL_CONN_EVENT_END_MARGIN syscfg added exactly for this purpose

sada45 commented 1 month ago

there's BLE_LL_CONN_EVENT_END_MARGIN syscfg added exactly for this purpose

Oh, BLE_LL_CONN_EVENT_END_MARGIN seems to be a new feature in the later version. I check the latest ble_ll_conn_get_next_sched_time() function and I think wether it is necessary to consider the BLE_LL_CONN_EVENT_END_MARGIN after the Line 953 ce_end = next_sched_time;? Since the issue I find is that the reply packet receiving is interruptted by the connection event of other connections. We should reserve the time for timing error and processing.

andrzej-kaczmarek commented 1 month ago

value assigned to ce_end in line 953 is the start time of next scheduled item which is different than current connection so it does not interfere with current connection event, i.e. even if we tx/rx packet at the very end of connection event, it can be processed while next scheduled item is being handled

ce_end calculations done before that for actual connection event end are adjusted for BLE_LL_CONN_EVENT_END_MARGIN because after tx/rx of last packet in connection event we need some time to reschedule for next connection event

sada45 commented 1 month ago

I still believe the start of the connection event of another connection can interrupt the current recepition.

Assume because of the timing error, the rx_end_isr is issued a little bit late after the start of the next scheduled item of anther connection. In ble_ll_sched_execute_item() function (Line 976 in ble_ll_sched.c), it always disable the physical layer first, which disable the interrupt. After that, even the rx_end_isr is issued and the it will not pass the check at Line 1490 in ble_phy.c. Therefore, once the next scheduled item of another connection is being handled, the reply packet can not be processed