embassy-rs / nrf-softdevice

Apache License 2.0
269 stars 80 forks source link

Inappropriate (?) panic from ble::connection::ConnectionState::disconnect() in response to NRF_ERROR_INVALID_STATE #281

Open peter9477 opened 1 week ago

peter9477 commented 1 week ago

The code in ConnectionState::disconnect() calls disconnect_with_reason() which unwraps any error result from the call to sd_ble_gap_disconnect(). That code in bindings.rs is documented as potentially returning success, or one of 3 possible errors. Two of the errors may legitimately be considered internal failures that should never happen, but it appears the third, NRF_ERROR_INVALID_STATE can occur in normal operation, though under unusual circumstances. To put it another way, it appears it can occur in a way that can't be anticipated, but doesn't represent a system failure deserving a panic. Unfortunately it currently panics because of the unwrap(), and I'm not sure there would be any way to avoid it.

#[doc = " @retval ::NRF_SUCCESS The disconnection procedure has been started successfully."]
#[doc = " @retval ::NRF_ERROR_INVALID_PARAM Invalid parameter(s) supplied."]
#[doc = " @retval ::BLE_ERROR_INVALID_CONN_HANDLE Invalid connection handle supplied."]
#[doc = " @retval ::NRF_ERROR_INVALID_STATE Disconnection in progress or link has not been established."]
#[inline(always)]
pub unsafe fn sd_ble_gap_disconnect(conn_handle: u16, hci_status_code: u8) -> u32 {

We've observed this once while a unit was connected to a host via WebBluetooth, and the host eventually had a browser crash which led to this panic occurring on the device.

This one error should probably be caught and turned into a different valid error response. Possibly we should just map it to the existing DisconnectedError, since it's probably very rare and that covers the second half of the "disconnection in progress or link has not been established" that InvalidState supposedly covers. The "disconnection in progress" part may just be a report on what amounts to an internal (safe) race condition between trying to disconnect while something (like the peer) has already triggered a disconnect which is currently in progress.

Anyone have thoughts on this?

alexmoon commented 1 week ago

Yes, I agree that DisconnectedError should be returned in that case. nrf_softdevice tries to ensure the connection is in a valid state before calling sd_ble_gap_disconnect, but if an internal disconnection happens and ConnectionState::disconnect() is called before the disconnection event is processed you could get that error.