embassy-rs / nrf-softdevice

Apache License 2.0
256 stars 76 forks source link

Panic when a connection is initiated #175

Closed mehmetalianil closed 1 year ago

mehmetalianil commented 1 year ago

Summary

Some change on bcd957ee81b31f70cc402fbab438d5d2ff526c45 results in a panic when advertisement starts. After some change on 2a086aa3a87de27297b6de8bd07f7412697e057c this was eliminated, only to panic only when an advertisement is connected to. After a RawError(InvalidState) This results in an advertisement timeout. We froze our nrf-softdevice requirement at 36acd5002b4fd41adb5e3c413019b3c8c7cf4bb2, which is the latest commit that does not panic.

Code

What we do differently is that we dynamically change the advertisement timeout and advertisement interval. Therefore advertisement drops periodically and bluetooth task loop initiates it again periodically. Other than that we have too many characteristics (50 and counting)

       let advresult = with_timeout(Duration::from_millis(server.rst.advtimeout_get().unwrap() as u64), peripheral::advertise_connectable(sd, adv, &config)).await;
        match advresult {
            Ok(result) => {
        let conn = unwrap!(result);
        if let Ok(mut conn_mut_ref) = conn_ref.try_borrow_mut()  {
            conn_mut_ref.replace(conn.clone());
        };
        server.rst.connstate_set(&true).expect("Couldn't set connstate to server");
        info!("advertising done!");

RawError(InvalidState) logs

43.954406 INFO  Attempting Advertisement!
└─ bart_firmware::tasks::bluetooth::__bluetooth_task_task::{async_fn#0} @ src\tasks\bluetooth.rs:104
43.954437 INFO  config interval is 250
└─ bart_firmware::tasks::bluetooth::__bluetooth_task_task::{async_fn#0} @ src\tasks\bluetooth.rs:105
44.438507 TRACE ble evt 16
└─ nrf_softdevice::ble::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:112
44.438537 DEBUG conn_params conn_sup_timeout=500 max_conn_interval=32 min_conn_interval=32 slave_latency=0
└─ nrf_softdevice::ble::gap::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:125
44.438598 DEBUG connected role=Peripheral peer_addr=RandomPrivateResolvable:[d, 5a, f0, 52, 46, 5b]
└─ nrf_softdevice::ble::peripheral::advertise_inner::{async_fn#0}::{closure#1} @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:125
44.438690 TRACE conn 0: connected
└─ nrf_softdevice::ble::connection::{impl#7}::new::{closure#0} @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:112
44.550262 TRACE ble evt 36
└─ nrf_softdevice::ble::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:112
44.550323 DEBUG on_data_length_update conn_handle=1 max_rx_octets=27 max_rx_time_us=328 max_tx_octets=251 max_tx_time_us=2120
└─ nrf_softdevice::ble::gap::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:125
45.030395 TRACE ble evt 18
└─ nrf_softdevice::ble::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:112
45.030426 DEBUG on_conn_param_update conn_handle=1 conn_sup_timeout=500 max_conn_interval=6 min_conn_interval=6 slave_latency=0
└─ nrf_softdevice::ble::gap::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:125
46.578002 TRACE ble evt 18
└─ nrf_softdevice::ble::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:112
46.578033 DEBUG on_conn_param_update conn_handle=1 conn_sup_timeout=500 max_conn_interval=32 min_conn_interval=32 slave_latency=0
└─ nrf_softdevice::ble::gap::on_evt @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:125
50.451141 DEBUG VDD Voltage: 3.3152342
└─ bart_firmware::tasks::battery::__battery_task_task::{async_fn#0} @ src\tasks\battery.rs:19
51.454528 WARN  sd_ble_gap_adv_stop: InvalidState
└─ nrf_softdevice::ble::peripheral::advertise_inner::{async_fn#0}::{closure#0} @ C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\nrf-softdevice-03ef4aef10e777e4\79fb6f3\nrf-softdevice\src\fmt.rs:151

Log panic without RawError

9.950073 INFO  Attempting Advertisement!
└─ bart_firmware::tasks::bluetooth::__bluetooth_task_task::{async_fn#0} @ src\tasks\bluetooth.rs:104
9.950103 INFO  config interval is 250
└─ bart_firmware::tasks::bluetooth::__bluetooth_task_task::{async_fn#0} @ src\tasks\bluetooth.rs:105
────────────────────────────────────────────────────────────────────────────────
stack backtrace:
   0: HardFaultTrampoline
      <exception entry>
   1: bart_firmware::tasks::bluetooth::__bluetooth_task_task::{{closure}}
   2: embassy_executor::raw::TaskStorage<F>::poll
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:161:15   
   3: core::cell::Cell<T>::get
        at C:\Users\Mehmet Ali Anil\.rustup\toolchains\nightly-2023-02-07-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\cell.rs:452:18
   4: embassy_executor::raw::timer_queue::TimerQueue::update
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\timer_queue.rs:30:12
   5: embassy_executor::raw::Executor::poll::{{closure}}
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:422:17   
   6: embassy_executor::raw::run_queue::RunQueue::dequeue_all
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\run_queue.rs:70:13
   7: embassy_executor::raw::Executor::poll
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:395:13   
   8: embassy_executor::arch::Executor::run
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\arch\cortex_m.rs:55:17
   9: bart_firmware::____embassy_main_task::{{closure}}
        at src\main.rs:598:5
  10: embassy_executor::raw::TaskStorage<F>::poll
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:161:15   
  11: core::cell::Cell<T>::get
        at C:\Users\Mehmet Ali Anil\.rustup\toolchains\nightly-2023-02-07-x86_64-pc-windows-msvc\lib\rustlib\src\rust\library\core\src\cell.rs:452:18
  12: embassy_executor::raw::timer_queue::TimerQueue::update
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\timer_queue.rs:30:12
  13: embassy_executor::raw::Executor::poll::{{closure}}
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:422:17   
  14: embassy_executor::raw::run_queue::RunQueue::dequeue_all
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\run_queue.rs:70:13
  15: embassy_executor::raw::Executor::poll
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\raw\mod.rs:395:13   
  16: embassy_executor::arch::Executor::run
        at C:\Users\Mehmet Ali Anil\.cargo\git\checkouts\embassy-9312dcb0ed774b29\0555970\embassy-executor\src\arch\cortex_m.rs:55:17
  17: bart_firmware::__cortex_m_rt_main
        at src\main.rs:452:1
  18: main
        at src\main.rs:452:1
  19: Reset
  (HOST) WARN  call stack was corrupted; unwinding could not be completed

Other data

Tested every commit.

7e13a0dcb6ba2ee0ebecb55a9cbfc87016ec4761->works 0b1cafe91900fb004758d27f29cd0091db36069b -> works 7989ce36bc30e2b8fa2ab0ed95d0774bb69d9e07-> works 36acd5002b4fd41adb5e3c413019b3c8c7cf4bb2 -> works bcd957ee81b31f70cc402fbab438d5d2ff526c45 -> panics before connection b6514913d9cd44e518b4310f2b8140e0e132b0f5 -> panics before connection 2a086aa3a87de27297b6de8bd07f7412697e057c -> panics with InvalidState 7b4d9d02b81b7261c1758719b9cf0a7f4b986ddb -> panics with InvalidState 5e00dbc77747cf53a54c3ea1f5b95d61a8bc086f -> panics with InvalidState

Yandrik commented 1 year ago

Thanks for the comprehensive tests. My portal rewrite probably threw a wrench in there somewhere. I'll take a look later

(bcd957e was a complete rewrite and restructure of the portal, so there's some deadlock or something happening there that shouldn't exist. The reintroduction of DropBombs causing it to panic gives me a hint as to what it could be)

mehmetalianil commented 1 year ago

Yes, this fix resolved my issue. Thank you so much! I am not sure how you actually caught the bug without making me reproduce again on my machine. (Sorry, I was away.) Will check the chat logs now.