crownstone / bluenet

Bluenet is the in-house firmware on Crownstone hardware. Functions: switching, dimming, energy monitoring, presence detection, indoor localization, switchcraft.
https://crownstone.rocks
91 stars 62 forks source link

Request scan response from the mesh code #67

Closed mrquincle closed 3 years ago

mrquincle commented 5 years ago

Currently, advertising, scanning, and meshing is all done separately. This introduces competition with respect to the same radio peripheral in the end. The new mesh SDK has implemented code where this can (partly!) be done in a proper manner:

mrquincle commented 5 years ago

Note, it seems - needs to be verified - that the scan responses are actually passed through to the application. So, the first part of receiving scan responses is done.

Then the second part, about sending a scan request needs still to be added. Normally by adding .active to the corresponding scan struct.

static ble_gap_scan_params_t const m_scan_param =                                                                       
{                                                                                                                       
    .active        = 0x01, 
    ....
}

First thing to check is if this is somehow easy to add somewhere in the mesh code.

mrquincle commented 5 years ago

First implementation to do scanning implemented in a fork of the mesh SDK: https://github.com/mrquincle/nRF5-SDK-for-Mesh, namely with commit https://github.com/mrquincle/nRF5-SDK-for-Mesh/commit/f9f55d7279f1afa5fe0998b08a0bb074ddee2313

mrquincle commented 5 years ago

Implementation at https://github.com/mrquincle/nRF5-SDK-for-Mesh does now perform scan requests with latest commit. There are two things to be checked:

For now implementation is halted, after a while app_error_handler is triggered from restartAdvertising with error 8, which is NRF_ERROR_INVALID_STATE. Let's continue if there is no advertising done from outside the mesh code.

mrquincle commented 5 years ago

Implementation at https://github.com/mrquincle/nRF5-SDK-for-Mesh in scan-request branch now contains - much better - code from Nordic. :-)

mrquincle commented 5 years ago

Currently, what has already been checked:

This has been tested in the elevator, a nice Faraday cage. :-)

What has to be checked:

<t:    3443814>, net_beacon.c,  265, BEACON TX AF:11:FF:C4
<t:    3686631>, access.c,  426, TX: [aop: 0x00C0] 
<t:    3686635>, access.c,  427, TX: Msg: 0200000000
<t:    3771455>, net_beacon.c,  265, BEACON TX AF:11:FF:C4
<t:    4099135>, net_beacon.c,  265, BEACON TX AF:11:FF:C4
<error> app: Fatal error

With gdb:

Program received signal SIGTRAP, Trace/breakpoint trap.
0x0003a498 in app_error_fault_handler (id=<optimized out>, pc=<optimized out>, info=<optimized out>) at /home/anne/software/nrf5_sdk/nRF5_SDK_15.3.0_59ac345/components/libraries/util/app_error_weak.c:100
100     NRF_BREAKPOINT_COND;
(gdb) bt
#0  0x0003a498 in app_error_fault_handler (id=<optimized out>, pc=<optimized out>, info=<optimized out>) at /home/anne/software/nrf5_sdk/nRF5_SDK_15.3.0_59ac345/components/libraries/util/app_error_weak.c:100
#1  0x00033c4c in mesh_assertion_handler (pc=<optimized out>) at /home/anne/workspace/bluenet/source/src/util/cs_BleError.cpp:86
#2  0x0003ff38 in packet_buffer_commit (p_buffer=p_buffer@entry=0x200069ec <m_scanner+132>, p_packet=0x0, length=<optimized out>) at /home/anne/workspace/nRF5-SDK-for-Mesh/mesh/core/src/packet_buffer.c:274
#3  0x00043e24 in radio_handle_rx_end_event () at /home/anne/workspace/nRF5-SDK-for-Mesh/mesh/bearer/src/scanner.c:336
#4  scanner_radio_irq_handler () at /home/anne/workspace/nRF5-SDK-for-Mesh/mesh/bearer/src/scanner.c:507
#5  0x0004377e in bearer_handler_radio_irq_handler () at /home/anne/workspace/nRF5-SDK-for-Mesh/mesh/bearer/src/bearer_handler.c:441
#6  0x0003d08a in radio_signal_callback (sig=<optimized out>) at /home/anne/workspace/nRF5-SDK-for-Mesh/mesh/core/src/timeslot.c:460
#7  0x000246ba in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

This is the second assert in:

void packet_buffer_commit(packet_buffer_t * const p_buffer, packet_buffer_packet_t * const p_packet, uint16_t length)   
{                                                                                                                       
    NRF_MESH_ASSERT(NULL != p_buffer);                                                                                  
    NRF_MESH_ASSERT(NULL != p_packet);       

However, I've also seen it having trouble with the assert in free.

void packet_buffer_free(packet_buffer_t * const p_buffer, packet_buffer_packet_t * const p_packet)  {                                                                                                                       
    NRF_MESH_ASSERT(NULL != p_buffer);                                                                                  
    NRF_MESH_ASSERT(NULL != p_packet);
mrquincle commented 5 years ago

Also encountering:

app_error_handler (error_code=error_code@entry=8, line_num=line_num@entry=706, p_file_name=p_file_name@entry=0x51312 "/home/anne/workspace/bluenet/source/src/ble/cs_Stack.cpp") at /home/anne/workspace/bluenet/source/src/util/cs_BleError.cpp:32
32      __asm("BKPT");
(gdb) bt
#0  app_error_handler (error_code=error_code@entry=8, line_num=line_num@entry=706, p_file_name=p_file_name@entry=0x51312 "/home/anne/workspace/bluenet/source/src/ble/cs_Stack.cpp") at /home/anne/workspace/bluenet/source/src/util/cs_BleError.cpp:32
#1  0x00028534 in Stack::restartAdvertising (this=0x200027f8 <Stack::getInstance()::instance>) at /home/anne/workspace/bluenet/source/src/ble/cs_Stack.cpp:706
#2  0x00028ea0 in func::function<void (unsigned short)>::operator()(unsigned short) const (arguments#0=<optimized out>, this=0x2000283c <Stack::getInstance()::instance+68>) at /home/anne/workspace/bluenet/source/include/third/std/function.h:519
#3  Stack::on_connected (this=this@entry=0x200027f8 <Stack::getInstance()::instance>, p_ble_evt=p_ble_evt@entry=0x20003d68 <Timer::init()::APP_SCHED_BUF+4704>) at /home/anne/workspace/bluenet/source/src/ble/cs_Stack.cpp:1251
#4  0x00028b60 in Stack::onBleEvent (this=0x200027f8 <Stack::getInstance()::instance>, p_ble_evt=0x20003d68 <Timer::init()::APP_SCHED_BUF+4704>) at /home/anne/workspace/bluenet/source/src/ble/cs_Stack.cpp:992
#5  0x0003a3ee in app_sched_execute () at /home/anne/software/nrf5_sdk/nRF5_SDK_15.3.0_59ac345/components/libraries/scheduler/app_scheduler.c:280
#6  0x00037f64 in Crownstone::run (this=this@entry=0x2000ff3c) at /home/anne/workspace/bluenet/source/src/cs_Crownstone.cpp:756
#7  0x000384ea in main () at /home/anne/workspace/bluenet/source/src/cs_Crownstone.cpp:1032
#8  0x0002628e in _start ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) list
27  
28      const char * str_error __attribute__((unused)) = NordicTypeName(error);
29      LOGe("%s", str_error);
30      LOGf("FATAL ERROR 0x%X, at %s:%d", error, file, line);
31  
32      __asm("BKPT");
33      while(1) {}
34  }
35  
36  //! Called by softdevice
(gdb) p error
$1 = 8

That makes sense. The cs_Stack.cpp code calls sd_ble_gap_adv_set_configure while - the mesh code is - advertising. This should only be called when advertising has been stopped. If we want to set it while advertising, _adv_data.adv_data.p_data has to be a new buffer and the adv_params parameter has to be NULL.

mrquincle commented 5 years ago

Hypothesis of @vliedel is that the mesh code does not respect the timeslot it got allocated due to the scan response which sends out a request 150 us later.

mrquincle commented 5 years ago

The above code comes from an experimental patch at https://devzone.nordicsemi.com/support/229840, without guarantees and there are no intention to be supported in the standard mesh code according to https://devzone.nordicsemi.com/f/nordic-q-a/38162/ble-scan-packet-receive-latency-when-done-along-side-mesh. This means that it might or might not fall under QDID 111537 which has been filed for version 2.0.1. Newer versions "should" still fall under this, however unofficial patches like above do not.

In practice we haven't needed it. This might have several reasons:

If remembered correctly, it is the third reason.

mrquincle commented 5 years ago

This is not remembered correctly.

Hence, this issue should not be closed, but solved!

mrquincle commented 3 years ago

This patch is maintained in the fork at https://github.com/crownstone/nrf5-sdk-for-mesh.

Hopefully one day the scan requests will be sent from the default nRF5 SDK for mesh. However, there are a few other reasons to maintain a fork (e.g. with respect to us persisting variables outside of the mesh stack). Hence, it's likely that even in that case, a fork will have to be maintained for a much longer time.