ARMmbed / mbed-os

Arm Mbed OS is a platform operating system designed for the internet of things
https://mbed.com

BLE Advertising Stops After Few Minutes (when CPU is idle) #15248

Open ipener opened 2 years ago

ipener commented 2 years ago

Description of defect

Mbed OS seems to stop advertising over Bluetooth® LE after the CPU has been idle for about 5-10 minutes. The issue occurs on an Arduino Nano 33 BLE with an nRF52840 SoC, which supports low-power modes. I suspect that Mbed OS enters sleep/deep sleep mode after some time and is unable to wake up from it. To prevent the system from going to sleep, I’ve tried calling sleep_manager_lock_deep_sleep as well as implementing a no-op idle hook and calling rtos::Kernel::attach_idle_hook(&idle_hook) before advertising, as suggested here, but unfortunately the problem persists.

Target(s) affected by this defect ?

ARDUINO_NANO33BLE

Toolchain(s) (name and version) displaying this defect ?

Arduino Nano 33 BLE, nRF52840 SoC

What version of Mbed-os are you using (tag or sha) ?

mbed-os-6.15.1

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

macOS 12.2.1, Mbed Studio 1.4.3 (1.4.3.1), ARMC6 compiler, LightBlue®/iOS CoreBluetooth (central scanning for the peripheral)

How is this defect reproduced ?

Here’s the minimal code (without any error handling) to replicate the issue:

#include "mbed.h"
#include "BLE.h"
#include "events/EventQueue.h"

using namespace ble;
using namespace mbed;

static uint8_t advData[31] = {
    0x02, 0x01, 0x06, 0x12, 0xFF, 0x4C, 0x00, 0x06,
    0x2D, 0x01, 0xFE, 0x6D, 0x74, 0x72, 0xD9, 0xCC,
    0x05, 0x00, 0x01, 0x00, 0x01, 0x02, 0x08, 0x08,
    0x41, 0x63, 0x6D, 0x65, 0x20, 0x4C, 0x69
};
static events::EventQueue eq;

void advertise() {
    AdvertisingParameters params(
        advertising_type_t::CONNECTABLE_UNDIRECTED,
        adv_interval_t(millisecond_t(20))
    );

    auto &ble = BLE::Instance();
    auto &gap = ble.gap();

    gap.setAdvertisingParameters(LEGACY_ADVERTISING_HANDLE, params);
    gap.setAdvertisingPayload(LEGACY_ADVERTISING_HANDLE, { advData, sizeof(advData) });
    gap.startAdvertising(LEGACY_ADVERTISING_HANDLE);
}

struct EventHandler: public Gap::EventHandler {
    void onDisconnectionComplete(const DisconnectionCompleteEvent &event) override {
        advertise();
    }
};

void onInitComplete(BLE::InitializationCompleteCallbackContext *event) {
    if (!event->error) {
        advertise();
    }
}

void scheduleEvents(BLE::OnEventsToProcessCallbackContext *event) {
    eq.call(&event->ble, &BLE::processEvents);
}

int main() {
    EventHandler handler;

    auto &ble = BLE::Instance();
    auto &gap = ble.gap();

    gap.setEventHandler(&handler);
    ble.onEventsToProcess(scheduleEvents);
    ble.init(onInitComplete);

    eq.dispatch_forever();
}

I’ve used LightBlue® to connect to the Arduino. If I keep connecting/disconnecting every 2-3 minutes the system doesn’t seem to go into sleep mode. However, if I keep it disconnected for 5-10 minutes it’s no longer discoverable. This clearly seems like a bug to me since I didn't observe any issues running a similar minimal program built with the nRF Connect SDK. Should this be a mere configuration issue then feel free to close the bug and reply here.
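For readers trying to make sense of the advData bytes in the listing above, here is a minimal host-side sketch (plain C++, no Mbed OS required) that splits the payload into its length/type/data AD structures. The names parseAdvertisingPayload, localName and kAdvData are hypothetical helpers introduced for illustration only; they are not part of the Mbed OS BLE API. Under the standard AD type assignments, the payload appears to carry a Flags field (0x01), a manufacturer-specific field (0xFF) whose first two bytes are the little-endian company ID 0x004C, and a shortened local name (0x08).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One AD structure from a BLE advertising payload: [length][type][data...].
struct AdStructure {
    uint8_t type;               // AD type, e.g. 0x01 = Flags
    std::vector<uint8_t> data;  // payload bytes, excluding length and type
};

// Splits a raw advertising payload into its AD structures.
// Stops at a zero length byte or at a structure that would overrun the buffer.
std::vector<AdStructure> parseAdvertisingPayload(const uint8_t *payload, std::size_t size) {
    assert(payload != nullptr || size == 0);
    std::vector<AdStructure> out;
    std::size_t i = 0;
    while (i < size) {
        const std::size_t len = payload[i];  // counts the type byte plus the data bytes
        if (len == 0 || i + 1 + len > size) {
            break;
        }
        AdStructure ad;
        ad.type = payload[i + 1];
        ad.data.assign(payload + i + 2, payload + i + 1 + len);
        out.push_back(ad);
        i += 1 + len;
    }
    return out;
}

// Returns the shortened (0x08) or complete (0x09) local name, or "" if absent.
std::string localName(const std::vector<AdStructure> &ads) {
    for (const AdStructure &ad : ads) {
        if (ad.type == 0x08 || ad.type == 0x09) {
            return std::string(ad.data.begin(), ad.data.end());
        }
    }
    return std::string();
}

// Same 31 bytes as advData in the reproduction code (hypothetical copy for host testing).
static const uint8_t kAdvData[31] = {
    0x02, 0x01, 0x06, 0x12, 0xFF, 0x4C, 0x00, 0x06,
    0x2D, 0x01, 0xFE, 0x6D, 0x74, 0x72, 0xD9, 0xCC,
    0x05, 0x00, 0x01, 0x00, 0x01, 0x02, 0x08, 0x08,
    0x41, 0x63, 0x6D, 0x65, 0x20, 0x4C, 0x69
};
```

Calling localName(parseAdvertisingPayload(kAdvData, sizeof(kAdvData))) yields the name a scanner such as LightBlue® would be expected to display for this peripheral.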

ghost commented 2 years ago

Did you have any such problems with the examples here? https://github.com/ARMmbed/mbed-os-example-ble/ I used variations of those and didn't have any such problems on nrf5(1|2) chips.

ipener commented 2 years ago

Yes, the issue persists also with the BLE examples. I've just tried the simplest one, BLE_Advertising, and the advertising stops after 5-10min.

ghost commented 2 years ago

I wonder if that's the reason for https://github.com/ARMmbed/mbed-os-ble-utils/issues/10 in this utility library, which was the last serious attempt at putting a "framework" for simple use cases on top of the BLE APIs.

ipener commented 2 years ago

I think this is unrelated since neither the minimal code I posted above nor the BLE_Advertising example actually use mbed-os-ble-utils.

ghost commented 2 years ago

It's not unrelated, in that I've used both of those things and didn't have the problem you're having. My current approach is based on mbed-os-ble-utils, so I have more recent experience of your problem not happening with that library than with the official examples. That's the only reason I suggested it.

ipener commented 2 years ago

Any update on this? Is anyone looking into this?

Gerriko commented 2 years ago

I came across this issue, and I'm rather curious myself as to the cause. In my experience these types of problems require a few random tests to rule possible issues in or out.

So, question 1: does the behaviour change if you never connect to the advertising device, versus if you connect once or more than once, then disconnect, and then wait for the 5-10 mins before trying again?

Then, for my 2nd question, I was wondering whether the time it took before you could no longer connect changed if you modified adv_interval_t(millisecond_t(20)) to, say, adv_interval_t(millisecond_t(200)) and then, say, adv_interval_t(millisecond_t(2000)). Did it make no difference, or did it change?

ipener commented 2 years ago

Thanks for the suggestions but unfortunately the behaviour stays the same in both cases. So far the only way to get the device to keep advertising is to periodically connect to it before the ~5min threshold is reached. As mentioned above, there are no issues when using the nRF Connect SDK.

Gerriko commented 2 years ago

So far the only way to get the device to keep advertising is to periodically connect to it before the ~5min threshold is reached.

I have a hunch that the process of frequent connecting / disconnecting is what's causing your problem. I uploaded your code onto an old Particle Xenon dev board (nRF52840) and left the device well alone and simply checked every now and then using a phone app to scan for devices to see if all ok. It is now 15 mins and ticking, as I write this response, and my Xenon device is still advertising every 22ms.
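As a side note on the intervals discussed above, here is a minimal host-side sketch of how a millisecond value maps to the 0.625 ms units used for BLE advertising intervals. It also shows why a scanner might plausibly report about 22 ms for a configured 20 ms: it assumes legacy advertising, where the Bluetooth Core Specification adds a pseudo-random delay of 0 to 10 ms to each advertising event. The helper names advIntervalUnits and expectedSpacing are hypothetical and not part of the Mbed OS API.

```cpp
#include <cassert>
#include <cstdint>

// Converts milliseconds to BLE advertising interval units of 0.625 ms (rounding down).
// Asserts that the result is inside the legacy advertising range 0x0020..0x4000
// (20 ms to 10.24 s).
uint32_t advIntervalUnits(uint32_t milliseconds) {
    const uint32_t units = (milliseconds * 1000u) / 625u;
    assert(units >= 0x0020u && units <= 0x4000u);
    return units;
}

// Assumption: legacy advertising, where each event is delayed by a
// pseudo-random 0 to 10 ms on top of the configured interval.
const uint32_t kMaxAdvDelayMs = 10;

struct EventSpacingMs {
    uint32_t min;  // configured interval with no extra delay
    uint32_t max;  // configured interval plus the maximum random delay
};

// Expected range of spacing between consecutive advertising events.
EventSpacingMs expectedSpacing(uint32_t intervalMs) {
    EventSpacingMs spacing;
    spacing.min = intervalMs;
    spacing.max = intervalMs + kMaxAdvDelayMs;
    return spacing;
}
```

Under these assumptions, the three intervals suggested earlier (20, 200 and 2000 ms) all fall inside the legal range, and an observed spacing of roughly 22 ms is consistent with a configured interval of 20 ms.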

ipener commented 2 years ago

I have a hunch that the process of frequent connecting / disconnecting is what's causing your problem.

Not sure I follow the hunch. The frequent connecting/disconnecting is not something I purposefully want to do but so far it's the only way I manage to keep the device from going into deep sleep.

I uploaded your code onto an old Particle Xenon dev board (nRF52840) and left the device well alone and simply checked every now and then using a phone app to scan for devices to see if all ok. It is now 15 mins and ticking, as I write this response, and my Xenon device is still advertising every 22ms.

That's good to know and could mean it's a board-specific problem. This is a long shot, but could this be some board-specific configuration that Mbed Studio is running as part of the build? I know there's a bunch of code to wire up pins, set up peripherals, lay out ROM, etc. that is compiled and then executed when Mbed OS boots up the program. Maybe some registers need to be set such that BLE activity will wake up the board. According to the official specs, the nRF52840 SoC does support this.

Gerriko commented 2 years ago

Sounds like you might be trying to solve this through reasoning alone rather than testing any/all assumptions made. My hunch was based on my understanding that you can only have one BLE instance. This is just a hunch, and I cannot validate it. So here is some alternative code that does away with auto. It's worth a test on your side to see if it makes any difference.

/* mbed Microcontroller Library
 * Copyright (c) 2019 ARM Limited
 * SPDX-License-Identifier: Apache-2.0
 */

#include "mbed.h"

#include "BLE.h"
#include "events/EventQueue.h"

using namespace ble;
using namespace mbed;

static uint8_t advData[31] = {
    0x02, 0x01, 0x06, 0x12, 0xFF, 0x4C, 0x00, 0x06,
    0x2D, 0x01, 0xFE, 0x6D, 0x74, 0x72, 0xD9, 0xCC,
    0x05, 0x00, 0x01, 0x00, 0x01, 0x02, 0x08, 0x08,
    0x41, 0x63, 0x6D, 0x65, 0x20, 0x4C, 0x69
};

static events::EventQueue eq;
static BLE &_ble = BLE::Instance();

void advertise() {
    AdvertisingParameters params(
        advertising_type_t::CONNECTABLE_UNDIRECTED,
        adv_interval_t(millisecond_t(20))
    );

    _ble.gap().setAdvertisingParameters(LEGACY_ADVERTISING_HANDLE, params);
    _ble.gap().setAdvertisingPayload(LEGACY_ADVERTISING_HANDLE, { advData, sizeof(advData) });
    _ble.gap().startAdvertising(LEGACY_ADVERTISING_HANDLE);
}

struct EventHandler: public Gap::EventHandler {
    void onDisconnectionComplete(const DisconnectionCompleteEvent &event) override {
        advertise();
    }
};

void onInitComplete(BLE::InitializationCompleteCallbackContext *event) {
    if (!event->error) {
        advertise();
    }
}

void scheduleEvents(BLE::OnEventsToProcessCallbackContext *event) {
    eq.call(&event->ble, &BLE::processEvents);
}

int main() {
    EventHandler handler;

    _ble.gap().setEventHandler(&handler);
    _ble.onEventsToProcess(scheduleEvents);
    _ble.init(onInitComplete);

    eq.dispatch_forever();
}

ipener commented 2 years ago

Thanks again for the input. BLE is a singleton, and as such the static BLE::Instance() method should return the same instance every time. I've tried your solution as well as an alternative where _ble is not a global variable but a local one initialized in the main() function, unfortunately without success. I've timed it better this time, and it appears that the advertising stops after around 8’45”. Attached is a screen recording of the last 20s showing how the Acme Li becomes undiscoverable.

https://user-images.githubusercontent.com/98188370/167451581-4d93f655-050e-42c2-a9c6-d368faa5b5c6.mp4

ipener commented 1 year ago

Any updates on this issue?