stm32duino / STM32duinoBLE

ArduinoBLE library fork to support ST BLE modules
GNU Lesser General Public License v2.1

I2C disconnects BLE #58

Open thijses opened 1 year ago

thijses commented 1 year ago

On the STM32WB55, using I2C (especially frequently/intensively) will disconnect BLE devices, and sometimes even crash the whole microcontroller. Presumably, this is caused by an overbearing HAL_LOCK or interrupt disable in the I2C libraries (twi.h refers back to stm32wbxx_hal_i2c.h). PRINT_IPCC_INFO does not print anything when it disconnects. Is there an immediately obvious way to solve/debug this? I was hoping to avoid getting into the low-level stuff with the STM platform, but i suspect i might need to in order to find this one.

thijses commented 1 year ago

sorry, i realized i should probably be a little more specific: when trying to transmit BLE packets and read/write I2C simultaneously, the whole microcontroller hangs for a few seconds and the BLE connection (to the central) fails. If using I2C without also transmitting BLE packets, the connection appears to survive.

fpistm commented 1 year ago

Well, BLE on the WB uses shared memory and IRQs. A lot of __disable_irq calls are used in the shared-memory code. As I don't have your code, and based on your inputs, it is not a good idea to perform a BLE action and an I2C read "simultaneously", as I2C is interrupt (IT) based.

thijses commented 1 year ago

i know 'simultaneous' is not really true on this kind of semi-single-core microcontroller, but how does one check whether the BLE packets have all finished transmitting? I am using https://github.com/thijses/Arduino-HardwareBLESerial which is (a fork of) a library that just blasts serial data (for easy debugging) over 20-byte BLE packets. This library uses the (legacy(?)) characteristic.setValue(), which refers to writeValue(). Is there a function i can use to wait until all the data in the transmit buffer is successfully transmitted? (or am i fundamentally misunderstanding how the HCI connection to the coprocessor works?)
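For context on the 20-byte figure: with the default ATT MTU of 23 bytes, a notification carries a 3-byte header (opcode plus attribute handle), which leaves 20 bytes of payload. A minimal sketch of that kind of chunking, written in plain C++ so it can be tried off-target, might look like the following. The function name `splitIntoChunks` is hypothetical and is not taken from HardwareBLESerial or STM32duinoBLE.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split a byte buffer into chunks of at most maxChunk bytes.
// 20 is the payload that fits in one notification at the default ATT MTU (23 - 3).
// NOTE: hypothetical helper for illustration, not a library function.
std::vector<std::vector<uint8_t>> splitIntoChunks(const std::vector<uint8_t>& data,
                                                  std::size_t maxChunk = 20) {
  std::vector<std::vector<uint8_t>> chunks;
  if (maxChunk == 0) {
    return chunks;  // nothing sensible to do with a zero-sized chunk
  }
  for (std::size_t offset = 0; offset < data.size(); offset += maxChunk) {
    const std::size_t len = std::min(maxChunk, data.size() - offset);
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(offset);
    chunks.emplace_back(first, first + static_cast<std::ptrdiff_t>(len));
  }
  return chunks;
}
```

Each chunk would then be handed to one characteristic write; a 45-byte message yields chunks of 20, 20 and 5 bytes.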

fpistm commented 1 year ago

Could you share your full code, core version, STM32duinoBLE version, and STM32WB Copro Wireless Binary version? I guess your board is the Nucleo WB55?

thijses commented 1 year ago

yes, of course (sorry):
Copro stack: stm32wb5x_BLE_HCILayer_fw.bin version 1.16.0
platformIO details:
PLATFORM: ST STM32 (15.4.1) > P-Nucleo WB55RG
HARDWARE: STM32WB55RG 64MHz, 192KB RAM, 512KB Flash
PACKAGES:

i was using STM32duinoBLE version 1.2.3, but i just tested and 1.2.4 shows the same behaviour

the whole code is excessively large for this question, and technically under NDA. But i'll scrape together some minimal code to reproduce the issue on my end (not that you could actually replicate the hardware without the I2C devices on my PCB).

right now i have to go, but i'll be back in a few days to test some more (i just remembered BLE.debug() exists)

fpistm commented 1 year ago

Pio uses an old version of the core and we do not support it. WB cube and the stack used here are not aligned.

thijses commented 1 year ago

first of all, thank you for your continued patience;

The issue still persists, but i've got some more hints, now that i've actually enabled BLE.debug(Serial):

edit: turns out NDEBUG was not defined; BLE.debug() was still disabled. When i enable BLE.debug() as well, it crashes when it has to transmit anything demanding over BLE. I'll just stick to 1 debug mode at a time for now.

When it crashes (during extended I2C tests, while it's also trying to print something to the bleSerial, so presumably some packets are still queued up for transmission), it hangs for several seconds, then prints the following (unique) debug line:

ble evt: 0x05 payload: 00 01 08 08 mm evt released: 0x05 buffer addr: 0x20030030

edit: after enabling BLE.debug() (instead of undefining NDEBUG), it spits out the following:

HCI EVENT RX <- 04050400010808
HCI COMMAND TX -> 010A200101
HCI EVENT RX <- 0413050101080100
HCI EVENT RX <- 0413050101080100
HCI EVENT RX <- 0413050101080100
HCI EVENT RX <- 0413050101080100
HCI EVENT RX <- 040E04010A2000
HCI ACLDATA TX -> 0201081B00170004001B0E006D70200920493243657272436F756E7465723A20

I tried to find what evtcode 0x05 represents, but i think i may be a little out of my depth already. Is it possible that the I2C peripheral code uses similar event (software interrupt?) handlers, but that they overlap, and so the BLE event catcher accidentally caught some of the I2C events (or vice versa)? (truly a guess, not even remotely educated)
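For what it's worth, the first line of the BLE.debug() output can be split by hand using the HCI event packet layout from the Bluetooth Core Specification: packet type 0x04 (event), event code, parameter length, then the parameters. For event code 0x05 (Disconnection Complete) the parameters are status, connection handle (little-endian) and reason. The sketch below is only an illustration in plain C++; the helper names are hypothetical and not part of STM32duinoBLE, and only a few reason codes are listed.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert a hex string such as "04050400010808" into raw bytes.
// Assumes an even number of valid hex digits, as printed in the debug log.
std::vector<uint8_t> hexToBytes(const std::string& hex) {
  std::vector<uint8_t> out;
  for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
    out.push_back(static_cast<uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));
  }
  return out;
}

// Decoded fields of an HCI Disconnection Complete event (hypothetical struct).
struct DisconnectionComplete {
  bool valid;
  uint8_t status;
  uint16_t handle;
  uint8_t reason;
};

// Expected layout: 0x04 (event packet), 0x05 (event code), 0x04 (parameter
// length), status, handle LSB, handle MSB, reason.
DisconnectionComplete parseDisconnectionComplete(const std::vector<uint8_t>& pkt) {
  DisconnectionComplete evt = {false, 0, 0, 0};
  if (pkt.size() != 7 || pkt[0] != 0x04 || pkt[1] != 0x05 || pkt[2] != 0x04) {
    return evt;  // not a Disconnection Complete event
  }
  evt.valid = true;
  evt.status = pkt[3];
  // The connection handle occupies the low 12 bits of a little-endian 16-bit field.
  evt.handle = static_cast<uint16_t>((pkt[4] | (pkt[5] << 8)) & 0x0FFF);
  evt.reason = pkt[6];
  return evt;
}

// Names for a few controller error codes; the list is deliberately incomplete.
const char* reasonName(uint8_t reason) {
  switch (reason) {
    case 0x08: return "Connection Timeout";
    case 0x13: return "Remote User Terminated Connection";
    case 0x16: return "Connection Terminated By Local Host";
    case 0x3E: return "Connection Failed to be Established";
    default:   return "other (see the Core Specification error code table)";
  }
}
```

Applied to 04050400010808 this gives status 0x00, handle 0x0801 and reason 0x08, which the specification names Connection Timeout (the link supervision timeout expired). That would suggest the link went silent rather than either side requesting the disconnect, though it is a reading of one log line, not a diagnosis.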

_on a completely unrelated note; the links in src/utility/STM32Cube_FW/README.md are broken, because there's 3 'v's instead of 1 (in multiple places). I noticed that in an earlier release (1.15.0) there were also 2 'v's instead of 1._

thijses commented 1 year ago

little update: i have made some small steps towards understanding the inner-workings of the STM32duinoBLE code. There's a few callbacks, buffers and abstraction layers to get through, but in HCI.cpp a lot of the important stuff is revealed. stuff like:

#define EVT_DISCONN_COMPLETE 0x05
#define EVT_NUM_COMP_PKTS    0x13

which answers my questions about what the event codes mean, which is nice. Unfortunately, evtcode 0x05 is just a disconnection, which doesn't really help me identify why it disconnects. However, after following the path of the 0x13 evtcode, i found _pendingPkt. This handy variable allows me to wait for BLE packets to be 100% done transmitting before doing I2C interactions (to avoid interrupt overlap).
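A wait like that is best bounded by a timeout, so that a stalled link cannot hang the sketch. The following is a minimal sketch of the pattern in plain C++, not the library's API: `waitUntil`, `done` and `poll` are hypothetical names, and any pending-packet counter passed in is a stand-in, since how _pendingPkt is exposed is not shown here. On the board, `poll` would presumably call BLE.poll() and the clock would be millis().

```cpp
#include <chrono>
#include <functional>

// Repeatedly call poll() until done() returns true or the timeout elapses.
// Returns true if done() became true, false on timeout.
// NOTE: hypothetical helper; on the MCU, done() would check the pending-packet
// count and poll() would service BLE events.
bool waitUntil(const std::function<bool()>& done,
               const std::function<void()>& poll,
               std::chrono::milliseconds timeout) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (!done()) {
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // give up instead of blocking forever
    }
    poll();
  }
  return true;
}
```

The I2C burst would only start when the call returns true; a false return is a signal to skip or postpone the I2C work and report the stall.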

However, this did NOT solve my problem :( , it seems that simply the act of doing ~100 I2C read/writes in rapid succession (during a time when the HCI is quiet) causes the BLE to die. I'm not sure whether the act of sending another BLE packet after the I2C stuff is what triggers the disconnect, or if that's just how the disconnect is discovered. Either way, there must be some deeper issue than just competing interrupts.

fpistm commented 1 year ago

> on a completely unrelated note; the links in src/utility/STM32Cube_FW/README.md are broken, because there's 3 'v's instead of 1 (in multiple places). I noticed that in an earlier release (1.15.0) there were also 2 'v's instead of 1.

Thanks, I missed that. I've fixed it and updated the script which updates it automatically: https://github.com/stm32duino/Arduino_Core_STM32/commit/2e754893e1c5d43096558e8e973ccf5d741a28b1

About your issue, I have no clue and will not have time soon to try to reproduce or debug it. As stated, my first thought was an issue with IRQs which block BLE and probably cause it to disconnect, as the connection is considered not maintained, or something similar.

thijses commented 1 year ago

absolutely understandable, and no problem. I'm going to continue digging for the true cause of the issue (which may well be hardware after all). If i find something definitive, i'll let you know.