muka / go-bluetooth

Golang bluetooth client based on bluez DBus interfaces
Apache License 2.0

Sending to closed channel if dbus is lagging in high traffic environments #127

Closed by 1am 3 years ago

1am commented 3 years ago

Hello,

I've been experiencing random crashes on closed channels while using the library. I tried refactoring my application code around the library in various ways, mainly to handle high volumes of BLE scan results and to properly handle the connection being terminated by the peripheral at random moments. For the most part it was possible to rewrite the code to handle these edge cases, but when running tests the application still crashes roughly once per ~5k tests with:

panic: send on closed channel

goroutine 281260 [running]:
github.com/muka/go-bluetooth/bluez.WatchProperties.func1(0xc0e140, 0x306840, 0x8eb1a0, 0x8e0080)
        /home/..../go-bluetooth/bluez/props.go:95 +0x6a8
created by github.com/muka/go-bluetooth/bluez.WatchProperties
        /home/..../go-bluetooth/bluez/props.go:34 +0x1c0

This happened in a few places, and I think there is no way around it except suppressing the panic, which I've done in https://github.com/muka/go-bluetooth/pull/123/files. So far it seems to be working well.
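For readers unfamiliar with the idea, here is a minimal sketch of what suppressing a "send on closed channel" panic can look like in Go. It is not the code from the PR; `safeSend` is a hypothetical helper name and the `string` payload is only a placeholder for whatever the real channel carries.

```go
package main

import "fmt"

// safeSend is a hypothetical helper, not part of go-bluetooth.
// It reports false instead of panicking when ch is already closed.
func safeSend(ch chan<- string, v string) (sent bool) {
	defer func() {
		// A send on a closed channel panics; recover turns that
		// into a "not sent" result for the caller.
		if r := recover(); r != nil {
			sent = false
		}
	}()
	ch <- v
	return true
}

func main() {
	ch := make(chan string, 1)
	fmt.Println(safeSend(ch, "a")) // true

	close(ch)
	fmt.Println(safeSend(ch, "b")) // false
}
```

The trade-off is that late events are silently dropped, which is acceptable here because the consumer has already stopped listening.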

Here are some example test scenarios that triggered the issue.

Scanning test with hundreds of unique advertisers per second

  1. Start BLE scanning
  2. Wait two seconds
  3. Stop BLE scanning

This crashed very frequently because some advertisements still came in after step 3. The channel was closed by then, for a legitimate reason, but the write to the closed channel took the application down.
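One common Go pattern for this kind of race (events arriving after the consumer has stopped) is to never close the data channel and to signal shutdown through a separate channel instead. The sketch below only illustrates that pattern; `watcher`, `emit` and `Stop` are hypothetical names, not the go-bluetooth API, and whether it fits the library's internals is an open question.

```go
package main

import (
	"fmt"
	"sync"
)

// watcher is a hypothetical type for illustration only.
type watcher struct {
	events chan string   // never closed, so a late send cannot panic
	done   chan struct{} // closed exactly once by Stop
	once   sync.Once
}

func newWatcher() *watcher {
	return &watcher{
		events: make(chan string, 1),
		done:   make(chan struct{}),
	}
}

// Stop signals producers to stop; it is safe to call more than once.
func (w *watcher) Stop() {
	w.once.Do(func() { close(w.done) })
}

// emit delivers ev to the consumer, or reports false if the
// watcher has been stopped. It blocks while the buffer is full
// and the watcher is still running.
func (w *watcher) emit(ev string) bool {
	// Check for shutdown first so a stopped watcher never accepts events.
	select {
	case <-w.done:
		return false
	default:
	}
	select {
	case w.events <- ev:
		return true
	case <-w.done:
		return false
	}
}

func main() {
	w := newWatcher()
	fmt.Println(w.emit("adv-1")) // true

	w.Stop()
	fmt.Println(w.emit("adv-2")) // false
}
```

With this shape, an advertisement that arrives after scanning has stopped is dropped by `emit` rather than crashing the process.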

Terminating connection by peripheral

  1. Connect to specific device
  2. Subscribe for notifications
  3. Read/write some data
  4. Unsubscribe
  5. Disconnect

The random reset of the peripheral device was triggered somewhere between steps 2 and 4, which is where I found it most problematic for the application I'm working on. Eventually I fixed it after some effort, but it was still possible for a characteristic notification ("Notify" true) to be sent over DBus long after the device's "Connected" false had been handled by the application. This happens both on a fairly powerful PC with an i7 and 16 GB of RAM and on an embedded SOM with 400 MHz and 64 MB of RAM, both of which are my target platforms.

I think these kinds of situations cannot be addressed in any way other than being able to carry on after they happen.

muka commented 3 years ago

Hi, I just merged your PR #123. Let me know if you have any updates on the situation. If you have a different approach to avoid the problem, I would be glad to discuss it and integrate it into the codebase. Thanks!

1am commented 3 years ago

Thank you. So far I don't really have a better idea. I think it depends more on events arriving from DBus in random order than on anything that can be done by the library. But if I find something, I will let you know for sure.