andersonshatch / soma-ctrl

Node util for controlling SOMA smart shade via MQTT or HTTP
MIT License
21 stars 6 forks source link

Command Disallowed 0xc causes a device to become unresponsive #59

Closed mwitkow closed 5 years ago

mwitkow commented 5 years ago

Got two devices set up on Rapsbian 18.04.

After a certain period of time, the following error occurs:

May 25 21:11:12 elysium somactrl[25680]: noble: unknown peripheral null connected!
May 25 21:11:55 elysium somactrl[25680]: 2019-05-25T21:11:55.361Z soma:ca065a773594 disconnected
May 25 21:11:55 elysium somactrl[25680]: 2019-05-25T21:11:55.362Z soma:ca065a773594 connected for 184.284 seconds
May 25 21:11:55 elysium somactrl[25680]: 2019-05-25T21:11:55.471Z soma:ca065a773594 Error: Command Disallowed (0xc)
May 25 21:11:55 elysium somactrl[25680]:     at NobleBindings.onLeConnComplete (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/bindings.js:220:13)
May 25 21:11:55 elysium somactrl[25680]:     at emitOne (events.js:116:13)
May 25 21:11:55 elysium somactrl[25680]:     at Hci.emit (events.js:211:7)
May 25 21:11:55 elysium somactrl[25680]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 25 21:11:55 elysium somactrl[25680]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 25 21:11:55 elysium somactrl[25680]:     at emitOne (events.js:116:13)
May 25 21:11:55 elysium somactrl[25680]:     at BluetoothHciSocket.emit (events.js:211:7)
May 25 21:11:55 elysium somactrl[25680]: 2019-05-25T21:11:55.472Z soma:ca065a773594 connected!
May 25 21:11:55 elysium somactrl[25680]: noble warning: unknown peripheral ca065a773594
May 25 21:12:08 elysium somactrl[25680]: 2019-05-25T21:12:08.205Z soma:eebe28d042aa disconnected

and earlier this happened to the other device:

ay 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.420Z soma:eebe28d042aa disconnected
May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.420Z soma:eebe28d042aa connected for 182.086 seconds
May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.533Z soma:eebe28d042aa Error: Command Disallowed (0xc)
May 25 19:49:19 elysium somactrl[24059]:     at NobleBindings.onLeConnComplete (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/bindings.js:220:13)
May 25 19:49:19 elysium somactrl[24059]:     at emitOne (events.js:116:13)
May 25 19:49:19 elysium somactrl[24059]:     at Hci.emit (events.js:211:7)
May 25 19:49:19 elysium somactrl[24059]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 25 19:49:19 elysium somactrl[24059]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 25 19:49:19 elysium somactrl[24059]:     at emitOne (events.js:116:13)
May 25 19:49:19 elysium somactrl[24059]:     at BluetoothHciSocket.emit (events.js:211:7)
May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.534Z soma:eebe28d042aa connected!
May 25 19:49:19 elysium somactrl[24059]: noble warning: unknown peripheral eebe28d042aa
May 25 19:49:22 elysium somactrl[24059]: 2019-05-25T19:49:22.996Z soma:ca065a773594 requesting move to 100
May 25 19:49:23 elysium somactrl[24059]: 2019-05-25T19:49:23.015Z soma:eebe28d042aa requesting move to 100

After that, the particular affected device becomes completely unresponsive and stops reacting to MQTT or HTTP endpoint requests and the somactrl needs to be restarted.

I am not a nodejs wizz, but this seems like a case of a simple exception handling?

mwitkow commented 5 years ago

It seems that this could be related to the fact that Rapsberry Pi bluetooth dongle doesn't allow scanning and sending a command at the same time https://github.com/noble/noble/issues/658

mwitkow commented 5 years ago

It seems like Rapsbian and Pi can only support up to 5 devices (I have 4, with 2 other BTLE Xiomi thermometers). But according to https://github.com/noble/noble/issues/260 it sometimes causes snafu.

It seems that errors happen on certain attempts to reconnect. It seems that the problem with the coma-ctrl code is the fact that after an error occurs in connect, the code continues trying to read from a device. See here: https://github.com/andersonshatch/soma-ctrl/blob/master/src/SomaShade.js#L136 prints the error

May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.533Z soma:eebe28d042aa Error: Command Disallowed (0xc)

and then the next line happily printts "connected": https://github.com/andersonshatch/soma-ctrl/blob/master/src/SomaShade.js#L139

May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.534Z soma:eebe28d042aa connected!

Instead it should "sleep" again and try to connect again.

mwitkow commented 5 years ago

OK, I think I worked around my problem. Unfortunately, I'm not an npm guru and I massively failed making this inside the real src/ codebase and then npm installing itt. Instead I edited the "live" npm source.

Here's the log output of tthe successful reconnect:

May 26 08:50:13 elysium somactrl[9449]: 2019-05-26T08:50:13.195Z soma:eebe28d042aa connected for 182.243 seconds
May 26 08:50:13 elysium somactrl[9449]: 2019-05-26T08:50:13.338Z soma:eebe28d042aa error while connecting below, will schedule a reconnect in 5s
May 26 08:50:13 elysium somactrl[9449]: 2019-05-26T08:50:13.344Z soma:eebe28d042aa Error: Command Disallowed (0xc)
May 26 08:50:13 elysium somactrl[9449]:     at NobleBindings.onLeConnComplete (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/bindings.js:220:13)
May 26 08:50:13 elysium somactrl[9449]:     at emitOne (events.js:116:13)
May 26 08:50:13 elysium somactrl[9449]:     at Hci.emit (events.js:211:7)
May 26 08:50:13 elysium somactrl[9449]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 26 08:50:13 elysium somactrl[9449]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 26 08:50:13 elysium somactrl[9449]:     at emitOne (events.js:116:13)
May 26 08:50:13 elysium somactrl[9449]:     at BluetoothHciSocket.emit (events.js:211:7)
May 26 08:50:15 elysium somactrl[9449]: noble: unknown peripheral null connected!
May 26 08:50:18 elysium somactrl[9449]: 2019-05-26T08:50:18.367Z soma:eebe28d042aa starting reconnect after error
May 26 08:50:18 elysium somactrl[9449]: 2019-05-26T08:50:18.373Z soma:eebe28d042aa error while connecting below, will schedule a reconnect in 5s
May 26 08:50:18 elysium somactrl[9449]: 2019-05-26T08:50:18.374Z soma:eebe28d042aa Error: Command Disallowed (0xc)
May 26 08:50:18 elysium somactrl[9449]:     at NobleBindings.onLeConnComplete (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/bindings.js:220:13)
May 26 08:50:18 elysium somactrl[9449]:     at emitOne (events.js:116:13)
May 26 08:50:18 elysium somactrl[9449]:     at Hci.emit (events.js:211:7)
May 26 08:50:18 elysium somactrl[9449]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 26 08:50:18 elysium somactrl[9449]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 26 08:50:18 elysium somactrl[9449]:     at emitOne (events.js:116:13)
May 26 08:50:18 elysium somactrl[9449]:     at BluetoothHciSocket.emit (events.js:211:7)
May 26 08:50:19 elysium somactrl[9449]: noble: unknown peripheral null connected!
May 26 08:50:23 elysium somactrl[9449]: 2019-05-26T08:50:23.375Z soma:eebe28d042aa starting reconnect after error
May 26 08:50:23 elysium somactrl[9449]: 2019-05-26T08:50:23.477Z soma:eebe28d042aa connected AOK2!

And this is the modified connect method of SomaShade.js

    connect() {
        this.peripheral.once('disconnect', () => {
            this.log('disconnected');
            this.identifyCharacteristic = null;
            this.movePercentCharacteristic = null;

            this.disconnectHandler = setTimeout(() => {
                this.log('flagging disconnect after timeout');
                this._connectionState = 'disconnected';
            }, 5*1000);

            this.log('connected for %s seconds', (Math.abs(new Date() - this.connectTime) / 1000));
            this.connect();
        });
        this.peripheral.connect((error) => {
            if (this.disconnectHandler) clearTimeout(this.disconnectHandler);
            if (error) {
        this.log("error while connecting below, will schedule a reconnect in 5s");
        this.log(error);
        this._connectionState = 'disconnected';
        if(this.reconnectHandler) clearTimeout(this.reconnectHandler);
        this.reconnectHandler = setTimeout(() => {
           this.log('starting reconnect after error');
           this.connect();
        }, 10 * 1000);
        return;
        }

(note the reconnectHandler needs to be declared) Apologies for this YOLO fix, I genuinely would have raised a PR if it wasn't for the fact that I failed miserably re building the source using NPM and reinstalling it.

mwitkow commented 5 years ago

Ok the above seems to have made the somactrl more robust, but definitely didn't address the underlying issue. It now spirals into a weird state where "Battery 100%" reads are no longer reported, and the system keeps on weirdly reconnecting:

May 26 12:58:32 elysium somactrl[13233]: 2019-05-26T12:58:32.173Z soma:ca065a773594 position 100
May 26 12:58:32 elysium somactrl[13233]: 2019-05-26T12:58:32.412Z soma:ca065a773594 battery at 100%
May 26 13:00:40 elysium somactrl[13233]: 2019-05-26T13:00:40.630Z soma:eebe28d042aa disconnected
May 26 13:00:40 elysium somactrl[13233]: 2019-05-26T13:00:40.631Z soma:eebe28d042aa connected for 182.063 seconds
May 26 13:00:40 elysium somactrl[13233]: 2019-05-26T13:00:40.818Z soma:eebe28d042aa connected AOK2!
May 26 13:00:42 elysium somactrl[13233]: 2019-05-26T13:00:42.378Z soma:eebe28d042aa position 100
May 26 13:00:42 elysium somactrl[13233]: 2019-05-26T13:00:42.617Z soma:eebe28d042aa battery at 100%
May 26 13:01:32 elysium somactrl[13233]: 2019-05-26T13:01:32.684Z soma:ca065a773594 disconnected
May 26 13:01:32 elysium somactrl[13233]: 2019-05-26T13:01:32.694Z soma:ca065a773594 connected for 182.171 seconds
May 26 13:01:32 elysium somactrl[13233]: 2019-05-26T13:01:32.801Z soma:ca065a773594 error while connecting below, will schedule a reconnect in 5s
May 26 13:01:32 elysium somactrl[13233]: 2019-05-26T13:01:32.807Z soma:ca065a773594 Error: Command Disallowed (0xc)
May 26 13:01:32 elysium somactrl[13233]:     at NobleBindings.onLeConnComplete (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/bindings.js:220:13)
May 26 13:01:32 elysium somactrl[13233]:     at emitOne (events.js:116:13)
May 26 13:01:32 elysium somactrl[13233]:     at Hci.emit (events.js:211:7)
May 26 13:01:32 elysium somactrl[13233]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 26 13:01:32 elysium somactrl[13233]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 26 13:01:32 elysium somactrl[13233]:     at Hci.processCmdStatusEvent (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:674:12)
May 26 13:01:32 elysium somactrl[13233]:     at Hci.onSocketData (/usr/local/lib/node_modules/soma-ctrl/node_modules/noble/lib/hci-socket/hci.js:469:12)
May 26 13:01:32 elysium somactrl[13233]:     at emitOne (events.js:116:13)
May 26 13:01:32 elysium somactrl[13233]:     at BluetoothHciSocket.emit (events.js:211:7)
May 26 13:01:33 elysium somactrl[13233]: noble: unknown peripheral null connected!
May 26 13:01:42 elysium somactrl[13233]: 2019-05-26T13:01:42.840Z soma:ca065a773594 starting reconnect after error
May 26 13:01:42 elysium somactrl[13233]: 2019-05-26T13:01:42.930Z soma:ca065a773594 connected AOK2!
May 26 13:01:44 elysium somactrl[13233]: 2019-05-26T13:01:44.474Z soma:ca065a773594 position 100
May 26 13:01:44 elysium somactrl[13233]: 2019-05-26T13:01:44.713Z soma:ca065a773594 battery at 100%
May 26 13:03:03 elysium somactrl[13233]: noble: unknown peripheral null connected!
May 26 13:03:27 elysium somactrl[13233]: noble: unknown peripheral null connected!
May 26 13:03:42 elysium somactrl[13233]: 2019-05-26T13:03:42.889Z soma:eebe28d042aa disconnected
May 26 13:03:42 elysium somactrl[13233]: 2019-05-26T13:03:42.889Z soma:eebe28d042aa connected for 182.071 seconds
May 26 13:03:43 elysium somactrl[13233]: 2019-05-26T13:03:43.099Z soma:eebe28d042aa connected AOK2!
May 26 13:03:44 elysium somactrl[13233]: 2019-05-26T13:03:44.628Z soma:eebe28d042aa position 100
May 26 13:03:44 elysium somactrl[13233]: 2019-05-26T13:03:44.868Z soma:eebe28d042aa battery at 100%
May 26 13:04:44 elysium somactrl[13233]: 2019-05-26T13:04:44.985Z soma:ca065a773594 disconnected
May 26 13:04:44 elysium somactrl[13233]: 2019-05-26T13:04:44.987Z soma:ca065a773594 connected for 182.057 seconds
May 26 13:04:44 elysium somactrl[13233]: 2019-05-26T13:04:44.988Z soma:ca065a773594 disconnected
May 26 13:04:44 elysium somactrl[13233]: 2019-05-26T13:04:44.988Z soma:ca065a773594 connected for 182.058 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.203Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.203Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.218Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.218Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.218Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.218Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.315Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.316Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.330Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.330Z soma:ca065a773594 connected for 0.014 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.331Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.331Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.461Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.462Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.476Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.477Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.477Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.477Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.585Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.586Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.600Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.600Z soma:ca065a773594 connected for 0.014 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.601Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.601Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.716Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.717Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.731Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.732Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.732Z soma:ca065a773594 disconnected
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.732Z soma:ca065a773594 connected for 0.015 seconds
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.848Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: 2019-05-26T13:04:45.869Z soma:ca065a773594 connected AOK2!
May 26 13:04:45 elysium somactrl[13233]: (node:13233) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 servicesDiscover listeners added. Use emitter.setMaxListeners() to increase limit
mwitkow commented 5 years ago

Right, in order to work around this, I am now running somactrl from SystemD file which restarts the daemon every 5 minutes which seems to be longer than any of these issues persist. Yay workarounds

$ cat /etc/systemd/system/somactlr.service

Contents

[Unit]
Description=SOMACtl Service
After=network-online.target mosquitto.service

[Service]
User=root
# Work around an issue https://github.com/andersonshatch/soma-ctrl/issues/59 where it gets wedged.
# Restart it every 5min
Type=simple
Restart=always
RuntimeMaxSec=300

ExecStart=/usr/local/bin/somactrl -t -e 2 -d -l 3000 --url http://127.0.0.1 -u homeassistant -p PutYourPasswordHere

[Install]
WantedBy=multi-user.target
andersonshatch commented 5 years ago

It seems that this could be related to the fact that Rapsberry Pi bluetooth dongle doesn't allow scanning and sending a command at the same time noble/noble#658

Scanning and sending a command at once shouldn't be an issue since version 1.3.0, but if you have something else using the bluetooth hardware, then it's out of my control. It's been a known issue for noble that other processes using the bluetooth hardware, even other noble processes can see each other's state.

It seems that errors happen on certain attempts to reconnect. It seems that the problem with the coma-ctrl code is the fact that after an error occurs in connect, the code continues trying to read from a device. See here: https://github.com/andersonshatch/soma-ctrl/blob/master/src/SomaShade.js#L136 prints the error

May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.533Z soma:eebe28d042aa Error: Command Disallowed (0xc)

and then the next line happily printts "connected": https://github.com/andersonshatch/soma-ctrl/blob/master/src/SomaShade.js#L139

May 25 19:49:19 elysium somactrl[24059]: 2019-05-25T19:49:19.534Z soma:eebe28d042aa connected!

Instead it should "sleep" again and try to connect again.

I'm in agreement about this... an error from the connect call should abort the rest, and schedule a re-connect; it definitely reads wrong / as an oversight.

That said, I haven't personally experienced any issues like those you've described with 4 soma devices and no other processes using the bluetooth hardware, the somactrl process remains stable for multiple weeks at a time in my home, so... I'm a little hesitant to change anything, especially when I can't reproduce the initial failure -- though I suppose I could try to connect with more bluetooth devices concurrently to reproduce the issue.

I have one Xiaomi thermometer, and previously tried using a homebridge plugin to expose it over MQTT but it's unfortunately not within range of my somactrl pi. From what I recall, that homebridge plugin used noble and conducted regular scans.

I guess, let's start with... If you disable anything else using the BT hardware, do these issues persist?

mwitkow commented 5 years ago

Yes, I have a https://www.home-assistant.io/components/mitemp_bt/ running in the background collecting stuff. So yes, that's where the cross-talk comes through. I put the somactrl on my NAS (with an intel bluetooth) chipset, and it does run more stabily... however it causes kernel panics due to a bluetooth module bug (https://www.spinics.net/lists/linux-bluetooth/msg80212.html)

I just bought a CSR8510 chipset BLE 4.0 USB dongle (https://www.amazon.co.uk/dp/B073CF6993/) to move the somactrl to. I assume somactrl will work with NOBLE_HCI_DEVICE=1 to set it to bind to a different device (see https://github.com/noble/noble#multiple-adapters)? I'll report back to see if that worked and then we can close this one with a workaround "don't do multiple things with a bluetooth dongle" ;).

mwitkow commented 5 years ago

@andersonshatch Yup, the CSR8510 separate dongle helped a lot (no issues in the last 2 days).

Maybe it's worthwhile updating the README to state that running this on a separate bluetooth device from other services is useful as it may cause interference?

andersonshatch commented 5 years ago

Glad that helped! Addressed any errors in the connect by re-connecting, and added a note to the README in #65