ioBroker / ioBroker.zigbee

Zigbee communcation with Hue, Xiaomi, Lighttify... via TI CC2xxx USB stick
MIT License
313 stars 190 forks source link

Promise crashes if timer runs out #1231

Closed mzander closed 3 years ago

mzander commented 3 years ago

Hi,

I am new to the iobroker.zigbee adapter and have some trouble using it. I am using a Tasmota Sonoff ZHA Bride using the TCP Stack.

This occurs randomly and let the complete adapter crashing.

zigbee.0 2021-10-04 14:38:09.715 error {"address":52874,"clusterId":32817,"sequence":87} after 30000ms
zigbee.0 2021-10-04 14:38:09.715 error Error: {"address":52874,"clusterId":32817,"sequence":87} after 30000ms at Timeout._onTimeout (/opt/iobroker/node_modules/zigbee-herdsman/dist/utils/waitress.js:37:35) at listOnTimeout (internal/timers.js:554:17) at processTimers (internal/timers.js:497:7)
zigbee.0 2021-10-04 14:38:09.714 error unhandled promise rejection: {"address":52874,"clusterId":32817,"sequence":87} after 30000ms

Any tipps whats wrong with my installation?

Thanks

asgothian commented 3 years ago

Please post

Note that Sonoff ZHA is experimental support - some functions may not work and some may cause issues like yours.

A.

mzander commented 3 years ago

Sorry, of course... I forgot....

Adapter: v1.5.6 (the tarball of your repo)

Tasmota: 9.5.0

Zigbee Radio FW Version: 6.7.8

Context: I seems to appear randomly. I am not sure if that is related but since I saw that error the Network Map of the Zigbee Adapter doesnt work anymore. All devices are red and it shows no connections to each other. It also happens if I do not do anything actively. When I log back into iobroker the crashes sometimes lead to restart loop protection that results in an disabled adapter.

Log: more will follow as soon as it spits it out again....

zigbee.0 2021-10-04 14:38:10.233 warn Terminated (UNCAUGHT_EXCEPTION): Without reason
zigbee.0 2021-10-04 14:38:10.232 info terminating
zigbee.0 2021-10-04 14:38:09.735 info Zigbee: disabling joining new devices.
zigbee.0 2021-10-04 14:38:09.729 info cleaned everything up...
zigbee.0 2021-10-04 14:38:09.715 error {"address":52874,"clusterId":32817,"sequence":87} after 30000ms
zigbee.0 2021-10-04 14:38:09.715 error Error: {"address":52874,"clusterId":32817,"sequence":87} after 30000ms at Timeout._onTimeout (/opt/iobroker/node_modules/zigbee-herdsman/dist/utils/waitress.js:37:35) at listOnTimeout (internal/timers.js:554:17) at processTimers (internal/timers.js:497:7)
zigbee.0 2021-10-04 14:38:09.714 error unhandled promise rejection: {"address":52874,"clusterId":32817,"sequence":87} after 30000ms
mzander commented 3 years ago

I was able to reproduce it by staring a new pair process

zigbee.0 | 2021-10-04 16:38:34.227 | info | Currently 26 devices are joined:
-- | -- | -- | --
zigbee.0 | 2021-10-04 16:38:34.199 | info | --> transmitPower : normal
zigbee.0 | 2021-10-04 16:38:34.197 | info | Unable to disable LED, unsupported function.
zigbee.0 | 2021-10-04 16:38:34.194 | info | Coordinator firmware version: {"type":"EZSP v8","meta":{"product":8,"majorrel":"6","minorrel":"7","maintrel":"8 ","revision":"6.7.8.0 build 373"}}
zigbee.0 | 2021-10-04 16:38:28.488 | info | Installed Version: asgothian/ioBroker.zigbee#LatestHerdsman
zigbee.0 | 2021-10-04 16:38:28.164 | info | Starting Zigbee npm ...
zigbee.0 | 2021-10-04 16:38:28.044 | info | starting. Version 1.5.6 in /opt/iobroker/node_modules/iobroker.zigbee, node: v12.22.5, js-controller: 3.3.15
host.iobroker | 2021-10-04 16:38:25.333 | info | instance system.adapter.zigbee.0 started with pid 25529
host.iobroker | 2021-10-04 16:37:55.250 | info | Restart adapter system.adapter.zigbee.0 because enabled
host.iobroker | 2021-10-04 16:37:55.244 | error | instance system.adapter.zigbee.0 terminated with code 6 (UNCAUGHT_EXCEPTION)
zigbee.0 | 2021-10-04 16:37:54.528 | warn | Terminated (UNCAUGHT_EXCEPTION): Without reason
zigbee.0 | 2021-10-04 16:37:54.527 | info | terminating
zigbee.0 | 2021-10-04 16:37:54.029 | info | Zigbee: disabling joining new devices.
zigbee.0 | 2021-10-04 16:37:54.025 | info | cleaned everything up...
zigbee.0 | 2021-10-04 16:37:54.020 | error | {"frameId":85,"sequence":212} after 10000ms
zigbee.0 | 2021-10-04 16:37:54.019 | error | Error: {"frameId":85,"sequence":212} after 10000ms at Timeout._onTimeout (/opt/iobroker/node_modules/zigbee-herdsman/dist/utils/waitress.js:37:35) at listOnTimeout (internal/timers.js:554:17) at processTimers (internal/timers.js:497:7)
zigbee.0 | 2021-10-04 16:37:54.019 | error | unhandled promise rejection: {"frameId":85,"sequence":212} after 10000ms
zigbee.0 | 2021-10-04 16:37:54.018 | error | Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch().
zigbee.0 | 2021-10-04 16:37:29.912 | info | Zigbee: stop joining
zigbee.0 | 2021-10-04 16:36:28.758 | info | Zigbee: allowing new devices to join.
zigbee.0 | 2021-10-04 16:33:34.737 | error | Send command to 0x0017880106114123 failed with no error code (Timeout - 28881 - 11 - 20 - 6 - 11 after 10000ms)
zigbee.0 | 2021-10-04 16:30:51.069 | info | Zigbee started

btw: I havent seen the send command error before...

mzander commented 3 years ago

And some more logs ;-) I really did nothing - I just waited:

host.iobroker | 2021-10-04 16:46:15.797 | info | Restart adapter system.adapter.zigbee.0 because enabled
-- | -- | -- | --
host.iobroker | 2021-10-04 16:46:15.796 | error | instance system.adapter.zigbee.0 terminated with code 6 (UNCAUGHT_EXCEPTION)
host.iobroker | 2021-10-04 16:46:15.790 | error | Caught by controller[1]: (node:25529) PromiseRejectionHandledWarning: Promise rejection was handled asynchronously (rejection id: 13)
zigbee.0 | 2021-10-04 16:46:15.166 | warn | Terminated (UNCAUGHT_EXCEPTION): Without reason
zigbee.0 | 2021-10-04 16:46:15.164 | info | terminating
zigbee.0 | 2021-10-04 16:46:14.668 | info | Zigbee: disabling joining new devices.
zigbee.0 | 2021-10-04 16:46:14.661 | info | cleaned everything up...
zigbee.0 | 2021-10-04 16:46:14.647 | error | {"address":53081,"clusterId":32817,"sequence":72} after 30000ms
zigbee.0 | 2021-10-04 16:46:14.647 | error | Error: {"address":53081,"clusterId":32817,"sequence":72} after 30000ms at Timeout._onTimeout (/opt/iobroker/node_modules/zigbee-herdsman/dist/utils/waitress.js:37:35) at listOnTimeout (internal/timers.js:554:17) at processTimers (internal/timers.js:497:7)
zigbee.0 | 2021-10-04 16:46:14.646 | error | unhandled promise rejection: {"address":53081,"clusterId":32817,"sequence":72} after 30000ms
zigbee.0 | 2021-10-04 16:46:14.643 | error | Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch().
ring.0 | 2021-10-04 16:42:17.498 | info | Ding Dong for Id 15597156 (motion, ringing)
zigbee.0 | 2021-10-04 16:38:34.355 | info | Zigbee started
zigbee.0 | 2021-10-04 16:38:34.354 | info | 0x0017880106530f93 (addr 8048): 915005733701 - Philips Hue White and color ambiance Play Lightbar (Router)
zigbee.0 | 2021-10-04 16:38:34.352 | info | 0x0017880106114123 (addr 28881): 915005733701 - Philips Hue White and color ambiance Play Lightbar (Router)
zigbee.0 | 2021-10-04 16:38:34.351 | info | 0x00178801065316d0 (addr 53081): 915005733701 - Philips Hue White and color ambiance Play Lightbar (Router)

Hope that helps....

asgothian commented 3 years ago

I am afraid that this is very hard to track down without any additional context. It seems to come from deep within the ZHA adapter code.

A.

mzander commented 3 years ago

I know, I already dug a bit into the adapters code but I have no idea where this error come from. I already tried to get more context information but did not had luck until now.

But doesn't the fact that the adapter crashes completely mean that there is made use of a promise which has no error catching?

asgothian commented 3 years ago

That is correct. Unfortunately, that does not necessarily mean that it is within the adapter code. Debugging it is made more complex by the fact that the zigbee-herdsman is written in Typescript, not JS.

The Herdsman itself is highly asynchronous using async/await. A simple break in this chain and a missed try/catch usually causes the issue. Unfortunately, the actual error which is being raised does not include enough information to pinpoint its origin. The error is raised (planned) by the waitress function to signal that a command could not be sent, but tracing the chain of promises back through the levels is far from trivial.

A.

mzander commented 3 years ago

I removed all zigbee devices from the adapter / network, changed the panID and channel. Repaired everything and all works pretty nice now. Also the network map is up to date again.

During pairing I could reproduce the problem but was able to fix it. If an interview was started but could not be finished until the pairing process stops the failed device was shown in the list but without any information. At that point my zigbee network started to get very slow (commands took 20 seconds) and I got the waitress error in the logs. The network map was also broken then. As soon I removed the device (which took quite a while) the problem was gone and the performance was great again.