openhab / openhab-addons

Add-ons for openHAB
https://www.openhab.org/
Eclipse Public License 2.0
1.86k stars 3.58k forks source link

[deconz] events from deconz are received but not propagated in openhab, error message Get full state failed #13132

Closed msandres13 closed 1 year ago

msandres13 commented 2 years ago

Expected Behavior

zigbee switch issues event on deconz, deconz binding receives that event and propagates it to openhab reliable.

Current Behavior

My containerized openhab instance is using the deconz plugin and a separate Raspberrypi running deconz natively. A rule which receives the event from zigbee switches generates a Telegram message using a bot. This works consistently sometimes for days, sometimes for hours after the restart of the container. Then it stops working randomly until the deconz plugin is paused and then started again. Used dockerhub container version is image: "openhab/openhab:3.3.0" deconz uses this package:

apt show deconz
Package: deconz
Version: 2.17.01-debian-buster-stable
Priority: extra
Section: non-free / misc
Maintainer: Manuel Pietschmann <manuel.pietschmann@dresden-elektronik.de>
Installed-Size: 41.6 MB
Depends: libqt5core5a, libqt5network5, libqt5widgets5, libqt5gui5, libqt5serialport5, libqt5websockets5, libqt5sql5, libqt5qml5, libcap2-bin, sqlite3, lsof, curl
Homepage: http://www.dresden-elektronik.de
Download-Size: 6,399 kB
APT-Manual-Installed: yes
APT-Sources: http://phoscon.de/apt/deconz bullseye/main arm64 Packages
Description: A generic ZigBee monitoring and control tool
  The deCONZ application allows to monitor and control arbitrary ZigBee PRO based devices.

In the Telegram channel I see notification stopped at 11:33PM 14th of Jul. https://nextcloud.andres.one/s/mgCA8zxtMCq2SgL (Picture) link to the log file https://nextcloud.andres.one/s/CeemYpy6YCsBNZ3 (log file) Log file shows Problems here

2022-07-14 05:13:34.928 [DEBUG] [internal.handler.DeconzBridgeHandler] - Get full state failed
    at org.openhab.binding.deconz.internal.netutils.AsyncHttpClient$1.onComplete(AsyncHttpClient.java:109) [bundleFile:?]

at

 2022-07-15 10:25:16.415 [DEBUG] [internal.handler.DeconzBridgeHandler] - Start initializing bridge deconz:deconz:a3e5e5988e

I pressed pause on the deconz thing and pressed play again. deconz did reinitialize and the event propagation started working again.

Possible Solution

no idea.

Steps to Reproduce (for Bugs)

since the Problem is intermittent I dont know how to willing force the problem. For me it happens usually several times a week if I do not restart the openhab instance. I can pickup a change to try with a code fix or catch patch.

Context

I want notifications on Telegram reliable if a Door or Window in my House is opened or closed.

Your Environment

msandres13 commented 2 years ago

2022_6_31_10_21_no_deconz_events.tar.gz 2022_6_31_10_21_no_deconz_events_recovery.tar.gz Data from dead event handling, occurrence, last event seen on 30th of July 11:54am First test switch event seen again after pausing deconz plugin and pressed play again. 31st of July 8:23am

msandres13 commented 2 years ago

Maybe this is related to #9517 I recognized that the server which run openhab does not use ntp and does not run in the same TZ. I corrected that. Lets see if the problem reoccurs.

Owlbertz commented 2 years ago

I am encountering a similar issue with my sensors. This occurs for contact sensors (openclose), motion sensors (of various vendors) and remotes. Toggling switches/light via the Deconz bridge works well, but openHAB stops to receive updates from the sensors after a while. Restarting openHAB solves this issue for some (varying) time.

msandres13 commented 2 years ago

@Owlbertz can you try the nexttime to pause and start again the deconz thing? That always fixes the problem in my case.

msandres13 commented 2 years ago

Today it failed again after a few hours. I see this error message. Looks like the channel is not restarted after that error.

`2022-08-05 11:44:36.750 [TRACE] [internal.handler.DeconzBridgeHandler] - deconz:deconz:a3e5e5988e starts refreshing the fullStateCache
2022-08-05 11:44:36.753 [DEBUG] [internal.handler.DeconzBridgeHandler] - Get full state failed
java.util.concurrent.CompletionException: java.io.EOFException: HttpConnectionOverHTTP@1551cb03::SocketChannelEndPoint@30926e0b{l=/172.31.0.4:50504,r=/192.168.2.59:9080,ISHUT,fill=-,flush=-,to=1/0}{io=0/0,kio=0,kro=1}->HttpConnectionOverHTTP@1551cb03(l:/172.31.0.4:50504 <-> r:/192.168.2.59:9080,closed=false)=>HttpChannelOverHTTP@4e2c480(exchange=HttpExchange@d33cb4{req=HttpRequest[GET /api/0F4E0083FE HTTP/1.1]@1b41f67e[TERMINATED/null] res=HttpResponse[null 0 null]@3880212b[PENDING/null]})[send=HttpSenderOverHTTP@2a54247f(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@337b8b1d{s=START}],recv=HttpReceiverOverHTTP@279d790a(rsp=IDLE,failure=null)[HttpParser{s=CLOSED,0 of -1}]]
    at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:331) ~[?:?]
    at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:346) ~[?:?]
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:632) ~[?:?]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506) [?:?]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:2088) [?:?]
    at org.openhab.binding.deconz.internal.netutils.AsyncHttpClient$1.onComplete(AsyncHttpClient.java:109) [bundleFile:?]
    at org.eclipse.jetty.client.ResponseNotifier.notifyComplete(ResponseNotifier.java:218) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.ResponseNotifier.notifyComplete(ResponseNotifier.java:210) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.HttpReceiver.terminateResponse(HttpReceiver.java:481) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.HttpReceiver.terminateResponse(HttpReceiver.java:461) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.HttpReceiver.abort(HttpReceiver.java:557) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.HttpReceiver.responseFailure(HttpReceiver.java:453) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.failAndClose(HttpReceiverOverHTTP.java:413) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.earlyEOF(HttpReceiverOverHTTP.java:385) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1620) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.shutdown(HttpReceiverOverHTTP.java:269) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.process(HttpReceiverOverHTTP.java:185) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpReceiverOverHTTP.receive(HttpReceiverOverHTTP.java:80) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpChannelOverHTTP.receive(HttpChannelOverHTTP.java:131) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.client.http.HttpConnectionOverHTTP.onFillable(HttpConnectionOverHTTP.java:172) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883) [bundleFile:9.4.46.v20220331]
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034) [bundleFile:9.4.46.v20220331]
    at java.lang.Thread.run(Thread.java:829) [?:?]`
msandres13 commented 2 years ago

openhab.log events.log Last successful sensor update was on 11:33

msandres13 commented 2 years ago

Found a workaround to make deconz reliably work. I wrote a script which restarts the deconz plugin when the error message shows up in the openhab.log. You need to start it with screen or better with systemd to it always is running.

#!/usr/bin/env bash
set -x
set -e
export COMPOSE_INTERACTIVE_NO_CLI=1
while :
  do read line
     echo "==> found error pattern ${line}"
     docker-compose exec -T openhab /openhab/runtime/bin/client -p habopen bundle:restart org.openhab.binding.deconz
done < <(tail -F /srv/openhab3/openhab_userdata/logs/openhab.log | grep --line-buffered -F "Get full state failed")

Since a plugin restart with every "Get full state failed" message fixes the problem, we need to understand what needs to be done in the code where the log is written in order to recover the error. I am not familiar with the APIs used here so it would be great if somebody could propose a fix.

msandres13 commented 2 years ago

image

            } else if (t instanceof SocketTimeoutException || t instanceof TimeoutException
                    || t instanceof CompletionException) {
                logger.debug("Get full state failed", t);
            } else {

So instead of just writing a log the bridge object probably needs an reset