stefan-kaestle / openhab2-addons

Add-ons for openHAB 2.x
Eclipse Public License 2.0

After 24h no Temperature update with Climate Control #62

Closed jkarsten closed 3 years ago

jkarsten commented 3 years ago

Hello,

I am running the Beta 5 release of the add-on. My BSH Gateway thing reports that it is online, and my five ClimateControl things are online as well. After I add the BSH thing, everything works fine: the setpoint temperature and temperature items receive automatic updates, and when I change a value in the Bosch app, the setpoint item in OH changes its value in under one second. After 24h, however, I can change a value and wait for temperature updates, but nothing happens.

When I pause the BSH thing and start it again, I can change values in the Bosch app and everything works fine for another 24 hours. I did this yesterday and today. There is no error in my log. Yesterday I restarted at 13:00, today at 14:00. I think that tomorrow at 14:00 the updates will stop again.

I am from north of Hamburg. I can describe this better in German if you want.

Jörg

coeing commented 3 years ago

Just had another idea about what could cause the issue: the re-subscribe request is sent within the thread of the previous long poll request. This might be something the HTTP client does not like. So now I am scheduling the re-subscribe request and executing it in a new thread.

I do not want to be too enthusiastic about it, but this might be the reason :) The new .jar contains the change, so feel free to test it! https://github.com/stefan-kaestle/openhab2-addons/releases/tag/v1.0-bugfix.62
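
The scheduling change described above can be sketched roughly as follows. This is a minimal illustration, not the binding's actual code: the class `ResubscribeScheduler` and the method `onSubscriptionInvalid` are hypothetical names, and the `Runnable` stands in for whatever sends the subscribe request. The only point shown is that the work is handed to a `ScheduledExecutorService` instead of being run on the thread that delivered the long poll response.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names are illustrative and not taken from the binding.
class ResubscribeScheduler {
    private final ScheduledExecutorService scheduler;

    ResubscribeScheduler(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    // Meant to be called from the HTTP client's response callback once the
    // subscription turns out to be invalid. The re-subscribe work is not run
    // here; it is handed to the scheduler and runs on one of its threads.
    ScheduledFuture<?> onSubscriptionInvalid(Runnable resubscribe) {
        return scheduler.schedule(resubscribe, 0, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        ResubscribeScheduler resubscriber = new ResubscribeScheduler(scheduler);
        String callbackThread = Thread.currentThread().getName();

        // Wait for the scheduled task only so that the demo prints before exiting.
        resubscriber.onSubscriptionInvalid(() -> System.out.println(
                "Re-subscribing on " + Thread.currentThread().getName()
                        + ", callback ran on " + callbackThread)).get();
        scheduler.shutdown();
    }
}
```

In a real handler the callback would return immediately after scheduling, so the HTTP client's thread is free again before the new request is sent.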

GerdZanker commented 3 years ago

Here is the "best of log" from the most recent occurrence of the issue. In contrast to the last timeout exception, my bridge was still shown as "online".

2021-01-21 19:58:19.539 [DEBUG] [.internal.devices.bridge.LongPolling] - Sending long poll request
2021-01-21 19:58:19.539 [DEBUG] [jetty.client.TimeoutCompleteListener] - Cancelled (true) timeout for HttpRequest[POST /remote/json-rpc HTTP/1.1]@bc6a79 on org.eclipse.jetty.client.TimeoutCompleteListener@5e2482
2021-01-21 19:58:19.540 [TRACE] [ernal.devices.bridge.BoschHttpClient] - Create request for http client BoschHttpClient@12aac49{STARTED}
2021-01-21 19:58:19.540 [DEBUG] [rg.eclipse.jetty.client.HttpReceiver] - Parsed false, remaining 0 HttpParser{s=START,0 of -1}
2021-01-21 19:58:19.541 [DEBUG] [rg.eclipse.jetty.client.HttpReceiver] - Parsed false, remaining 0 HttpParser{s=START,0 of -1}
2021-01-21 19:58:19.541 [TRACE] [ernal.devices.bridge.BoschHttpClient] - create request for https://192.168.178.31:8444/remote/json-rpc and content {"jsonrpc":"2.0","method":"RE/longPoll","params":["ef5lmcknf-38","20"]}
2021-01-21 19:58:19.543 [DEBUG] [rg.eclipse.jetty.client.HttpReceiver] - Read 0 bytes ...
2021-01-21 19:58:19.543 [DEBUG] [clipse.jetty.client.HttpConversation] - Exchanges in conversation 1, override=null, listeners=[org.openhab.binding.boschshc.internal.devices.bridge.LongPolling$1@1fc4764]
2021-01-21 19:58:19.545 [DEBUG] [eclipse.jetty.client.HttpDestination] - org.eclipse.jetty.client.HttpDestination$TimeoutTask@adc5b scheduled timeout in 29998 ms
2021-01-21 19:58:19.546 [DEBUG] [eclipse.jetty.client.HttpDestination] - Queued HttpRequest[POST /remote/json-rpc HTTP/1.1]@e61000 for HttpDestination[https://192.168.178.31:8444]@b0d71,queue=1,pool=DuplexConnectionPool@198183b[c=1/64,a=0,i=1]
2021-01-21 19:58:19.548 [DEBUG] [.jetty.client.AbstractConnectionPool] - Connection active ...
2021-01-21 19:58:19.550 [DEBUG] [eclipse.jetty.client.HttpDestination] - Processing exchange HttpExchange@c0d19f ...
2021-01-21 19:58:19.552 [DEBUG] [org.eclipse.jetty.client.HttpChannel] - HttpExchange@c0d19f req=PENDING/null@null res=PENDING/null@null associated true to HttpChannelOverHTTP@71dbcd(exchange=HttpExchange@c0d19f req=PENDING/null@null res=PENDING/null@null)[send=HttpSenderOverHTTP@df6b3a(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@13ddb76{s=START}],recv=HttpReceiverOverHTTP@793533(rsp=IDLE,failure=null)[HttpParser{s=START,0 of -1}]]
2021-01-21 19:58:19.554 [DEBUG] [jetty.client.TimeoutCompleteListener] - Scheduled timeout in 29989 ms for HttpRequest[POST /remote/json-rpc HTTP/1.1]@e61000 on org.eclipse.jetty.client.TimeoutCompleteListener@5e2482

I sent the full log to @coeing for deeper analysis, but I think the #62 code could improve the situation, and I will upgrade to this test version. In any case, exceptions can always occur (due to disconnected Ethernet cables, etc.), and a reconnect in the BoschSHCBridgeHandler is a second fallback to handle connection issues.
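
As a rough illustration of what such a reconnect fallback could look like, here is a small retry helper. It is only a sketch under assumptions: `ReconnectFallback` and `retry` are hypothetical names, the `Callable` stands in for a subscribe or reconnect attempt, and the fixed delay between attempts is an arbitrary choice. Nothing here is taken from the BoschSHCBridgeHandler itself.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of a retry-based fallback; not the binding's code.
class ReconnectFallback {

    // Runs the action until it succeeds or maxAttempts is reached.
    // Sleeps delayMillis between failed attempts and rethrows the last
    // failure once all attempts are used up.
    static <T> T retry(Callable<T> action, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw lastFailure;
    }
}
```

A caller could wrap the subscribe request in `retry(...)` and set the bridge offline only after the last attempt has failed.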

MarkJonas commented 3 years ago

@coeing I installed the update and am keeping my fingers crossed. 😺 Thank you very much for your stamina in trying to fix this issue.

I am just confused that the updated bugfix.62 version has the same tag and release as the previous one. The bundle also has the same version. I hope I installed it properly; I think so, because at least the bundle index was incremented.

Before: 239 │ Active │ 80 │ 3.1.0.202101221403 │ org.openhab.binding.boschshc

After: 240 │ Active │ 80 │ 3.1.0.202101221403 │ org.openhab.binding.boschshc

coeing commented 3 years ago

Those are both the current version. The last part of the version number is the build date/time, so today at 14:03 (one hour off due to timezone).
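
To make the version format concrete, the following sketch splits off the last segment of a bundle version such as `3.1.0.202101221403` and reads it as a timestamp. The class name `BundleVersionQualifier` and the method `buildTime` are hypothetical, and the pattern `uuuuMMddHHmm` is an assumption inferred from the value shown in this thread. The result carries no time zone, which matches the remark about the one hour offset.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the qualifier layout is assumed from the value above.
class BundleVersionQualifier {
    private static final DateTimeFormatter QUALIFIER_FORMAT =
            DateTimeFormatter.ofPattern("uuuuMMddHHmm");

    // Takes everything after the last dot and parses it as year, month, day,
    // hour and minute. Throws DateTimeParseException for other layouts.
    static LocalDateTime buildTime(String bundleVersion) {
        String qualifier = bundleVersion.substring(bundleVersion.lastIndexOf('.') + 1);
        return LocalDateTime.parse(qualifier, QUALIFIER_FORMAT);
    }

    public static void main(String[] args) {
        System.out.println(buildTime("3.1.0.202101221403"));
    }
}
```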

GerdZanker commented 3 years ago

Hello @coeing, with your test version 3.1.0.202101221403 I got no exception or timeout, and my bridge is still working after 24h! Here is my log snippet with a successful re-subscription.

2021-01-23 19:26:55.381 [DEBUG] [.internal.devices.bridge.LongPolling] - Long poll response: {"jsonrpc":"2.0","error":{"code":-32001,"message":"No subscription with id: ef5lmcknf-60"}}
2021-01-23 19:26:55.382 [WARN ] [.internal.devices.bridge.LongPolling] - Long poll response contained no results: {"jsonrpc":"2.0","error":{"code":-32001,"message":"No subscription with id: ef5lmcknf-60"}}
2021-01-23 19:26:55.392 [WARN ] [.internal.devices.bridge.LongPolling] - Got long poll error: No subscription with id: ef5lmcknf-60 (code: -32001)
2021-01-23 19:26:55.393 [WARN ] [.internal.devices.bridge.LongPolling] - Subscription ef5lmcknf-60 became invalid, subscribing again
2021-01-23 19:26:55.400 [DEBUG] [jetty.client.TimeoutCompleteListener] - Cancelled (true) timeout for HttpRequest[POST /remote/json-rpc HTTP/1.1]@115c643 on org.eclipse.jetty.client.TimeoutCompleteListener@c7e021
2021-01-23 19:26:55.400 [DEBUG] [.internal.devices.bridge.LongPolling] - Subscribe: Sending request: {"jsonrpc":"2.0","method":"RE/subscribe","params":["com/bosch/sh/remote/*",null]} - using httpClient BoschHttpClient@116ff05{STARTED}
2021-01-23 19:26:55.402 [TRACE] [ernal.devices.bridge.BoschHttpClient] - create request for https://192.168.178.31:8444/remote/json-rpc and content {"jsonrpc":"2.0","method":"RE/subscribe","params":["com/bosch/sh/remote/*",null]}
2021-01-23 19:26:55.404 [TRACE] [ernal.devices.bridge.BoschHttpClient] - Send request: HttpRequest[POST /remote/json-rpc HTTP/1.1]@2d683b
2021-01-23 19:26:55.417 [DEBUG] [jetty.client.TimeoutCompleteListener] - Scheduled timeout in 9987 ms for HttpRequest[POST /remote/json-rpc HTTP/1.1]@2d683b on org.eclipse.jetty.client.TimeoutCompleteListener@c7e021
2021-01-23 19:26:55.571 [DEBUG] [ernal.devices.bridge.BoschHttpClient] - Received response: {"result":"ef5lmcknf-69","jsonrpc":"2.0"}
 - status: 200
2021-01-23 19:26:55.571 [DEBUG] [jetty.client.TimeoutCompleteListener] - Cancelled (true) timeout for HttpRequest[POST /remote/json-rpc HTTP/1.1]@2d683b on org.eclipse.jetty.client.TimeoutCompleteListener@c7e021
2021-01-23 19:26:55.572 [DEBUG] [.internal.devices.bridge.LongPolling] - Subscribe: Got subscription ID: ef5lmcknf-69 2.0
2021-01-23 19:26:55.573 [DEBUG] [.internal.devices.bridge.LongPolling] - Sending long poll request
2021-01-23 19:26:55.573 [TRACE] [ernal.devices.bridge.BoschHttpClient] - Create request for http client BoschHttpClient@116ff05{STARTED}
2021-01-23 19:26:55.575 [TRACE] [ernal.devices.bridge.BoschHttpClient] - create request for https://192.168.178.31:8444/remote/json-rpc and content {"jsonrpc":"2.0","method":"RE/longPoll","params":["ef5lmcknf-69","20"]}

I'm looking forward to more feedback, but I think you, @coeing, found a solution.
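
The decision visible in the log above (error code -32001 leads to a new subscribe request) can be sketched like this. It is an illustration only: `LongPollErrorCheck`, `errorCode` and `needsResubscribe` are hypothetical names, and the regular expression is a stand-in for a proper JSON parser so that the example stays free of external libraries. The value -32001 is the code shown in the log.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; a real implementation would parse the JSON properly.
class LongPollErrorCheck {
    // Error code from the log above: "No subscription with id: ..."
    static final int SUBSCRIPTION_INVALID = -32001;

    // Finds the numeric "code" inside the "error" object of a JSON-RPC response.
    private static final Pattern ERROR_CODE =
            Pattern.compile("\"error\"\\s*:\\s*\\{[^}]*\"code\"\\s*:\\s*(-?\\d+)");

    static OptionalInt errorCode(String response) {
        Matcher matcher = ERROR_CODE.matcher(response);
        return matcher.find()
                ? OptionalInt.of(Integer.parseInt(matcher.group(1)))
                : OptionalInt.empty();
    }

    // True if the response reports that the subscription is no longer known.
    static boolean needsResubscribe(String response) {
        OptionalInt code = errorCode(response);
        return code.isPresent() && code.getAsInt() == SUBSCRIPTION_INVALID;
    }
}
```

With the two responses from the log, the error response would trigger a re-subscribe, while the `{"result":...}` response would not.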

tuxianerDE commented 3 years ago

16 hrs and counting. So far it still looks good and the data is still coming in.

jkarsten commented 3 years ago

Hello, it has been online since 23.01.2021 17:00. Looks good. Jörg

MarkJonas commented 3 years ago

@coeing I also think you nailed it. It has now been running for more than 72 hours without a freeze.

coeing commented 3 years ago

Thanks everybody for your tests! Seems like we found the reason for the bug now :) I will prepare a pull request which will bring the fix into the next version. We will probably create a version 1.1 because of the severity of this bug.

tuxianerDE commented 3 years ago

Sorry to be a party pooper, but last night it stopped working again. I am waiting to get the logs, sorry for the disappointment!

coeing commented 3 years ago

It's probably another issue, maybe due to an update? Please check #74

tuxianerDE commented 3 years ago

Damn it, that silent update made it break. However, all other automation was still working, like sending commands to the shutters etc. :-S

coeing commented 3 years ago

Your logs will be useful anyway to fix the new bug #74, so please feel free to attach them to the other ticket :)

MarkJonas commented 3 years ago

@coeing I have to chime in. It also stopped working for me. I'll send you the logs by email. Now it seems to be a 5 days (120 hours) problem. My last restart was January 22nd, ~19:30. It stopped January 26th, ~19:30.

freeze-after-5-days

GerdZanker commented 3 years ago

Hi @MarkJonas, is your restart related to the Bosch SmartHomeController update rolled out in the last few days?

MarkJonas commented 3 years ago

@GerdZanker I was just thinking about that! Yes, I think the SHC installed an update yesterday evening. So the 5 days might be a coincidence. But still it should not freeze due to a reboot of the SHC, right?

EDIT: The error message of the SHC Thing is:

COMMUNICATION_ERROR
Long polling failed and could not be restarted: Error on subscribe request (Http client: BoschHttpClient@11f944e{STARTED})

GerdZanker commented 3 years ago

But still it should not freeze due to a reboot of the SHC, right?

Right, but the root cause could be a different one, and another code fix could be necessary. Therefore @coeing tracked this COMMUNICATION_ERROR after reboot as a standalone issue in #74.

(It also feels a bit better if we can close one bug and tackle the next one.)

coeing commented 3 years ago

Should be fixed by #76. We will create a pre-release version 1.1 soon, so everybody can test again if it works as planned :)