openhab / openhab-cloud

Cloud companion for openHAB instances
Eclipse Public License 2.0
312 stars 162 forks source link

Connection error recovery failed #134

Open paulianttila opened 6 years ago

paulianttila commented 6 years ago

Connection to cloud failed yesterday (22.11.2017) and remote connection didn't work even binding claims that connection is open.

I stopped and started Cloud Connector Bundle manually 2017-11-23 17:13:XX and now remote connection works again.

All cloud related logs:

2017-11-22 16:30:14.644 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: {}
io.socket.engineio.client.EngineIOException: websocket error
    at io.socket.engineio.client.Transport.onError(Transport.java:71) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.engineio.client.transports.WebSocket.access$500(WebSocket.java:28) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.engineio.client.transports.WebSocket$2$5.run(WebSocket.java:150) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.thread.EventThread$2.run(EventThread.java:80) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.io.EOFException
    at okio.RealBufferedSource.require(RealBufferedSource.java:59) ~[?:?]
    at okio.RealBufferedSource.readByte(RealBufferedSource.java:72) ~[?:?]
    at okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:115) ~[?:?]
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:104) ~[?:?]
    at okhttp3.internal.ws.RealWebSocket.readMessage(RealWebSocket.java:97) ~[?:?]
    at okhttp3.ws.WebSocketCall.createWebSocket(WebSocketCall.java:152) ~[?:?]
    at okhttp3.ws.WebSocketCall.access$000(WebSocketCall.java:41) ~[?:?]
    at okhttp3.ws.WebSocketCall$1.onResponse(WebSocketCall.java:97) ~[?:?]
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:126) ~[?:?]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
    ... 3 more
2017-11-22 16:30:14.658 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)
2017-11-22 16:30:36.990 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)
2017-11-22 16:31:03.425 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: {}
io.socket.engineio.client.EngineIOException: xhr post error
    at io.socket.engineio.client.Transport.onError(Transport.java:71) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.engineio.client.transports.PollingXHR.access$000(PollingXHR.java:19) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.engineio.client.transports.PollingXHR$4$1.run(PollingXHR.java:88) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at io.socket.thread.EventThread$2.run(EventThread.java:80) [244:org.openhab.io.openhabcloud:2.2.0.201710181101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.net.SocketTimeoutException: connect timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:?]
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:?]
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
    at java.net.Socket.connect(Socket.java:589) ~[?:?]
    at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:673) ~[?:?]
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175) ~[?:?]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:463) ~[?:?]
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:558) ~[?:?]
    at sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264) ~[?:?]
    at sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:367) ~[?:?]
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:191) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1138) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1032) ~[?:?]
    at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:177) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1316) ~[?:?]
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1291) ~[?:?]
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(HttpsURLConnectionImpl.java:250) ~[?:?]
    at io.socket.engineio.client.transports.PollingXHR$Request$1.run(PollingXHR.java:206) ~[?:?]
    ... 1 more
2017-11-22 16:31:03.436 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)
2017-11-22 16:31:27.842 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)
...
2017-11-23 17:13:47.544 [INFO ] [io.openhabcloud.internal.CloudClient] - Shutting down openHAB Cloud service connection
2017-11-23 17:13:47.565 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)
2017-11-23 17:13:54.207 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = ********-****-****-****-************, base URL = http://localhost:8080)

From Cloud events

Today at 5:13 PM    openhab online
Yesterday at 4:37 PM    openhab offline
ptee132 commented 6 years ago

similar problem here: 2017-11-25 10:28:27.468 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: io.socket.engineio.client.EngineIOException: xhr poll error 2017-11-25 10:28:27.478 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = 01a9e9cf-905a-485b-8629-7576b8a822da, base URL = http://localhost:8080)

and after a while: 2017-11-25 10:31:55.439 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = 01a9e9cf-905a-485b-8629-7576b8a822da, base URL = http://localhost:8080)

any new insights? Thanks, Patrick

Pavion commented 6 years ago

Same error here with first regular occurrence about six hours ago. Trying to access my openHAB server through openHAB Cloud results in 502 Bad Gateway. Already reinstalled cloud service, checked secret keys, restarted openHAB, to no avail. Wish you good luck with finding the error! Best regards Pav

ghost commented 6 years ago

There's a discussion going on at https://community.openhab.org/t/openhab-2-losing-cloud-connection/36550/8 describing the error.

There's also an SSL certificate validation error for www.myopenhab.org which uses a wrong certificate.

kubawolanin commented 6 years ago

I'm experiencing the same issue for over a month of trial. Tried with a fresh account, generated new UUID and secret files.

I've observed that when I try to access https://myopenhab.org/start/index, in my logs I see "System started" rules being executed. It may be a coincidence, though.

More details in this post: https://community.openhab.org/t/myopenhab-socket-io-error-app-not-working/34420/6?u=kubawolanin

cc: @marziman @digitaldan

wborn commented 5 years ago

This is still an issue. See also my https://github.com/openhab/openhab2-addons/pull/4500#issuecomment-451695654:


It looks like the Cloud is not properly keeping track of offline/online instances.

According to the CloudClient logging the connection should still be up after several disconnects:

2019-01-05 19:38:59.570 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected
2019-01-05 19:38:59.570 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, base URL = http://localhost:8080)
2019-01-05 19:39:03.892 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: {}
io.socket.engineio.client.EngineIOException: xhr poll error
    at io.socket.engineio.client.Transport.onError(Transport.java:63) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.engineio.client.transports.PollingXHR.access$100(PollingXHR.java:26) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.engineio.client.transports.PollingXHR$6$1.run(PollingXHR.java:140) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.thread.EventThread$2.run(EventThread.java:80) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.io.IOException: 502
    at io.socket.engineio.client.transports.PollingXHR$Request$1.onResponse(PollingXHR.java:234) ~[?:?]
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206) ~[?:?]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
    ... 3 more
2019-01-05 19:39:03.897 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected
2019-01-05 19:39:03.897 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, base URL = http://localhost:8080)
2019-01-05 19:39:13.426 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: {}
io.socket.engineio.client.EngineIOException: xhr poll error
    at io.socket.engineio.client.Transport.onError(Transport.java:63) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.engineio.client.transports.PollingXHR.access$100(PollingXHR.java:26) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.engineio.client.transports.PollingXHR$6$1.run(PollingXHR.java:140) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at io.socket.thread.EventThread$2.run(EventThread.java:80) [246:org.openhab.io.openhabcloud:2.5.0.201901022358]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
    at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.io.IOException: 502
    at io.socket.engineio.client.transports.PollingXHR$Request$1.onResponse(PollingXHR.java:234) ~[?:?]
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206) ~[?:?]
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32) ~[?:?]
    ... 3 more
2019-01-05 19:39:13.430 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO disconnected
2019-01-05 19:39:13.431 [INFO ] [io.openhabcloud.internal.CloudClient] - Disconnected from the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, base URL = http://localhost:8080)
2019-01-05 19:39:21.398 [DEBUG] [io.openhabcloud.internal.CloudClient] - Socket.IO connected
2019-01-05 19:39:21.398 [INFO ] [io.openhabcloud.internal.CloudClient] - Connected to the openHAB Cloud service (UUID = xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx, base URL = http://localhost:8080)

Also if I raise logging on io.socket there is a connection because it ping/pongs normally:

2019-01-05 21:59:10.352 [DEBUG] [io.socket.engineio.client.Socket    ] - writing ping packet - expecting pong within 60000ms
2019-01-05 21:59:10.353 [DEBUG] [io.socket.engineio.client.Socket    ] - flushing 1 packets in socket
2019-01-05 21:59:10.379 [DEBUG] [io.socket.engineio.client.Socket    ] - socket received: type 'pong', data 'null'
2019-01-05 21:59:35.382 [DEBUG] [io.socket.engineio.client.Socket    ] - writing ping packet - expecting pong within 60000ms
2019-01-05 21:59:35.384 [DEBUG] [io.socket.engineio.client.Socket    ] - flushing 1 packets in socket
2019-01-05 21:59:35.418 [DEBUG] [io.socket.engineio.client.Socket    ] - socket received: type 'pong', data 'null'

The cloud keeps showing:

offline


As soon as you bundle:restart the openHAB Cloud Connector Bundle the Cloud again properly registers the instance as online:

online

There are already a few community topics and issues about this:


The openHAB Cloud Notification Actions also work while the Cloud thinks the instance is offline. This is another indicator that the connection is working properly (see community topic).

Can you have a look at this @digitaldan, @marziman?

rbi commented 5 years ago

I had the same issue:

2019-05-05 10:52:17.548 [ERROR] [io.openhabcloud.internal.CloudClient] - Socket.IO error: {}
io.socket.engineio.client.EngineIOException: xhr poll error
        at io.socket.engineio.client.Transport.onError(Transport.java:71) [246:org.openhab.io.openhabcloud:2.5.0.M1]
        at io.socket.engineio.client.transports.PollingXHR.access$100(PollingXHR.java:19) [246:org.openhab.io.openhabcloud:2.5.0.M1]
        at io.socket.engineio.client.transports.PollingXHR$6$1.run(PollingXHR.java:124) [246:org.openhab.io.openhabcloud:2.5.0.M1]
        at io.socket.thread.EventThread$2.run(EventThread.java:80) [246:org.openhab.io.openhabcloud:2.5.0.M1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:?]
        at java.lang.Thread.run(Thread.java:748) [?:?]
Caused by: java.io.IOException: 502
        at io.socket.engineio.client.transports.PollingXHR$Request$1.run(PollingXHR.java:218) ~[?:?]
        ... 1 more

Enabling debug log for io.socket revealted that the client tried to access this URL:

https://myopenhab.org/socket.io/?EIO=3&transport=polling&sid=<my_secret_sid>

The only thing that helped was to delete my openHAB cloud account and recreate it. Before that I executed the "dangerous action" "Delete Items and Events". Maybe the issue is related to that.

Edit: Recreating didn't help. I have 502 errors again. Just wasted another day with OpenHAB debugging. :-(

matushromada commented 4 years ago

I'm running the latest stable 2.4. I'm having the same issue:

Restarting the IO Cloud service temporarily resolves the problem until next glitch on Cloud server.

DavideCenc commented 4 years ago

same issue on latest stable 2.4:

giusanny commented 3 years ago

Follow, same issue here. Is there any workaround using some rules ?

javamaster10000 commented 3 years ago

Seem to have a similar issue. No disconnect message in openhab.log but myopenhab.org shows offline

codefalk commented 2 years ago

Seem to have a similar issue. No disconnect message in openhab.log but myopenhab.org shows offline

same here. And btw, why is "base URL = http://localhost:6080" logged in my openhab.log with "[INFO] io.openhabcloud.internal.CloudClient" messages? Should this be "https://myopenhab.org"?

System: OH 3.2.0.M1 on docker.´

lolodomo commented 2 years ago

I got the same issue last week as described in the first post. Logs are showing disconnections and reconnections but in fact remote access from Android app is not possible.

matushromada commented 2 years ago

It seems that there is still a bug that prevents an OH instance to reconnect to the cloud service after the cloud service had a problem. The workaround is to manually restart the OH connector or the entire OH instance.

lolodomo commented 2 years ago

Yes the restart is working but not obvious when you are in vacation, far from your local network.

ssalonen commented 2 years ago

This is still present (May 2022) with openHAB 3.2.0

It is exactly as @wborn said above 2 years ago https://github.com/openhab/openhab-cloud/issues/134#issuecomment-459117914

Two confirmed cases in community forum (https://community.openhab.org/t/openhab-cloud-connection-stops-working-does-not-try-to-reconnect/133608/34?u=ssalonen) tells that ping/pong is successfully being exchanged from client point of view. Still, myopenhab.org shows client being offline

ssalonen commented 1 year ago

Server side fixes recently: https://github.com/openhab/openhab-cloud/pull/381

ondrej1024 commented 1 year ago

Is the server side fix already online on myopenhab.org? And do we also need a fix on the client side? I am still having this issue with OH 3.1, no patches installed.

cherub-i commented 1 year ago

+1 to ondrej1024's question. @digitaldan, @ssalonen: would you be so kind to give us a quick update here?

I am having problems with disconnects too and I am hoping that they get resolved by this change.

Also: is there a way for us as users to see, which version/tag/commit is running on myopenhab.org? I looked around in the UI and in the HTML-source but did not find any version information.

ssalonen commented 1 year ago

I believe the servers have been updated but the problem remains I am afraid.

wborn commented 1 year ago

Unfortunately I also still ran into this issue multiple times this week. :worried:

ondrej1024 commented 1 year ago

This summer I was away from home for some weeks and never had an issue connecting remotely with the OpenHAB Android app to my OpenHAB instance at home. About a week ago the problems started and I got the connection errors every day. However now it's been working fine for at least 3 days. So maybe some updates have been done on the server.

ssalonen commented 1 year ago

Recent up-to-date summary of the issues from community user seanch https://community.openhab.org/t/openhab-cloud-connection-stops-working-does-not-try-to-reconnect/133608/54?u=ssalonen

As discussed in this thread, the client reconnects and receives responses from the server successfully (ping/pong). Even notifications work.

However, server consider client offline and access via myopenhab.org is not working.

Today I had the same problem happen again. Same as the last time:

Ping/pongs 100.0000% ok. I checked over 24 hours back in the log, and there was nothing wrong.

openhab app could not connect. Error message: “Your openHAB server is offline while the cloud instance is running.”

myopenhab.org 1 website says: " Your openHAB is not online. Please check if your installation is running or recheck the openHAB settings in your account."

Sending a notification did in fact work. It arrived as a notification on my phone instantly and could be display in the openhab app, however nothing else in the app worked as described above. I have a button in my house that has an associated rule that executes this openhab script code: sendNotification(“secret@email.com”, “Test notification.”)

Per usual, clicking the “Save” in the OpenHAB cloud menu fixed it all immediately.

FrankR59 commented 1 year ago

Any news here? My openHAB cloud connection went down yesterday several times. Now since 24 hrs it does not come up anymore. Nothing helped: not a restart of my openHAB machine (linux), not several restarts of openHAB, not an uninstall/reinstall of the binding, not dozens of restarts of the bundle via karaf. Internet connection of my openHAB machine is ok (pings to myopenhab.org shows 11.5 ms).

ssalonen commented 1 year ago

For me the connection works. There is occasional issues (needing manual action to recover), as described in this thread.

However, the iOS testflight app now does not seem to work, keeps on loading. Locally the app works flawlessly. This could be an regression with the app though since browser access via myopenhab works (EDIT: iOS issues were unrelated. Clean reinstall of app worked)

FrankR59 commented 1 year ago

Started working for me again after I changed uuid and secret. Maybe I caused some trouble with these settings during my several attempts to restart the bundle. Sorry!

Flole998 commented 1 year ago

Can't we just check on every ping/pong if we are online in myopenhab and if we aren't kill the connection so the client reconnects? Not perfect but seems like a nice idea to prevent connections being stuck. So on every ping run the check and only respond if returns "online", otherwise not respond at all.