odelot / aws-mqtt-websockets

Implementation of a middleware to use AWS MQTT service through websockets, aiming the ESP8266 plataform
GNU Lesser General Public License v3.0
231 stars 67 forks source link

Can't get disconnected info when internet disconnection #32

Closed handgear closed 6 years ago

handgear commented 6 years ago

Hello, Is there a way to get disconnect info when internet is off-line? I'm trying to get connected status in loop by MQTT::Client->isConnected() and AWSWebSocketClient->connected().

In normal case it send & receive heartbeat(?) once a minute.

[WS][0][sendFrame]` ------- send massage frame -------
[WS][0][sendFrame] fin: 1 opCode: 2 mask: 1 length: 2 headerToPayload: 0
[WS][0][sendFrame] pack to one TCP package...
[write] n: 8 t: 186387
[WS][0][sendFrame] sending Frame Done (9153us).

[WS][0][handleWebsocketWaitFor] size: 2 cWsRXsize: 0
[readCb] n: 2 t: 186421
[WS][0][handleWebsocketWaitFor][readCb] size: 2 ok: 1
[WS][0][handleWebsocket] ------- read massage frame -------
[WS][0][handleWebsocket] fin: 1 rsv1: 0 rsv2: 0 rsv3 0  opCode: 2
[WS][0][handleWebsocket] mask: 0 payloadLen: 2
[readCb] n: 2 t: 186436

When the internet is off-line, after about 1min, it send below and never send it again. (situation pretend sudden disconnection after normal state)

[WS][0][sendFrame] ------- send massage frame -------
[WS][0][sendFrame] fin: 1 opCode: 2 mask: 1 length: 2 headerToPayload: 0
[WS][0][sendFrame] pack to one TCP package...
[write] n: 8 t: 246673
[WS][0][sendFrame] sending Frame Done (9062us).

In this situation(off-line), MQTT::Client->isConnected() and AWSWebSocketClient->connected() are both true during loop. It actually detects the disconnection after the internet recovered.

log after internet recovered:

[WS-Client] connection lost.
[WS-Client] client disconnected.
[WS-Client] connect wss...
[WS-Client] connected to qwertyqwerty.iot.ap-northeast-2.amazonaws.com:443.
[WS-Client][sendHeader] sending header...
[WS-Client][sendHeader] handshake GET /mqtt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=qwertyqwerty%2Fqwertyqwerty%2Fiotdevicegateway%2Faws4_request&X-Amz-Date=20180108T061032Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=qwertyqwerty HTTP/1.1
Host: qwertyqwerty.iot.ap-northeast-2.amazonaws.com:443
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: qwertyqwerty==
Origin: file://
User-Agent: arduino-WebSocket-Client

[write] n: 535 t: 456279
[WS-Client][sendHeader] sending header... Done (58594us).

in this point MQTT::Client->isConnected() == true, AWSWebSocketClient->connected() == false in loop

458176 - conn: 2 - (13376)
[WS][0][handleWebsocket] clientDisconnect code: 1000
[WS-Client] client disconnected.
[WS-Client] connect wss...
[WS-Client] connected to qwertyqwerty.ap-northeast-2.amazonaws.com:443.
[WS-Client][sendHeader] sending header...
[WS-Client][sendHeader] handshake GET /mqtt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=qwertyqwerty%2Fap-northeast-2%2Fiotdevicegateway%2Faws4_request&X-Amz-Date=20180108T061800Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=qwertyqwerty HTTP/1.1
Host: qwertyqwerty.iot.ap-northeast-2.amazonaws.com:443
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: qwertyqwerty==
Sec-WebSocket-Protocol: mqtt
Origin: file://
User-Agent: arduino-WebSocket-Client

and after this, it init webSocket and connect MQTT

Is there a way to get disconnect info when internet is off-line?


I tried with latest eclipse/paho.mqtt.embedded-c (release vertion 1.1.0) wonder this was my case(https://github.com/eclipse/paho.mqtt.embedded-c/commit/550912494d54170d862d99c8a8899e24c9e93c6c)

interestingly with this MQTTClient.h, MQTT::Client->isConnected() is false after MQTT connection. So it reconnect session after every 1min.

odelot commented 6 years ago

When we had fixed the 2.3.0 version (https://github.com/esp8266/Arduino/issues/3537) , the write function returns false after 25 tries and then the mqtt server was disconnected .

I made a quick test in the 2.4.0 (they release the final version a week ago) and saw that it does not disconnected from the server, but also does not enter in a endless loop as it was doing lately. I think they just flush the data they cannot send. As I dont need to have the server disconnected when there is no internet connection I didn't investigate further for now.

I am using the version 1.0.0 of Paho client, so I think version 1.1.0 is not the problem.

I've also noticed that after the first connection, all others connection closes fast. I thought it was a amazon issue, but as you said, it seems to be a paho client issue.

I will try to investigate this issue in the next days (although I cant promisse). Paho could have a timeout for the heartbeat message and disconnect automatically. Maybe we could try to use another client, as pubsubclient.

handgear commented 6 years ago

Thanks for your consideration. I tried to use ping as an alternative way to detect internet off-line and it is working well so far. this doesn't detect server failure but in most case, aws iot server seem to be reliable. quick solution...

I'll try with pubsubclient. :D

odelot commented 6 years ago

Think I fixed the disconnection issue. It is in a two day run with just 4 reconnections. Give it a try if you can.

handgear commented 6 years ago

The reconnection issue seems fixed. Thanks for your consideration.

I'm still using extra ping (https://github.com/dancol90/ESP8266Ping) to detect internet disconnection and this method is not bad. not really need to get info from MQTT object.

odelot commented 6 years ago

I am still getting disconnection issues. :-(

It seems that amazon is sending a disconnect message, but it does not make sense, because after I reset the device it stops from disconnect.

It disconnects after what it seems to be a disconnection message. The log I have is this:

[WS][0][handleWebsocketWaitFor] size: 2 cWsRXsize: 0 [readCb] n: 2 t: 15184727 [WS][0][handleWebsocketWaitFor][readCb] size: 2 ok: 1 [WS][0][handleWebsocket] ------- read massage frame ------- [WS][0][handleWebsocket] fin: 1 rsv1: 0 rsv2: 0 rsv3 0 opCode: 8 [WS][0][handleWebsocket] mask: 0 payloadLen: 0 [WS][0][handleWebsocket] get ask for close. Code: 1000 [WS][0][handleWebsocket] clientDisconnect code: 1000 [WS][0][sendFrame] ------- send massage frame ------- [WS][0][sendFrame] fin: 1 opCode: 8 mask: 0 length: 2 headerToPayload: 0

@handgear were you having this issue?

handgear commented 6 years ago

I'm also not sure where that clientDisconnect code: 1000 coming from. Currently using Paho ver 1.0.0 and dosn't seems to receive clientDisconnect code: 1000 (although I didn't trace the log long enough) clientDisconnect code: 1000 only appear after when internet was inaccessible and then recovered

when using latest paho library i get

[WS][0][handleWebsocketWaitFor] size: 2 cWsRXsize: 0 [readCb] n: 2 t: 112105 [WS][0][handleWebsocketWaitFor][readCb] size: 2 ok: 1 [WS][0][handleWebsocket] ------- read massage frame ------- [WS][0][handleWebsocket] fin: 1 rsv1: 0 rsv2: 0 rsv3 0 opCode: 8 [WS][0][handleWebsocket] mask: 0 payloadLen: 0 [WS][0][handleWebsocket] get ask for close. Code: 1000 [WS][0][handleWebsocket] clientDisconnect code: 1000 [WS][0][sendFrame] ------- send massage frame ------- [WS][0][sendFrame] fin: 1 opCode: 8 mask: 0 length: 2 headerToPayload: 0 [WS][0][sendFrame] pack to one TCP package... [write] n: 4 t: 112144 [write] not connected! [WS][0][sendFrame] sending Frame Done (18011us). [WS-Client] client disconnected.

odelot commented 6 years ago

@handgear I am still facing this problem. First connection last hours, but next ones just a couple of minutes.

Do you had some progress on it? I am trying to see the websocket layer logs to see why the server is asking to close the connection. (following the suggestion from https://github.com/Links2004/arduinoWebSockets/issues/294)

Other idea is to change paho client to pubsubclient as I already mention to you. This would try to isolate the mqtt client from this bizarre behavior