shoenig / go-mqtt

A development fork of the Eclipse Paho Go MQTT client
http://git.eclipse.org/c/paho/org.eclipse.paho.mqtt.golang.git/
Eclipse Public License 1.0
13 stars 6 forks source link

An idle client never notices server is gone #23

Open shoenig opened 10 years ago

shoenig commented 10 years ago

I left a client connected while killing a server, and it never noticed the server went missing (even after ~3 minutes). I suspect if I had it publish a message it would have stopped, but the ping mechanism doesn't seem to recognize a broken connection.

shoenig commented 10 years ago

I can't seem to reproduce this with mosquitto, and debugging something like this on IMA is very tedious =\

shoenig commented 10 years ago

Well, it seems after ~2 hours of idling, the server disconnected the client, but not with a timed out message - it says it received a DISCONNECT from the client (id=subber, the other client id was some client on my phone). It seems the client didn't receive a PINGRESP and sent a DISCONNECT.

So.. why did it not receive a PINGRESP? It only goes into this disconnect phase if a PINGRESP is not received for 2x the timeout period. We can actually see in this client trace that the PINGRESP wasn't received in the last 60 seconds. (timeout was 30)

[pinger]   20:57:07.24329 [subber] last contact: 31 (timeout: 30)
[pinger]   20:57:07.24335 [subber] keepalive sending ping
[net]      20:57:07.24339 [subber] obound priority msg to write, type 12
[net]      20:57:07.24352 [subber] outgoing waiting for an outbound message
[net]      20:57:07.29431 [subber] incoming read 2 bytes of network data
[net]      20:57:07.29437 [subber] data: [208 0]
[net]      20:57:07.29439 [subber] incoming received inbound message, type 13
[net]      20:57:07.29441 [subber] incoming waiting for network data
[net]      20:57:07.29444 [subber] logic got msg on ibound, type 13
[net]      20:57:07.29446 [subber] received pingresp
[net]      20:57:07.29450 [subber] logic waiting for msg on ibound
[pinger]   20:57:08.24352 [subber] last contact: 1 (timeout: 30)
[pinger]   20:57:09.24372 [subber] last contact: 2 (timeout: 30)
[pinger]   20:57:10.24392 [subber] last contact: 3 (timeout: 30)
[pinger]   20:57:11.24412 [subber] last contact: 4 (timeout: 30)
[pinger]   20:57:12.24432 [subber] last contact: 5 (timeout: 30)
[pinger]   20:57:13.24451 [subber] last contact: 6 (timeout: 30)
[pinger]   20:57:14.24471 [subber] last contact: 7 (timeout: 30)
[pinger]   20:57:15.24490 [subber] last contact: 8 (timeout: 30)
[pinger]   20:57:16.24510 [subber] last contact: 9 (timeout: 30)
[pinger]   20:57:17.24529 [subber] last contact: 10 (timeout: 30)
[pinger]   20:57:18.24552 [subber] last contact: 11 (timeout: 30)
[pinger]   20:57:19.24572 [subber] last contact: 12 (timeout: 30)
[pinger]   20:57:20.24592 [subber] last contact: 13 (timeout: 30)
[pinger]   20:57:21.24612 [subber] last contact: 14 (timeout: 30)
[pinger]   20:57:22.24633 [subber] last contact: 15 (timeout: 30)
[pinger]   20:57:23.24653 [subber] last contact: 16 (timeout: 30)
[pinger]   20:57:24.24672 [subber] last contact: 17 (timeout: 30)
[pinger]   20:57:25.24692 [subber] last contact: 18 (timeout: 30)
[pinger]   20:57:26.24711 [subber] last contact: 19 (timeout: 30)
[pinger]   20:57:27.24731 [subber] last contact: 20 (timeout: 30)
[pinger]   20:57:28.24750 [subber] last contact: 21 (timeout: 30)
[pinger]   20:57:29.24769 [subber] last contact: 22 (timeout: 30)
[pinger]   20:57:30.24790 [subber] last contact: 23 (timeout: 30)
[pinger]   20:57:31.24809 [subber] last contact: 24 (timeout: 30)
[pinger]   20:57:32.24829 [subber] last contact: 25 (timeout: 30)
[pinger]   20:57:33.24850 [subber] last contact: 26 (timeout: 30)
[pinger]   20:57:34.24872 [subber] last contact: 27 (timeout: 30)
[pinger]   20:57:35.24891 [subber] last contact: 28 (timeout: 30)
[pinger]   20:57:36.24910 [subber] last contact: 29 (timeout: 30)
[pinger]   20:57:37.24930 [subber] last contact: 30 (timeout: 30)
[pinger]   20:57:38.24950 [subber] last contact: 31 (timeout: 30)
[pinger]   20:57:38.24955 [subber] keepalive sending ping
[net]      20:57:38.24958 [subber] obound priority msg to write, type 12
[net]      20:57:38.24967 [subber] outgoing waiting for an outbound message
[net]      20:57:38.29991 [subber] incoming read 2 bytes of network data
[net]      20:57:38.29996 [subber] data: [208 0]
[net]      20:57:38.29999 [subber] incoming received inbound message, type 13
[net]      20:57:38.30001 [subber] incoming waiting for network data
[net]      20:57:38.30004 [subber] logic got msg on ibound, type 13
[net]      20:57:38.30006 [subber] received pingresp
[net]      20:57:38.30007 [subber] logic waiting for msg on ibound
[pinger]   20:57:39.24972 [subber] last contact: 1 (timeout: 30)
[pinger]   20:57:40.24991 [subber] last contact: 2 (timeout: 30)
[pinger]   20:57:41.25011 [subber] last contact: 3 (timeout: 30)
[pinger]   20:57:42.25031 [subber] last contact: 4 (timeout: 30)
[pinger]   20:57:43.25051 [subber] last contact: 5 (timeout: 30)
[pinger]   20:57:44.25070 [subber] last contact: 6 (timeout: 30)
[pinger]   20:57:45.25089 [subber] last contact: 7 (timeout: 30)
[pinger]   20:57:46.25108 [subber] last contact: 8 (timeout: 30)
[pinger]   20:57:47.25129 [subber] last contact: 9 (timeout: 30)
[pinger]   20:57:48.25151 [subber] last contact: 10 (timeout: 30)
[pinger]   20:57:49.25168 [subber] last contact: 11 (timeout: 30)
[pinger]   20:57:50.25189 [subber] last contact: 12 (timeout: 30)
[pinger]   20:57:51.25209 [subber] last contact: 13 (timeout: 30)
[pinger]   20:57:52.25228 [subber] last contact: 14 (timeout: 30)
[pinger]   20:57:53.25249 [subber] last contact: 15 (timeout: 30)
[pinger]   20:57:54.25270 [subber] last contact: 16 (timeout: 30)
[pinger]   20:57:55.25289 [subber] last contact: 17 (timeout: 30)
[pinger]   20:57:56.25309 [subber] last contact: 18 (timeout: 30)
[pinger]   20:57:57.25329 [subber] last contact: 19 (timeout: 30)
[pinger]   20:57:58.25351 [subber] last contact: 20 (timeout: 30)
[pinger]   20:57:59.25371 [subber] last contact: 21 (timeout: 30)
[pinger]   20:58:00.25392 [subber] last contact: 22 (timeout: 30)
[pinger]   20:58:01.25411 [subber] last contact: 23 (timeout: 30)
[pinger]   20:58:02.25430 [subber] last contact: 24 (timeout: 30)
[pinger]   20:58:03.25450 [subber] last contact: 25 (timeout: 30)
[pinger]   20:58:04.25470 [subber] last contact: 26 (timeout: 30)
[pinger]   20:58:05.25490 [subber] last contact: 27 (timeout: 30)
[pinger]   20:58:06.25509 [subber] last contact: 28 (timeout: 30)
[pinger]   20:58:07.25529 [subber] last contact: 29 (timeout: 30)
[pinger]   20:58:08.25550 [subber] last contact: 30 (timeout: 30)
[pinger]   20:58:09.25569 [subber] last contact: 31 (timeout: 30)
[pinger]   20:58:09.25574 [subber] keepalive sending ping
[net]      20:58:09.25578 [subber] obound priority msg to write, type 12
[net]      20:58:09.25586 [subber] outgoing waiting for an outbound message
[net]      20:58:09.31118 [subber] incoming read 2 bytes of network data
[net]      20:58:09.31123 [subber] data: [208 0]
[net]      20:58:09.31125 [subber] incoming received inbound message, type 13
[net]      20:58:09.31126 [subber] incoming waiting for network data
[net]      20:58:09.31129 [subber] logic got msg on ibound, type 13
[net]      20:58:09.31131 [subber] received pingresp
[net]      20:58:09.31132 [subber] logic waiting for msg on ibound
[pinger]   20:58:10.25592 [subber] last contact: 1 (timeout: 30)
[pinger]   20:58:11.25612 [subber] last contact: 2 (timeout: 30)
[pinger]   20:58:12.25632 [subber] last contact: 3 (timeout: 30)
[pinger]   20:58:13.25651 [subber] last contact: 4 (timeout: 30)
[pinger]   20:58:14.25670 [subber] last contact: 5 (timeout: 30)
[pinger]   20:58:15.25690 [subber] last contact: 6 (timeout: 30)
[pinger]   20:58:16.25709 [subber] last contact: 7 (timeout: 30)
[pinger]   20:58:17.25729 [subber] last contact: 8 (timeout: 30)
[pinger]   20:58:18.25751 [subber] last contact: 9 (timeout: 30)
[pinger]   20:58:19.25771 [subber] last contact: 10 (timeout: 30)
[pinger]   20:58:20.25791 [subber] last contact: 11 (timeout: 30)
[pinger]   20:58:21.25810 [subber] last contact: 12 (timeout: 30)
[pinger]   20:58:22.25830 [subber] last contact: 13 (timeout: 30)
[pinger]   20:58:23.25849 [subber] last contact: 14 (timeout: 30)
[pinger]   20:58:24.25869 [subber] last contact: 15 (timeout: 30)
[pinger]   20:58:25.25889 [subber] last contact: 16 (timeout: 30)
[pinger]   20:58:26.25908 [subber] last contact: 17 (timeout: 30)
[pinger]   20:58:27.25927 [subber] last contact: 18 (timeout: 30)
[pinger]   20:58:28.25950 [subber] last contact: 19 (timeout: 30)
[pinger]   20:58:29.25971 [subber] last contact: 20 (timeout: 30)
[pinger]   20:58:30.25990 [subber] last contact: 21 (timeout: 30)
[pinger]   20:58:31.26010 [subber] last contact: 22 (timeout: 30)
[pinger]   20:58:32.26027 [subber] last contact: 23 (timeout: 30)
[pinger]   20:58:33.26048 [subber] last contact: 24 (timeout: 30)
[pinger]   20:58:34.26069 [subber] last contact: 25 (timeout: 30)
[pinger]   20:58:35.26088 [subber] last contact: 26 (timeout: 30)
[pinger]   20:58:36.26107 [subber] last contact: 27 (timeout: 30)
[pinger]   20:58:37.26127 [subber] last contact: 28 (timeout: 30)
[pinger]   20:58:38.26150 [subber] last contact: 29 (timeout: 30)
[pinger]   20:58:39.26170 [subber] last contact: 30 (timeout: 30)
[pinger]   20:58:40.26190 [subber] last contact: 31 (timeout: 30)
[pinger]   20:58:40.26195 [subber] keepalive sending ping
[net]      20:58:40.26198 [subber] obound priority msg to write, type 12
[net]      20:58:40.26207 [subber] outgoing waiting for an outbound message
[net]      20:58:40.32535 [subber] incoming read 2 bytes of network data
[net]      20:58:40.32540 [subber] data: [208 0]
[net]      20:58:40.32542 [subber] incoming received inbound message, type 13
[net]      20:58:40.32544 [subber] incoming waiting for network data
[net]      20:58:40.32551 [subber] logic got msg on ibound, type 13
[net]      20:58:40.32553 [subber] received pingresp
[net]      20:58:40.32555 [subber] logic waiting for msg on ibound
[pinger]   20:58:41.26212 [subber] last contact: 1 (timeout: 30)
[pinger]   20:58:42.26231 [subber] last contact: 2 (timeout: 30)
[pinger]   20:58:43.26251 [subber] last contact: 3 (timeout: 30)
[pinger]   20:58:44.26270 [subber] last contact: 4 (timeout: 30)
[pinger]   20:58:45.26290 [subber] last contact: 5 (timeout: 30)
[pinger]   20:58:46.26311 [subber] last contact: 6 (timeout: 30)
[pinger]   20:58:47.26330 [subber] last contact: 7 (timeout: 30)
[pinger]   20:58:48.26350 [subber] last contact: 8 (timeout: 30)
[pinger]   20:58:49.26370 [subber] last contact: 9 (timeout: 30)
[pinger]   20:58:50.26390 [subber] last contact: 10 (timeout: 30)
[pinger]   20:58:51.26409 [subber] last contact: 11 (timeout: 30)
[pinger]   20:58:52.26428 [subber] last contact: 12 (timeout: 30)
[pinger]   20:58:53.26449 [subber] last contact: 13 (timeout: 30)
[pinger]   20:58:54.26469 [subber] last contact: 14 (timeout: 30)
[pinger]   20:58:55.26489 [subber] last contact: 15 (timeout: 30)
[pinger]   20:58:56.26508 [subber] last contact: 16 (timeout: 30)
[pinger]   20:58:57.26529 [subber] last contact: 17 (timeout: 30)
[pinger]   20:58:58.26551 [subber] last contact: 18 (timeout: 30)
[pinger]   20:58:59.26571 [subber] last contact: 19 (timeout: 30)
[pinger]   20:59:00.26591 [subber] last contact: 20 (timeout: 30)
[pinger]   20:59:01.26610 [subber] last contact: 21 (timeout: 30)
[pinger]   20:59:02.26630 [subber] last contact: 22 (timeout: 30)
[pinger]   20:59:03.26650 [subber] last contact: 23 (timeout: 30)
[pinger]   20:59:04.26670 [subber] last contact: 24 (timeout: 30)
[pinger]   20:59:05.26689 [subber] last contact: 25 (timeout: 30)
[pinger]   20:59:06.26708 [subber] last contact: 26 (timeout: 30)
[pinger]   20:59:07.26728 [subber] last contact: 27 (timeout: 30)
[pinger]   20:59:08.26749 [subber] last contact: 28 (timeout: 30)
[pinger]   20:59:09.26770 [subber] last contact: 29 (timeout: 30)
[pinger]   20:59:10.26791 [subber] last contact: 30 (timeout: 30)
[pinger]   20:59:11.26811 [subber] last contact: 31 (timeout: 30)
[pinger]   20:59:11.26817 [subber] keepalive sending ping
[net]      20:59:11.26820 [subber] obound priority msg to write, type 12
[net]      20:59:11.26830 [subber] outgoing waiting for an outbound message
[pinger]   20:59:12.26834 [subber] last contact: 1 (timeout: 30)
[pinger]   20:59:13.26854 [subber] last contact: 2 (timeout: 30)
[pinger]   20:59:14.26875 [subber] last contact: 3 (timeout: 30)
[pinger]   20:59:15.26895 [subber] last contact: 4 (timeout: 30)
[pinger]   20:59:16.26915 [subber] last contact: 5 (timeout: 30)
[pinger]   20:59:17.26935 [subber] last contact: 6 (timeout: 30)
[pinger]   20:59:18.26955 [subber] last contact: 7 (timeout: 30)
[pinger]   20:59:19.26974 [subber] last contact: 8 (timeout: 30)
[pinger]   20:59:20.26994 [subber] last contact: 9 (timeout: 30)
[pinger]   20:59:21.27014 [subber] last contact: 10 (timeout: 30)
[pinger]   20:59:22.27034 [subber] last contact: 11 (timeout: 30)
[pinger]   20:59:23.27054 [subber] last contact: 12 (timeout: 30)
[pinger]   20:59:24.27073 [subber] last contact: 13 (timeout: 30)
[pinger]   20:59:25.27093 [subber] last contact: 14 (timeout: 30)
[pinger]   20:59:26.27113 [subber] last contact: 15 (timeout: 30)
[pinger]   20:59:27.27133 [subber] last contact: 16 (timeout: 30)
[pinger]   20:59:28.27154 [subber] last contact: 17 (timeout: 30)
[pinger]   20:59:29.27175 [subber] last contact: 18 (timeout: 30)
[pinger]   20:59:30.27191 [subber] last contact: 19 (timeout: 30)
[pinger]   20:59:31.27210 [subber] last contact: 20 (timeout: 30)
[pinger]   20:59:32.27231 [subber] last contact: 21 (timeout: 30)
[pinger]   20:59:33.27251 [subber] last contact: 22 (timeout: 30)
[pinger]   20:59:34.27271 [subber] last contact: 23 (timeout: 30)
[pinger]   20:59:35.27291 [subber] last contact: 24 (timeout: 30)
[pinger]   20:59:36.27310 [subber] last contact: 25 (timeout: 30)
[pinger]   20:59:37.27330 [subber] last contact: 26 (timeout: 30)
[pinger]   20:59:38.27353 [subber] last contact: 27 (timeout: 30)
[pinger]   20:59:39.27373 [subber] last contact: 28 (timeout: 30)
[pinger]   20:59:40.27392 [subber] last contact: 29 (timeout: 30)
[pinger]   20:59:41.27411 [subber] last contact: 30 (timeout: 30)
[pinger]   20:59:42.27427 [subber] last contact: 31 (timeout: 30)
[pinger]   20:59:42.27431 [subber] pingresp not received, disconnecting
[client]   20:59:42.27433 [subber] disconnected
[store]    20:59:42.27434 [subber] memorystore closed
[net]      20:59:42.27436 [subber] logic stopped
[net]      20:59:42.27437 [subber] obound priority msg to write, type 14
[net]      20:59:42.27440 [subber] outbound wrote disconnect, now closing connection
[net]      20:59:42.27453 [subber] incoming read 0 bytes of network data
[net]      20:59:42.27455 [subber] incoming stopped
[pinger]   20:59:43.27451 [subber] keepalive stopped
2013-11-28T08:18:30.079-06:00 CWLNA1117 info Connection imaserver 25310: Create mqtt connection: ConnectionID=12 ClientID="subber" Endpoint="GoMqttEP1" UserID="".
2013-11-28T10:03:50.809-06:00 CWLNA1111 notice Connection imaserver 25310: Closing TCP connection: ConnectionID=11 ClientID=".1385648169577" Protocol=mqtt-tcp Endpoint="GoMqttEP1" UserID="" Uptime=6460 RC=-1 Clean=0 Reason="The connection has timed out" ReadBytes=380 ReadMsg=11 WriteBytes=212 WriteMsg=0 LostMsg=0.
2013-11-28T20:59:42.323-06:00 CWLNA1111 notice Connection imaserver 25310: Closing TCP connection: ConnectionID=12 ClientID="subber" Protocol=mqtt-tcp Endpoint="GoMqttEP1" UserID="" Uptime=45672 RC=0 Clean=1 Reason="Disconnect" ReadBytes=2,981 ReadMsg=0 WriteBytes=2,963 WriteMsg=1 LostMsg=0.