RubyDevInc / paho.mqtt.ruby

Eclipse Public License 1.0
31 stars 19 forks source link

Periodic disconnects #35

Open arizz96 opened 6 years ago

arizz96 commented 6 years ago

Hi, I configured the Paho client as persistent, but I'm periodically having reconnect attempts due to "no activity", why this happens? My Rails app is using the client to send a "ping" to my MQTT broken every 5sec. I'm attaching a log:

D, [2018-05-14T09:18:18.408485 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-14T09:18:18.408533 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-14T09:18:18.409094 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-14T09:18:18.409137 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-14T09:18:18.411795 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-14T09:18:18.414599 #1] DEBUG -- : Connection accepted
D, [2018-05-14T09:18:18.414625 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-14T09:18:31.647347 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:18:31.652623 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-14T09:19:13.647935 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:19:13.653228 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-14T09:19:24.418275 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-14T09:19:24.418328 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-14T09:19:24.418711 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-14T09:19:24.418760 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-14T09:19:24.421943 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-14T09:19:24.424065 #1] DEBUG -- : Connection accepted
D, [2018-05-14T09:19:24.424092 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-14T09:19:55.649051 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:19:55.654335 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-14T09:20:30.425640 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-14T09:20:30.425693 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-14T09:20:30.426063 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-14T09:20:30.426125 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-14T09:20:30.427931 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-14T09:20:30.430158 #1] DEBUG -- : Connection accepted
D, [2018-05-14T09:20:30.430184 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-14T09:20:37.652151 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:20:37.657470 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-14T09:21:19.655463 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:21:19.660764 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-14T09:21:36.434613 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-14T09:21:36.434675 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-14T09:21:36.434833 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-14T09:21:36.434896 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-14T09:21:36.437511 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-14T09:21:36.439761 #1] DEBUG -- : Connection accepted
D, [2018-05-14T09:21:36.439802 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-14T09:22:01.655819 #40] DEBUG -- : Checking if server is still alive...
I, [2018-05-14T09:22:01.661134 #40]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
p-goudet commented 6 years ago

Thank you for this issues. I will check the ping function to find what is wrong.

p-goudet commented 6 years ago

Hello @arizz96,

In the version 1.0.10, I have tried to fix the ping bugs with seems to cause this bugs. Could you tell me if the version 1.0.10 fix it?

arizz96 commented 6 years ago

It's still happening:

D, [2018-05-18T19:00:07.185115 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-18T19:00:07.185163 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-18T19:00:07.185605 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-18T19:00:07.185637 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-18T19:00:07.222626 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-18T19:00:07.224248 #1] DEBUG -- : Connection accepted
D, [2018-05-18T19:00:07.224275 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-18T19:01:13.226803 #1] DEBUG -- : No activity period over timeout, disconnecting from mqtt.example.com.
D, [2018-05-18T19:01:13.226843 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-18T19:01:13.227498 #1] DEBUG -- : New reconnect attempt...
W, [2018-05-18T19:01:13.227559 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
W, [2018-05-18T19:01:13.229872 #1]  WARN -- : Attempt to connect to host: mqtt.example.com
D, [2018-05-18T19:01:13.231446 #1] DEBUG -- : Connection accepted
D, [2018-05-18T19:01:13.231467 #1] DEBUG -- : No previous session found by server, starting a new one.
p-goudet commented 6 years ago

I could not reproduce this bug, could you give some samples to reproduce it? From the log, it seems that the client do not try keep the connection alive.

I tried the following code with produce the associated logs.

require 'paho-mqtt'

PahoMqtt.logger = 'paho_log'
@client = PahoMqtt::Client.new(persistent: true, keep_alive: 10)

@client.connect('localhost', 1883)

loop do
  sleep 5
end
D, [2018-05-22T15:22:36.074694 #21221] DEBUG -- : Attempt to connect to host: localhost...
D, [2018-05-22T15:22:36.076962 #21221] DEBUG -- : Attempt to connect to host: localhost...
D, [2018-05-22T15:22:36.079349 #21221] DEBUG -- : Connection accepted
D, [2018-05-22T15:22:36.079419 #21221] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-22T15:22:43.082159 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:22:43.084949 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-22T15:22:50.083467 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:22:50.086314 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-22T15:22:57.087683 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:22:57.090427 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-22T15:23:04.089836 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:23:04.092382 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-22T15:23:11.090917 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:23:11.093718 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-22T15:23:18.092230 #21221] DEBUG -- : Checking if server is still alive...
I, [2018-05-22T15:23:18.094976 #21221]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
arizz96 commented 6 years ago

Hi @p-goudet, here there are another portion of logs, which includes Pingresp, but the problem still there. The client is configured to be persistent and a keep_alive timer set to 10.

D, [2018-05-24T09:41:17.721391 #1] DEBUG -- : Connection accepted
D, [2018-05-24T09:41:17.721459 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-24T09:41:19.754757 #57] DEBUG -- : Checking if server is still alive...
I, [2018-05-24T09:41:19.757073 #57]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-24T09:41:26.756925 #57] DEBUG -- : Checking if server is still alive...
I, [2018-05-24T09:41:26.759211 #57]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-24T09:41:28.722551 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:28.722700 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:28.722920 #1] DEBUG -- : New reconnect attempt...
D, [2018-05-24T09:41:28.723012 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-05-24T09:41:28.725222 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-05-24T09:41:28.727889 #1] DEBUG -- : Connection accepted
D, [2018-05-24T09:41:28.727984 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-24T09:41:33.760341 #57] DEBUG -- : Checking if server is still alive...
I, [2018-05-24T09:41:33.762768 #57]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-24T09:41:39.728080 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:39.728182 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:39.728314 #1] DEBUG -- : New reconnect attempt...
D, [2018-05-24T09:41:39.728376 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-05-24T09:41:39.730407 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-05-24T09:41:39.733212 #1] DEBUG -- : Connection accepted
D, [2018-05-24T09:41:39.733300 #1] DEBUG -- : No previous session found by server, starting a new one.
D, [2018-05-24T09:41:40.761847 #57] DEBUG -- : Checking if server is still alive...
I, [2018-05-24T09:41:40.764188 #57]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-24T09:41:47.765031 #57] DEBUG -- : Checking if server is still alive...
I, [2018-05-24T09:41:47.767391 #57]  INFO -- : New packet PahoMqtt::Packet::Pingresp received.
D, [2018-05-24T09:41:50.736336 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:50.736415 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-05-24T09:41:50.736668 #1] DEBUG -- : New reconnect attempt...
p-goudet commented 6 years ago

Thank you very much, it seems to be related with updating of the last packet reception time. I will investigate on this side.

p-goudet commented 6 years ago

@arizz96 The version 1.0.12 have been released. Could you tell me if this issue still occur?

arizz96 commented 6 years ago

@p-goudet seems that nothing has changed (tried with qos = 0, 1, 2)

D, [2018-06-09T10:49:38.556136 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-06-09T10:49:38.556185 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-06-09T10:49:38.556296 #1] DEBUG -- : New reconnect attempt...
D, [2018-06-09T10:49:38.556337 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-06-09T10:49:38.558921 #1] DEBUG -- : Connection accepted
D, [2018-06-09T10:49:38.558940 #1] DEBUG -- : No previous session found by server, starting a new one.
I, [2018-06-09T10:49:41.515844 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubrec received.
I, [2018-06-09T10:49:41.520128 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubcomp received.
I, [2018-06-09T10:49:46.517127 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubrec received.
I, [2018-06-09T10:49:46.521426 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubcomp received.
D, [2018-06-09T10:49:49.559460 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-06-09T10:49:49.559501 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-06-09T10:49:49.559593 #1] DEBUG -- : New reconnect attempt...
D, [2018-06-09T10:49:49.559624 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-06-09T10:49:49.562288 #1] DEBUG -- : Connection accepted
D, [2018-06-09T10:49:49.562315 #1] DEBUG -- : No previous session found by server, starting a new one.
I, [2018-06-09T10:49:51.519212 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubrec received.
I, [2018-06-09T10:49:51.523516 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubcomp received.
I, [2018-06-09T10:49:56.516749 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubrec received.
I, [2018-06-09T10:49:56.521104 #10962]  INFO -- : New packet PahoMqtt::Packet::Pubcomp received.
D, [2018-06-09T10:50:00.564928 #1] DEBUG -- : No activity is over timeout, disconnecting from mqtt.example.com.
D, [2018-06-09T10:50:00.564970 #1] DEBUG -- : Disconnecting from mqtt.example.com.
D, [2018-06-09T10:50:00.565071 #1] DEBUG -- : New reconnect attempt...
D, [2018-06-09T10:50:00.565109 #1] DEBUG -- : Attempt to connect to host: mqtt.example.com...
D, [2018-06-09T10:50:00.567765 #1] DEBUG -- : Connection accepted
D, [2018-06-09T10:50:00.567820 #1] DEBUG -- : No previous session found by server, starting a new one.
p-goudet commented 6 years ago

@arizz96 Could you tell me if you are setting up the keep_alive value for the client (default: 60s)?

arizz96 commented 6 years ago

@p-goudet yes, I left it to the default value

p-goudet commented 6 years ago

@arizz96 I have try to reproduce your issues but unfortunately I could not. Could you try the following code in your environment and post here the output? This is a simplified process of how the keep alive features is handled in the client.

require 'thread'

KEEP_ALIVE = 10
@stub_queue = Queue.new
@now = Time.now
puts "Starting time is #{@now}"
@timeout = @now + (KEEP_ALIVE * 1.1).ceil
@ping_timeout = @now + (KEEP_ALIVE * 0.7).ceil
@cnt = 0
@limit = 3

def stub_read
  begin
    @stub_queue.pop(true)
    @timeout = @now + (KEEP_ALIVE * 1.1).ceil
    puts "Update timeout to #{@timeout}"
  rescue ThreadError
  end
end

def stub_write
  unless @cnt >= @limit
    @stub_queue << "hoge"
    @ping_timeout = @now + (KEEP_ALIVE * 0.7).ceil
    puts "Update ping timeout to #{@ping_timeout}"
    @cnt += 1
  end
end

Thread.new do
  loop do
    stub_write
    stub_read
    sleep 1
  end
end

loop do
  @now = Time.now
  puts "Should send ping" if @ping_timeout < @now
  break if @timeout < @now
  sleep 1
end

puts "Ending because time is (#{@now}) and timeout is #{@timeout}"
oldgreen commented 6 years ago

@p-goudet the problem is in the (handler and) connection_helper.check_keep_alive. If you keep publishing messages with QoS 0 in the time frame of keepalive interval, then you update sender.last_ping_req, but you'll get no response (which is correct, see Table 3.4 in mqtt 3.1.1 spec) so you'll never update handler.last_ping_resp. So you send no pings in connection_helper.check_keep_alive and eventually the check timeout_resp <= now in that method triggers a disconnect.

Please read the spec paragraph 3.1.2.10:

The Keep Alive is [...] the maximum time interval that is permitted to elapse
between the point at which the Client finishes transmitting one Control Packet
and the point it starts sending the next.

Please note that here the text is about sent control packets, not received control packets.

But according to the spec, you are free to send the PINGREQ any time you want. So if you don't want to base the keepalive mechanism solely on sent control packets (you want to be sure that the MQTT server is capable to respond all the time), you may also send the PINGREQ if you've not seen any reply from the server recently. Then something like this may fix the issue (diff against v1.0.12):

diff --git a/lib/paho_mqtt/connection_helper.rb b/lib/paho_mqtt/connection_helper.rb
index 00a7de3..99f0107 100644
--- a/lib/paho_mqtt/connection_helper.rb
+++ b/lib/paho_mqtt/connection_helper.rb
@@ -145,7 +145,7 @@ module PahoMqtt

     def check_keep_alive(persistent, last_ping_resp, keep_alive)
       now = Time.now
-      timeout_req = (@sender.last_ping_req + (keep_alive * 0.7).ceil)
+      timeout_req = [@sender.last_ping_req, @handler.last_ping_resp].min + (keep_alive * 0.7).ceil
       if timeout_req <= now && persistent
         PahoMqtt.logger.debug("Checking if server is still alive...") if PahoMqtt.logger?
         @sender.send_pingreq
p-goudet commented 6 years ago

@oldgreen Thank you very for raising this point. Including the request time in the updating process of the response time would prevent to some periodic disconnect. I am reviewing this solution and I would surely add it in the next version.

oldgreen commented 6 years ago

@p-goudet there is one more thing to consider with the patch i've proposed in my previous comment: there is a risk of sending pings in a rapid succession if the server does not respond quickly to the first one or the network/connection is otherwise slow. This is because the client.mqtt_loop is tight. I've tested the patch with a setup where the server (mosquitto) is on the same host and there I see only one ping/resp at a time though ...

EDIT: I wrote:

This is because the client.mqtt_loop is tight.

But I meant: This is because the client.mqtt_loop is run in a tight loop.

arizz96 commented 5 years ago

@p-goudet any news here? Is it fixed now?