OpenZWave / qt-openzwave

QT5 Wrapper for OpenZWave
GNU Lesser General Public License v3.0

MQTT client timeout #140

Open eriklindqvist opened 4 years ago

eriklindqvist commented 4 years ago

I am running docker images ozwdaemon:latest and eclipse-mosquitto:latest as of yesterday on a Raspberry Pi together with Home Assistant 0.113.1. I have about 90 devices and over 500 entities.

As long as I don't do much, it runs fairly stably. However, using OZWAdmin-0.1.74, if I push the Z-Wave network too hard, for example by healing and/or refreshing too many nodes at the same time and clogging the network with messages, the ozwdaemon MQTT client doesn't seem to be able to keep up. All entities in Home Assistant become unavailable, and I see the following in the Mosquitto logs:

Client qt-openzwave-1 has exceeded timeout, disconnecting.

So I restart the ozwdaemon docker instance, and I see the following in the Mosquitto logs:

  New connection from 172.18.0.2 on port 1883.
  Socket error on client <unknown>, disconnecting.
  New connection from 172.18.0.2 on port 1883.
  New client connected from 172.18.0.2 as qt-openzwave-1 (p2, c1, k60).

Then it works for just under two minutes before it gets disconnected again:

  Client qt-openzwave-1 has exceeded timeout, disconnecting.

From what I understand (please correct me if I'm wrong), the "k60" part in the Mosquitto logs means "keepalive = 60", i.e. when connecting, the MQTT client tells the broker that it will stay in touch with a ping message at least once every minute, and if that doesn't happen, the client will be disconnected.
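To make the keepalive arithmetic concrete, here is a minimal Python sketch. It assumes the MQTT 3.1.1 rule (section 3.1.2.10) that a broker drops a client after one and a half times the keep alive interval without receiving any control packet from it (a PUBLISH counts, not only a PINGREQ). The helper names `parse_connect_line` and `broker_timeout_seconds` are made up for illustration, and the regex is only fitted to the connect line quoted above, not to every Mosquitto log format.

```python
import re

# Fitted to the "New client connected ... as <id> (pX, cY, kZ)." line quoted above.
# Only the k (keepalive) field is interpreted here; p and c are captured but ignored.
CONNECT_RE = re.compile(r"as (?P<client>\S+) \(p(?P<p>\d+), c(?P<c>\d+), k(?P<k>\d+)\)")


def parse_connect_line(line):
    """Return the client id and keepalive (seconds) from a connect log line, or None."""
    match = CONNECT_RE.search(line)
    if match is None:
        return None
    return {"client": match.group("client"), "keepalive": int(match.group("k"))}


def broker_timeout_seconds(keepalive):
    """Grace period per MQTT 3.1.1: 1.5 x keep alive; a keep alive of 0 disables the check."""
    if keepalive == 0:
        return float("inf")
    return 1.5 * keepalive


if __name__ == "__main__":
    line = "New client connected from 172.18.0.2 as qt-openzwave-1 (p2, c1, k60)."
    info = parse_connect_line(line)
    print(info["client"], broker_timeout_seconds(info["keepalive"]))
```

Under that assumption, k60 gives the broker a 90-second window with no packets from the client before it disconnects it, which is roughly in line with the "just under two minutes" observed after a restart.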

I increased logging in mosquitto (by setting "log_type all" in mosquitto.conf) and also started ozwdaemon with -e QT_LOGGING_RULES="*.debug=false;ozw.mqtt.publisher.debug=true"

and I can see in the mosquitto logs

  ...
  Received PUBLISH from qt-openzwave-1 (d0, q0, r1, m0, 'OpenZWave/1/node/1/instance/1/commandclass/32/value/562949970722835/', ... (478 bytes))
  Sending PUBLISH to auto-3E2A0E60-FB05-A814-00F1-5DE1DEFD51A0 (d0, q0, r0, m0, 'OpenZWave/1/node/1/instance/1/commandclass/32/value/562949970722835/', ... (478 bytes))
  Sending PUBLISH to qt-openzwave-1 (d0, q0, r0, m0, 'OpenZWave/1/node/1/instance/1/commandclass/32/value/562949970722835/', ... (478 bytes))
  Received PINGREQ from auto-3E2A0E60-FB05-A814-00F1-5DE1DEFD51A0
  Sending PINGRESP to auto-3E2A0E60-FB05-A814-00F1-5DE1DEFD51A0
  Client qt-openzwave-1 has exceeded timeout, disconnecting.

while ozwdaemon continues to print out hundreds of rows such as

  ...
  [ozw.mqtt.publisher] [debug]: Publishing Event valueAdded: 562952802893846
  ...

for several minutes until it realizes that the connection is gone:

  [ozw.mqtt.publisher] [debug]: Publishing Event valueRefreshed: 562950595969074
  [ozw.mqtt.publisher] [debug]: Publishing Event valueRefreshed: 72057594680475696
  [ozw.mqtt.publisher] [debug]: MQTT State Change "Disconnected" 
  [ozw.mqtt.publisher] [warning]: Exiting on Failure
  [ozw.mqtt.publisher] [warning]: MQTT Client Disconnnected
  [ozw.mqtt.publisher] [warning]: MQTT Client Error "Transport Invalid"
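For anyone trying to reproduce this, a rough way to see how much traffic each client generates before the timeout is to tally the PUBLISH lines in the Mosquitto log. The Python sketch below is only fitted to the log lines quoted above (with "log_type all" enabled); the function name `summarize` and the regexes are illustrative assumptions, not part of Mosquitto or ozwdaemon.

```python
import re
from collections import Counter

# Fitted to the PUBLISH lines quoted above: direction, client id, then (dN, qN, rN, mN, 'topic', ...
PUBLISH_RE = re.compile(
    r"(?P<direction>Received PUBLISH from|Sending PUBLISH to) (?P<client>\S+) "
    r"\(d\d+, q\d+, r\d+, m\d+, '(?P<topic>[^']*)'"
)
TIMEOUT_RE = re.compile(r"Client (?P<client>\S+) has exceeded timeout, disconnecting\.")


def summarize(lines):
    """Count PUBLISH packets per client in each direction and list clients that timed out."""
    received = Counter()  # packets the broker received from each client
    sent = Counter()      # packets the broker sent to each client
    timed_out = []
    for line in lines:
        match = PUBLISH_RE.search(line)
        if match:
            target = received if match.group("direction").startswith("Received") else sent
            target[match.group("client")] += 1
            continue
        match = TIMEOUT_RE.search(line)
        if match:
            timed_out.append(match.group("client"))
    return received, sent, timed_out


if __name__ == "__main__":
    topic = "OpenZWave/1/node/1/instance/1/commandclass/32/value/562949970722835/"
    sample = [
        f"Received PUBLISH from qt-openzwave-1 (d0, q0, r1, m0, '{topic}', ... (478 bytes))",
        f"Sending PUBLISH to qt-openzwave-1 (d0, q0, r0, m0, '{topic}', ... (478 bytes))",
        "Client qt-openzwave-1 has exceeded timeout, disconnecting.",
    ]
    print(summarize(sample))
```

In the excerpt above, the broker also sends each PUBLISH back to qt-openzwave-1, so a tally like this would show whether that return traffic grows along with the outgoing messages during a heal or refresh.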

The only way I can get it to stay up is to remove/rename the ozwcache_0xf7b52c8f.xml file and restart, but it doesn't feel like a good solution.

Any ideas on what's going on?