OpenZWave / qt-openzwave

QT5 Wrapper for OpenZWave
GNU Lesser General Public License v3.0

Container shuts down when MQTT broker is unavailable #20

Open DamianFlynn opened 4 years ago

DamianFlynn commented 4 years ago

I noticed that when I reboot the node running the MQTT broker, the ozwdaemon container shuts down.

This is what I see; no crash dump is generated in this scenario:

[20200117 23:35:50.847 UTC] [qt.mqtt.connection.verbose] [debug]: Received PUBLISH
[20200117 23:35:50.847 UTC] [qt.mqtt.connection.verbose] [debug]: Finalize PUBLISH: topic: QMqttTopicName("OpenZWave/1/node/32/statistics/")  payloadLength: 787
[20200117 23:35:51.529 UTC] [ozw.library] [debug]: Detail - Node: 25   Received: 0x01, 0x0c, 0x00, 0x04, 0x00, 0x19, 0x06, 0x31, 0x05, 0x01, 0x42, 0x06, 0x77, 0xee
[20200117 23:35:51.530 UTC] [ozw.library] [debug]: Detail - Node: 0
[20200117 23:35:51.530 UTC] [ozw.library] [info]: Info - Node: 25 Received SensorMultiLevel report from node 25, instance 1, Air Temperature: value=16.55C
[20200117 23:35:51.531 UTC] [ozw.library] [debug]: Detail - Node: 25 Refreshed Value: old value=16.47, new value=16.55, type=decimal
[20200117 23:35:51.532 UTC] [ozw.library] [debug]: Detail - Node: 25 Changes to this value are not verified
[20200117 23:35:51.532 UTC] [ozw.library] [debug]: Detail - Node: 25 Notification: ValueChanged
[20200117 23:35:51.541 UTC] [ozw.notifications] [debug]: Notification pvt_valueChanged:  281475401138194
[20200117 23:35:51.545 UTC] [qt.remoteobjects.models] [debug]: void QAbstractItemModelSourceAdapter::sourceDataChanged(const QModelIndex&, const QModelIndex&, const QVector<int>&) const start= (ModelIndex[row=580, column=0]) end= (ModelIndex[row=580, column=13]) neededRoles= QVector(0, 2, 3)
[20200117 23:35:51.545 UTC] [ozw.mqtt.publisher] [debug]: Publishing Event valueChanged: 281475401138194
[20200117 23:35:51.548 UTC] [ozw.mqtt.qt2js] [debug]: Field is Unchanged:  "Event"  Value:  "valueChanged"
[20200117 23:35:51.551 UTC] [qt.mqtt.connection] [debug]: qint32 QMqttConnection::sendControlPublish(const QMqttTopicName&, const QByteArray&, quint8, bool, const QMqttPublishProperties&) QMqttTopicName("OpenZWave/1/node/25/instance/1/commandclass/49/value/281475401138194/")  Size: 525  bytes. QoS: 0  Retain: true
[20200117 23:35:51.552 UTC] [qt.mqtt.connection.verbose] [debug]: bool QMqttConnection::writePacketToTransport(const QMqttControlPacket&)  DataSize: 599
[20200117 23:35:51.554 UTC] [qt.mqtt.connection.verbose] [debug]: void QMqttConnection::transportReadReady()
[20200117 23:35:51.554 UTC] [qt.mqtt.connection.verbose] [debug]: Received PUBLISH
[20200117 23:35:51.555 UTC] [qt.mqtt.connection.verbose] [debug]: Finalize PUBLISH: topic: QMqttTopicName("OpenZWave/1/node/25/instance/1/commandclass/49/value/281475401138194/")  payloadLength: 525
[20200117 23:36:06.122 UTC] [qt.mqtt.connection] [debug]: void QMqttConnection::transportError(QAbstractSocket::SocketError) QAbstractSocket::RemoteHostClosedError
[20200117 23:36:06.122 UTC] [ozw.mqtt.publisher] [debug]: MQTT State Change 0
[20200117 23:36:06.123 UTC] [ozw.mqtt.publisher] [warning]: Exiting on Failure

Fishwaldo commented 4 years ago

This is by design. A lot of state information is stored in the broker via retained messages, and the only way to “restore” some of that information without overly complex reconnect-handling logic is to restart ozwdaemon.

You should probably set a Docker restart policy so that the ozwdaemon container is restarted automatically when this happens.

blaster452 commented 4 years ago

Is it possible this happens because a username and password are required? Is it possible to add these to the initial build command?

floriskruisselbrink commented 4 years ago

While searching for a solution to my problem, I stumbled upon this bug report.

This has nothing to do with username/password. My MQTT broker has no username/password, but I still need to restart qt-openzwave after the connection to MQTT has been lost.

Unfortunately, automatically restarting ozwdaemon is not really possible in my environment as I have the ozwdaemon container running on a completely different host than most of my other stuff (mqtt, hass, etc.).

I'm not really sure what the current state of this issue is. Is it not fixable, or (since the labels 'enhancement' and 'help wanted' were added) not fixed yet, but something that might get done if a solution turns up?

kpine commented 4 years ago

"Unfortunately, automatically restarting ozwdaemon is not really possible in my environment as I have the ozwdaemon container running on a completely different host than most of my other stuff (mqtt, hass, etc.)."

@vloris Why does it matter where the container is running? A Docker restart policy of on-failure or always will keep restarting the container until it can reach the MQTT server again. Docker does require the container to have been up for 10 seconds before the restart policy applies. That held in my testing, but it could vary case by case. I kept the MQTT server down for several minutes, and after starting it again ozwd was eventually able to reconnect.

If not Docker, systemd and other supervisors can restart containers on failure, with configurable restart timers, etc.

It is definitely undesirable for the entire Z-Wave network to be restarted in this situation, though, so perhaps that is where the enhancement tag comes into play.
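
For anyone unsure what a restart-on-failure supervisor actually does, here is a minimal Python sketch of the idea. It is not Docker's or systemd's implementation: the function name `supervise`, its parameters, and the doubling delay are assumptions chosen for illustration, and the command you pass in would be your own ozwdaemon invocation.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=5, initial_delay=0.1, max_delay=60.0):
    """Run cmd and restart it after every non-zero exit.

    The wait between restarts doubles each time, capped at max_delay.
    Returns the number of restarts performed before cmd exited cleanly
    or max_restarts was reached.
    """
    delay = initial_delay
    restarts = 0
    while True:
        # Blocks until the child process exits and yields its exit code.
        exit_code = subprocess.call(cmd)
        if exit_code == 0 or restarts >= max_restarts:
            return restarts
        time.sleep(delay)
        delay = min(delay * 2, max_delay)
        restarts += 1


# Hypothetical usage: a child that always fails is restarted twice, then given up on.
# supervise([sys.executable, "-c", "import sys; sys.exit(1)"], max_restarts=2)
```

With a loop like this the daemon keeps exiting while the broker is down and comes back on its own once the broker is reachable, at the cost of restarting the Z-Wave network each time.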

kpine commented 4 years ago

Just noticed you were asking about this in the HA forums. In response there was this interesting suggestion about bridging an MQTT broker that is local to the ozwd host to your main MQTT broker. As long as the local broker is running (which should be simpler to achieve), ozwd won't restart. In fact, the HA supervised add-on already implements this method, although there all the brokers are on the same host, just in different containers.

floriskruisselbrink commented 4 years ago

I think I didn't write that clearly enough. What I was trying to say is that I don’t want ozwdaemon to keep restarting over and over until MQTT is finally available again. The ‘automation’ I was referring to was a way to have ozwdaemon restart only after the MQTT broker is online again.
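
One way to get that "start only once the broker is back" behaviour might be a small wrapper that polls the broker's TCP port before launching the daemon. This is only a sketch under assumptions: `wait_for_broker` and its parameters are names made up here, port 1883 is the usual unencrypted MQTT port but may not match your setup, and an open port only shows that something is listening, not that the broker is ready to accept MQTT sessions.

```python
import socket
import time


def wait_for_broker(host, port=1883, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True as soon as the port accepts a connection, or False
    if `timeout` seconds elapse without a successful connection.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # The connection is closed again immediately; this is only a probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Hypothetical usage before starting the daemon:
# if wait_for_broker("mqtt.example.lan"):
#     ...start ozwdaemon here...
```

Run as the first step of a container entrypoint, this would hold off starting ozwdaemon while the broker is unreachable instead of letting it start and exit repeatedly.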

The bridging suggestion on the HA forum sounds like a really good idea; I will try that soon.