thin-edge / thin-edge.io

The open edge framework for lightweight IoT devices
https://thin-edge.io
Apache License 2.0

mosquitto doesn't restart properly after a device restart when mTLS is enforced for MQTT #2945

Closed didier-wenzek closed 3 months ago

didier-wenzek commented 3 months ago

Describe the bug

Restarting the device with a command sent over MQTT to the agent leaves the device in a bad state when mTLS has been configured for MQTT.

Restarting mosquitto manually works around the problem.

To Reproduce

  1. Configure the mosquitto instance to use certificates

    For example, if everything is set up correctly, the output of tedge config list should look something like this (the exact paths may differ depending on your setup):

    # $ tedge config list | grep mqtt.client
    mqtt.client.host=localhost
    mqtt.client.port=8883
    mqtt.client.auth.ca_file=/etc/mosquitto/ca_certificates/ca.crt
    mqtt.client.auth.cert_file=/setup/client.crt
    mqtt.client.auth.key_file=/setup/client.key
  2. Trigger a restart

    $ tedge mqtt pub --retain te/device/main///cmd/restart/test-1 '{"status":"init"}'
  3. Wait for the device to restart
  4. Unexpectedly, tedge can no longer reconnect over MQTT

    $ tedge mqtt sub te/device/main///cmd/+/+
    ERROR: I/O: Connection refused (os error 111)

Expected behavior

On reboot,

Additional context

I found no related mosquitto issue, only a related question:

reubenmiller commented 3 months ago

@didier-wenzek Are you sure this is not actually a problem due to the mosquitto broker trying to bind too early to a network which isn't ready yet? e.g. https://github.com/eclipse/mosquitto/issues/2878

There are already suggestions to change the systemd target to network-online.target instead of network.target.

didier-wenzek commented 3 months ago

Thanks! Indeed, this has been fixed by changing the mosquitto systemd target to network-online.target instead of network.target.
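
For reference, one way to apply this kind of change without editing the packaged unit file is a systemd drop-in. The sketch below is only an illustration under stated assumptions: the helper name `write_network_online_dropin`, the drop-in file name `network-online.conf`, and the unit name `mosquitto.service` are not taken from this issue and may differ on your system. Note that a drop-in adds to the unit's existing ordering rather than replacing `network.target`, and that `network-online.target` only delays startup if a wait-online service is enabled for your network manager.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): write a systemd drop-in that
# makes a unit wait for network-online.target before starting.
write_network_online_dropin() {
    dropin_dir="$1"   # e.g. /etc/systemd/system/mosquitto.service.d (assumed unit name)
    mkdir -p "$dropin_dir" || return 1
    # Wants= pulls the target in; After= orders the unit behind it.
    cat > "$dropin_dir/network-online.conf" <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
}

# Example usage on a device (requires root; not executed by this sketch):
#   write_network_online_dropin /etc/systemd/system/mosquitto.service.d
#   systemctl daemon-reload
#   systemctl restart mosquitto
```

After a reboot, the `tedge mqtt sub` command from the reproduction steps can be used to check that the broker accepts connections again.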