homieiot / convention

🏡 The Homie Convention: a lightweight MQTT convention for the IoT
https://homieiot.github.io/

QoS of non-retained messages #247

Open Tieske opened 1 year ago

Tieske commented 1 year ago

See discussion here: 4a6ea00 (#200) and documentation here. The argument for not using QoS 2 (Exactly once) is that not all devices have persistent storage available.

But imo that argument is moot because QoS 1 (at least once) delivery also requires persistent storage. The paragraph on retry (4.4) specifically mentions resending, and hence storage requirements:

When a Client reconnects with CleanSession set to 0, both the Client and Server MUST re-send any unacknowledged PUBLISH Packets (where QoS > 0) and PUBREL Packets using their original Packet Identifiers
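The storage requirement in that paragraph can be illustrated with a minimal sketch of a client-side in-flight store. The class `InflightStore`, its method names, and the topic names are hypothetical and not taken from any MQTT library; the QoS 2 PUBREL state is left out for brevity. Whether the store has to survive a power cycle or only the network session is exactly the open question here, so this sketch keeps it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class InflightStore:
    """Hypothetical in-memory store of unacknowledged QoS > 0 publishes."""

    # packet_id -> (topic, payload, qos)
    pending: dict = field(default_factory=dict)

    def on_publish(self, packet_id, topic, payload, qos):
        # QoS 0 is fire-and-forget, so nothing has to be kept for it.
        if qos > 0:
            self.pending[packet_id] = (topic, payload, qos)

    def on_ack(self, packet_id):
        # PUBACK (QoS 1) or PUBCOMP (QoS 2) completes the flow.
        self.pending.pop(packet_id, None)

    def to_resend(self):
        # On reconnect with CleanSession=0 these are re-sent
        # with their original packet identifiers.
        return sorted(self.pending.items())


store = InflightStore()
store.on_publish(1, "homie/dev/node/prop", "on", qos=1)
store.on_publish(2, "homie/dev/node/prop", "off", qos=1)
store.on_publish(3, "homie/dev/node/event", "click", qos=0)
store.on_ack(1)
print(store.to_resend())
```

Only packet 2 remains to be re-sent: packet 1 was acknowledged and the QoS 0 publish was never stored. The point is that QoS 1 already forces the sender to hold on to messages, just as QoS 2 does.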

The second argument (4a6ea00 (#200)) is that the $state message might arrive before the value of a property.

If the property is retained; then we do not care, because all is QoS 1 and order is preserved.

If the property is non-retained; then it's an event, in which case the property doesn't have a "value" or a "state". It is merely a notification. So the controller not having received the value before the device reaches the ready state is a normal operating condition.

This is essentially a race condition only if, at the very moment of switching to ready, an actual event happens at the device end.

On controller side if;

So combining those; my impression is that QoS 2 (Exactly once) is just fine for non-retained properties.

The above is based on my knowledge so far and reading up on the actual QoS flows.

Or did I miss something? Can you verify @Thalhammer ?

schaze commented 1 year ago

For the first argument:

But imo that argument is moot because QoS 1 (at least once) delivery also requires persistent storage.

Not sure if it needs to be persistent storage but at least in memory for the duration of the network connection (see chapter 4.1 Storing state - http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718105)

I think QoS 2 should be used everywhere, otherwise there could be scenarios where you might get multiple partial repetitions of messages:

See chapter 4.6 (http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718105):

"The rules listed above ensure that when a stream of messages is published and subscribed to with QoS 1, the final copy of each message received by the subscribers will be in the order that they were originally published in, but the possibility of message duplication could result in a re-send of an earlier message being received after one of its successor messages. For example a publisher might send messages in the order 1,2,3,4 and the subscriber might receive them in the order 1,2,3,2,3,4."

This would be a nightmare to handle, e.g. lights on, off, on, off, on.
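To make the light example concrete, here is a small simulation of the delivery order 1,2,3,2,3,4 from the quoted passage. The function `apply_stream` and the on/off payloads are made up for illustration; the sequence numbers exist only to build the example and are not something the subscriber could rely on for deduplication.

```python
def apply_stream(messages):
    """Return every state a naive subscriber passes through."""
    states = []
    for _seq, payload in messages:
        # A naive subscriber applies each message as it arrives.
        states.append(payload)
    return states


# Four messages published in order.
published = [(1, "on"), (2, "off"), (3, "on"), (4, "off")]

# Delivery order 1,2,3,2,3,4, as in the spec's QoS 1 example.
received = [published[i - 1] for i in (1, 2, 3, 2, 3, 4)]

print(apply_stream(published))
print(apply_stream(received))
```

The final state is the same in both runs, but the subscriber in the second run toggles the light six times instead of four.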

2nd argument:

If the property is retained; then we do not care, because all is QoS 1 and order is preserved.

The order is only guaranteed/required for messages on a single topic, not for all messages overall. (See chapter 4.6 Message ordering on http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html#_Toc398718105). This means there might be broker implementations that do not deliver messages on different topics in the order they were sent. In such cases a controller might miss the first emissions that follow the switch to ready if they arrive before the $state message does. I also believe that this is much more relevant for structural changes in devices (e.g. adding a node dynamically after a config change in the device, requiring a "rediscovery").
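The missed-emission scenario can be sketched with a hypothetical controller that ignores property messages until it has seen `$state` become `ready`. The class `NaiveController` and the topic names are assumptions for illustration, not part of the convention or of any library.

```python
class NaiveController:
    """Hypothetical controller that drops property messages until ready."""

    def __init__(self):
        self.ready = False
        self.seen = []

    def on_message(self, topic, payload):
        if topic.endswith("/$state"):
            self.ready = payload == "ready"
        elif self.ready:
            self.seen.append((topic, payload))
        # Property messages that arrive before "ready" are silently lost.


# What the device sends: first the state change, then an event.
sent = [("homie/dev/$state", "ready"), ("homie/dev/button/press", "true")]

in_order = NaiveController()
for message in sent:
    in_order.on_message(*message)

# The broker delivers the two topics in the opposite order.
swapped = NaiveController()
for message in reversed(sent):
    swapped.on_message(*message)

print(len(in_order.seen), len(swapped.seen))
```

With in-order delivery the event is seen; with the two topics swapped it is dropped, even though every message was delivered.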

Tieske commented 1 year ago

Device: non-retained property values MUST be published with QoS=0, since they are events and therefore time-bound. QoS=0 will NOT queue messages for disconnected clients. Using QoS=1 or QoS=2 for those would cause the broker to queue the messages and deliver them at a later point in time (to disconnected clients once they reconnect), which is not the right thing to do. This explains this sentence in the spec:

to ensure that events don't arrive late or multiple times.

Controller: non-retained commands should also not be queued. If the command is "brew-coffee", you need it now. You do not want the machine to restore its connection 45 mins from now, and then suddenly start acting on an old command.
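The queueing behaviour described in this comment can be modelled with a toy broker that serves a single subscriber with a persistent session. `ToyBroker` is a hypothetical model, not a real broker API, and it assumes QoS 0 messages are dropped while the subscriber is offline; as far as I read MQTT 3.1.1, storing them is optional for the broker, so real brokers may differ.

```python
class ToyBroker:
    """Hypothetical single-subscriber broker with a persistent session."""

    def __init__(self):
        self.connected = True
        self.queue = []
        self.delivered = []

    def publish(self, payload, qos):
        if self.connected:
            self.delivered.append(payload)
        elif qos > 0:
            # QoS 1 and 2 are held until the subscriber returns.
            self.queue.append(payload)
        # QoS 0 while offline: dropped (assumption of this model).

    def reconnect(self):
        self.connected = True
        # Everything that was queued is delivered late.
        self.delivered.extend(self.queue)
        self.queue.clear()


broker = ToyBroker()
broker.connected = False                 # coffee machine loses its connection
broker.publish("brew-coffee", qos=1)     # queued
broker.publish("brew-tea", qos=0)        # dropped
broker.reconnect()                       # e.g. 45 minutes later
print(broker.delivered)
```

The QoS 1 command is acted on after the reconnect, which is the stale "brew-coffee" case; the QoS 0 command never arrives.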

In both cases the delivery of the event/command is less reliable, but only when a transmission gets interrupted, which is unlikely. The only way around this that I can see is to change the message format into a complex structure where a "timeout" or "validity period" can be added along with the actual payload, such that the receiver can judge whether the message is still valid.
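A rough sketch of what such a "validity period" structure could look like, assuming a JSON envelope. The field names `payload` and `expires` and the helpers `wrap` and `unwrap` are invented for this example and are not part of the Homie convention; the approach also assumes sender and receiver clocks are reasonably in sync.

```python
import json
import time


def wrap(payload, ttl_seconds, now=None):
    """Sender side: attach an absolute expiry time to the payload."""
    now = time.time() if now is None else now
    return json.dumps({"payload": payload, "expires": now + ttl_seconds})


def unwrap(message, now=None):
    """Receiver side: return the payload, or None if it has expired."""
    now = time.time() if now is None else now
    envelope = json.loads(message)
    if now > envelope["expires"]:
        return None
    return envelope["payload"]


# Fixed timestamps keep the example deterministic.
msg = wrap("brew-coffee", ttl_seconds=60, now=1000.0)
print(unwrap(msg, now=1030.0))            # within the validity period
print(unwrap(msg, now=1000.0 + 45 * 60))  # delivered 45 minutes late
```

The first call returns the command and the second returns `None`, so a receiver could safely use QoS 1 or 2 and still ignore a command that was queued for too long. The cost is that payloads are no longer plain values.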

As for the order of receiving: with the $state and property topics being different, order cannot be guaranteed, regardless of whether the QoS is the same or not. So I think;

Tieske commented 1 year ago

I've given this some more thought. My conclusion;

I have however also come to realize that the non-retained topics are weird in general.