Closed egnor closed 1 year ago
@egnor Thanks for reporting it. I will fix it.
@egnor Could you please try this patch and check whether it solves the issue you hit? 0001-mqtt5-Fix-flow-control-will-regard-the-DUP-packet-an.patch
@ESP-YJM Yes, that seems to work, at least in my testing!
Note, I can spin up a PR to implement my recommended fix (see below) if that's helpful.
Repro case
- Freeze the server (`skill -STOP mosquitto`, or whatever)
- Unfreeze the server (`skill -CONT mosquitto`)
This is just one example case, there are others (see analysis below).
While the server is frozen, this kind of message shows up
In the failure state, debug messages look like this
Analysis
As per the MQTT5 spec, the MQTT5 client implements flow control. When a QOS=1 or QOS=2 PUBLISH is sent, `esp_mqtt5_flow_control()` increments `client->send_publish_packet_count`: https://github.com/espressif/esp-mqtt/blob/dffabb067fb3c39f486033d2e47eb4b1416f0c82/mqtt5_client.c#L19
When a PUBACK is received, `esp_mqtt5_parse_puback()` decrements the counter: https://github.com/espressif/esp-mqtt/blob/dffabb067fb3c39f486033d2e47eb4b1416f0c82/mqtt5_client.c#L45
This counter is checked in `esp_mqtt5_client_publish_check()`: https://github.com/espressif/esp-mqtt/blob/dffabb067fb3c39f486033d2e47eb4b1416f0c82/mqtt5_client.c#L294
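For readers unfamiliar with the mechanism, here is a minimal sketch of this kind of in-flight counter. It is not the esp-mqtt code: the `flow_ctl_t` struct and the `flow_ctl_*` function names are hypothetical, and the limit is assumed to be the Receive Maximum value the server announces in CONNACK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the client state; not the esp-mqtt struct. */
typedef struct {
    uint16_t in_flight;        /* QoS>0 PUBLISHes sent but not yet acknowledged */
    uint16_t receive_maximum;  /* assumed: limit announced by the server in CONNACK */
} flow_ctl_t;

/* May another QoS>0 PUBLISH be sent right now? */
static bool flow_ctl_can_publish(const flow_ctl_t *fc)
{
    return fc->in_flight < fc->receive_maximum;
}

/* Call exactly once for each new QoS>0 PUBLISH handed to the transport. */
static void flow_ctl_on_publish_sent(flow_ctl_t *fc)
{
    fc->in_flight++;
}

/* Call exactly once for each PUBACK received; never go below zero. */
static void flow_ctl_on_puback(flow_ctl_t *fc)
{
    if (fc->in_flight > 0) {
        fc->in_flight--;
    }
}
```

With `receive_maximum = 2`, two calls to `flow_ctl_on_publish_sent()` make `flow_ctl_can_publish()` return false until a PUBACK is counted.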
This is a bit fragile because any corner case that fails to increment or decrement appropriately can lead to an out-of-sync, stuck counter that's permanently too high... so it needs to be carefully checked to ensure that ALL transmissions of PUBLISH and receptions of PUBACK are logged correctly. In this case, I see at least three defects:
- `esp_mqtt5_parse_puback()` is only called from `mqtt_process_receive()` if `is_valid_mqtt_msg()` is true for the message-ID in question. However, the counter should be decremented for any PUBACK, even if, say, the original message has been deleted due to expiration
- `esp_mqtt5_client_publish_check()` isn't called for DUP retransmissions
Recommendation?
Split out the increment/decrement logic from other code, since it needs to be called very specifically:
- Rename `esp_mqtt5_flow_control()` to `esp_mqtt5_track_outgoing_publish()`, make sure it's called as close as possible to where a PUBLISH would be sent (I think this is good)
- Split the decrement out of `esp_mqtt5_parse_puback()` into a new `esp_mqtt5_track_incoming_puback()`, call it directly when a PUBACK is received before the message ID validity check
- `esp_mqtt5_client_publish_check()`