FreeRTOS / coreMQTT

Client implementation of the MQTT 3.1.1 specification for embedded devices
MIT License
137 stars 97 forks source link

Handle QoS2 messages when the library loses state - store state in NVM #271

Open ademyankov opened 8 months ago

ademyankov commented 8 months ago

It is a very narow corner case but stil it breaks the client and renders it unusable.

  1. coreMQTT client is running on an embedded device, it subscribed to a topic with QoS2.
  2. The issue happens when a message is received and then the device reboots (for unrelated to coreMQTT reasons).
  3. When it boots again and re-connects to the broker it immidiately receives MQTT_PACKET_TYPE_PUBREL but since the device was rebooted the library has no knowleadge of the packet it received before, outgoingPublishRecords is empty and MQTT_ProcessLoop returns MQTTBadResponse.

Q: How would you suggest to handle a situation like this? There are no public functions in the library to get the failed packet id and to get access to outgoingPublishRecords structure that one could simply inject a record with id and MQTT_PACKET_TYPE_PUBCOMP type to let go that record. In the corner case like this I guess it's ok to lose the record since not much you can do if device reboots.

n9wxu commented 8 months ago

I have restated the failure case to try and be crystal clear on what is happening.

sender   receiver
publish ->
         <- pubrec
** REBOOT **
pubrel  ->
           ??????
pubrel ->
pubrel ->
AniruddhaKanhere commented 7 months ago

Hello @ademyankov,

Thank you for taking the time to report this bug to us.

Yes, you are absolutely correct. This bug will manifest itself when the device loses power mid QoS2 transaction.

To fix this, we believe that the state must be stored in persistent memory after a PUBREL is received. Whenever the device boots up, a callback would allow the application to restore the state of the MQTT 'system'. This way, when the broker sends a PUBREL back to us after reboot, the library can correctly handle it.

However, currently the library does not support it. Also, we would like if the storing and restoring the state is done by the application using callbacks from the library as the library is hardware agnostic.

We think that this can be achieved using setter and getter functions which would be called by the library whenever it receives an incoming PUBLISH/PUBACK/PUBREL/PUBREC/PUBCOMP and wants to store the state. This way, the application is free to choose how to implement the persistent storage so that it does not wear out same block of memory and provides fast response etc.

We also think the since this is very specific to the QoS2 publishes, we should separate the processing and handling of these packets such that anyone not using QoS2 should not have to worry about implementing the setter and getter functions.

Does the above sound like a solution to you?

For now we do not have any current plans to implement this as we have some exiting things to we are working on which we want to release - see here. Thus, I am marking this as a feature request and we'd be grateful to receive contributions from the community!

Thanks, Aniruddha

AniruddhaKanhere commented 7 months ago

@ademyankov I also modified the title of this issue just to reflect that this is now a feature request. If you would like, I am happy to revert it back to the way it was.

ademyankov commented 7 months ago

@AniruddhaKanhere yes, thank you, that sounds like a good solution.