Ribbit-Network / ribbit-network-frog-software

The software for the Ribbit Network Frog Sensor
https://www.ribbitnetwork.org/
MIT License

#33: Add block retry to OTA #36

Closed: keenanjohnson closed this 1 year ago

keenanjohnson commented 1 year ago

In #33 I documented a problem where my devices on slower cellular connections fail to receive an update.

The problem seems to be that the device receives an unexpected block from the Golioth server, which raises an unhandled exception and stops the update process.

This change adds some basic retry logic if there is an exception while processing a block, which, in testing, allows my devices to update successfully.
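For readers without the diff at hand, here is a minimal sketch of what block-level retry in an OTA download loop might look like. It is not the code from this PR: `BlockError`, `download_with_retry`, `fetch_block`, `write_block`, and `max_retries` are all hypothetical names chosen for illustration.

```python
class BlockError(Exception):
    """Hypothetical error raised when a received block cannot be processed."""


def download_with_retry(fetch_block, write_block, max_retries=3):
    """Fetch blocks in order, retrying a block when processing it fails.

    fetch_block(num) returns (payload, more) and may raise BlockError.
    write_block(num, payload) stores the payload (for example, to flash).
    Returns the number of blocks written.
    """
    num = 0
    while True:
        attempts = 0
        while True:
            try:
                payload, more = fetch_block(num)
                break
            except BlockError:
                attempts += 1
                if attempts > max_retries:
                    # Give up and let the caller abort the update.
                    raise
        write_block(num, payload)
        if not more:
            return num + 1
        num += 1
```

The retry is bounded so that a persistently failing transfer still aborts the update instead of looping forever.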

keenanjohnson commented 1 year ago

Here is a debug log showing the full update with this new behavior and CoAP debug enabled: screenlog.txt

keenanjohnson commented 1 year ago

@damz I still feel like this improvement could be a good idea, since in the worst case (deep in my apartment) I've seen firmware updates take about 30 minutes. There don't seem to be any big downsides, as far as I can tell. I'm waiting for the Golioth team to confirm the "correctness" of this behavior.

keenanjohnson commented 1 year ago

I discussed this with @damz on Discord: the retry should be done in the CoAP layer, not in the higher-level OTA process as I have done here. It should happen only when the Block2 option in the response is not what we expect, and only once.

I will revise this PR as such.

keenanjohnson commented 1 year ago

Ok, I moved the location of the retry and added some logic to only retry once.
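As a rough illustration of the idea (not the actual code in this PR), a single retry on an unexpected Block2 number could be sketched as follows. The Block2 layout follows RFC 7959 (block number, "more" flag, size exponent); `decode_block2`, `get_block`, and the `request` callable are hypothetical names, and raising `OSError` is an assumption about how the failure would be reported.

```python
def decode_block2(value: bytes):
    """Decode a CoAP Block2 option value (RFC 7959) into (num, more, size)."""
    raw = int.from_bytes(value, "big")
    szx = raw & 0x07          # size exponent, lowest 3 bits
    more = bool(raw & 0x08)   # M bit: more blocks follow
    num = raw >> 4            # block number
    return num, more, 1 << (szx + 4)


def get_block(request, expected_num):
    """Request one block; re-request once if the reply carries another number.

    request(num) is a hypothetical callable that sends the GET for block
    `num` and returns (block2_option_value, payload).
    """
    for _attempt in range(2):  # first try plus exactly one retry
        option, payload = request(expected_num)
        num, more, _size = decode_block2(option)
        if num == expected_num:
            return payload, more
    raise OSError("unexpected Block2 number after one retry")
```

Keeping the check in the transport layer means the OTA code above it only ever sees blocks in the expected order, or a single clear error.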

keenanjohnson commented 1 year ago

Thanks @damz! When I was writing this, I was thinking: there is probably a more elegant way to express this. :)