SuperHouse / esp-open-rtos

Open source FreeRTOS-based ESP8266 software framework
BSD 3-Clause "New" or "Revised" License

Universal OTA approach -> real production grade code #551

Open HomeACcessoryKid opened 6 years ago

HomeACcessoryKid commented 6 years ago

Edit: Please know that my code has reached 1.0.0 release under the name of https://github.com/HomeACcessoryKid/life-cycle-manager


I believe I have designed an approach to OTA that is universal, robust and production-ready. For now it is a design only (yes, I did build proofs of concept to check that it is not nonsense), so I would like feedback before I start coding all of it (which I will do anyway).

One thing I want to say up front is that I chose this two-step design because I see many repositories that simply cannot combine the SSL-based download scheme with their native functionality and still fit within RAM and flash. So, having said that, read on...

One thing I would like comments on is how you would like it to end up in the main tree of esp-open-rtos: as an extra, in the bootloader tree, or ...?

You can also leave comments at https://github.com/HomeACcessoryKid/ota

Thanks in Advance...

HomeACcessoryKid commented 6 years ago

For those interested in looking at the result so far: it is quite usable but not yet well documented. (https://github.com/HomeACcessoryKid/ota) (https://github.com/HomeACcessoryKid/ota-demo) You should start by flashing the ota repository and then observe what happens; it all bootstraps out of that. It can take a few minutes to complete the first time, so be patient.

Feedback appreciated.

ourairquality commented 6 years ago

Couldn't the firmware just be signed, avoiding the need for SSL? It could also carry a version number that is checked, to stop an attacker from downgrading the firmware.

HomeACcessoryKid commented 6 years ago

Of course anyone could do that even today, but it raises a whole lot of questions about how to make this accessible to the many people who cannot deliver a satisfactory setup, either because they lack the infrastructure to set up a web server, or because they are not top-grade coders, or ...

In my approach, the web hosting - world wide - is delivered by GitHub so if the repository becomes popular, it scales.

Also, key management is a hassle, so in my approach the GitHub certificates protect the payload, and the ota program protects those certificates. When a 'release' is set up in GitHub, GitHub ensures that only the owners of the repository can put files in it.

So, yes, your approach could work (but why didn't anyone do it then?), but it lacks the architecture of a plan that can work for the entire esp-open-rtos population without any investment... Considering that scope, I would appreciate further review and thanks for your initial one already.

Thanks, HacK

PS. Downgrading is not necessarily a bad thing, and I made it possible in my code on purpose.

PPS. The way I am building my suite of production-grade HomeKit accessories based on off-the-shelf hardware means that I need to be concerned about delivering to many devices...

ourairquality commented 6 years ago

Does github allow download via http, or only https now?

Allowing downgrading might be bad when just using signatures, because an attacker might be able to trick the device into downloading an old, publicly signed firmware with a vulnerability that has since been fixed, and then exploit that vulnerability.
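A minimal sketch of such a version check, written in Python for brevity rather than as device code. It assumes dotted numeric version strings and that the version is covered by the firmware signature (otherwise an attacker could simply edit it). The function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version such as '1.2.10' into a comparable tuple (1, 2, 10)."""
    return tuple(int(part) for part in text.split("."))


def accept_update(installed, offered):
    """Accept only firmware that is strictly newer than what is installed."""
    return parse_version(offered) > parse_version(installed)
```

Comparing tuples of integers rather than raw strings matters here: as strings, "1.2.9" sorts after "1.2.10", which would wrongly reject a legitimate upgrade.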

It doesn't seem that much harder to sign releases than to manage SSL certificates, and seems worth considering given the significant reduction in code complexity. It might also support a range of distribution methods including upload from a signed file via the wificfg interface etc.

Personally I've avoided OTA updates to avoid the loss of security, but people seem to want them and they seem to be an essential feature, so I might need to implement this and shall explore the options.

I use signed data for upload. A server can verify the source to ensure the integrity of the system. The key for this can be set via the wificfg interface and stored in the sysparam area. I presume the same could be done for a firmware key, to allow people to switch the firmware source to one based on a different key, or it could be hard-coded in the firmware, but that would lock it to the one source.

HomeACcessoryKid commented 6 years ago

GitHub is https only nowadays.

It is true that more and more devices require OTA in order to receive security patches and feature upgrades. My devices are often built in and connected to mains, and there are many of them. Having to solder for each update would stop one from doing it and would create that many more devices in the world that can be hacked. I'm happy to think along with your approach to the subject.

The subject of scalability and low complexity is important for me and for that reason alone, I will stick with my approach. My users won't have to keep any keys, only add like 5 lines of code to whatever they have and they are in business.

Downgrading happens when the author withdraws a version in GitHub. The device cannot influence that, so short of hacking that GitHub account, there is no attack surface. You have me half convinced that such a scenario could equally be covered by publishing the old code under a higher version number. I'll sleep on that for a few more nights...

Note that in my approach, the author of the code still adds a file containing a SHA384 hash and the length, which is verified before the code is accepted. I think that extending my OTA program to optionally also accept a public key from the user-program author (uploaded at wifi-cfg time) is a good improvement. It maintains the low-complexity option. Consider that one added to the todo list.
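As a rough illustration of the hash-and-length check, here is a minimal Python sketch. The layout of the real hash-and-length file is not specified in this thread, so the expected values are simply passed in as arguments, and `verify_image` is a hypothetical name.

```python
import hashlib
import hmac


def verify_image(image, expected_length, expected_sha384_hex):
    """Return True only if both the length and the SHA-384 digest match."""
    if len(image) != expected_length:
        return False
    actual = hashlib.sha384(image).hexdigest()
    # compare_digest avoids leaking where the first mismatch occurs
    return hmac.compare_digest(actual, expected_sha384_hex.lower())
```

Checking the length first is cheap and rejects truncated downloads before any hashing is done.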

kanflo commented 6 years ago

I like the idea of a stable and secure OTA solution for EOR. One important part of the flow chart is the "toll gate" (TG), where the newly upgraded firmware is considered functional and the slot index is changed. Until this TG is passed, a system reboot while running the new firmware will cause the old firmware to take over, which prevents bricking the device.
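A toll gate of this kind can be sketched as a small two-slot boot record. This is a generic Python illustration, not the mechanism of any particular boot loader; the class and method names are hypothetical.

```python
class BootConfig:
    """Hypothetical two-slot boot record: 'confirmed' is the slot known to work."""

    def __init__(self):
        self.confirmed = 0   # slot that boots by default
        self.trial = None    # slot to try once on the next boot, if any

    def stage_update(self, slot):
        """Called after a new image has been written to 'slot'."""
        self.trial = slot

    def select_boot_slot(self):
        """Called by the boot loader: a trial slot gets exactly one attempt."""
        if self.trial is not None:
            slot, self.trial = self.trial, None
            return slot
        return self.confirmed

    def pass_toll_gate(self, running_slot):
        """Called by the new firmware once it considers itself functional."""
        self.confirmed = running_slot
```

If the new firmware reboots before calling `pass_toll_gate`, the next `select_boot_slot` call falls back to the previously confirmed slot, so the old firmware takes over again.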

HomeACcessoryKid commented 6 years ago

I intend to rename 'ota' to 'life cycle manager'. To answer several (if not all) of your questions, please check the README at https://github.com/HomeACcessoryKid/life-cycle-manager

Thanks for showing interest, HacK

HomeACcessoryKid commented 6 years ago

To add a bit more detail on the toll gate question raised by @kanflo: the subject has been part of the design. Because the update takes two steps, I had to provide a fallback mechanism that achieves the same effect. The first assumption is that the supplier of the new user software has tested that it will at least perform a new firmware update if a certain toll gate criterion is not met. On top of that, the user firmware is checked against a hash, which prevents corrupted firmware from being considered final. As a final step (still to be added to the current code), the idea is to not write the first byte (or first few bytes) of the user firmware to flash until after the hash has been verified. Until then, the boot loader is certain to consider the user code broken and will run the updater again, so the software can be loaded again without mistakes. This works even if the original transmission error occurs in a part of the code that the boot loader cannot check.
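The withheld-first-byte idea can be sketched as follows, with a Python `bytearray` standing in for a flash slot. This is an illustration under stated assumptions, not the actual updater code: `install` is a hypothetical name, the image is assumed to be non-empty, and a valid image is assumed never to start with 0xFF (the value erased flash reads back as).

```python
import hashlib

ERASED = 0xFF  # erased NOR flash reads back as 0xFF


def install(flash, image, expected_sha384_hex):
    """Write 'image' into 'flash', committing the first byte only after verification."""
    flash[0] = ERASED                  # slot is invalid while the update is in progress
    flash[1:len(image)] = image[1:]    # write everything except the first byte
    # Hash what actually landed in flash, substituting the withheld first byte.
    written = bytes(image[:1]) + bytes(flash[1:len(image)])
    if hashlib.sha384(written).hexdigest() != expected_sha384_hex:
        return False                   # first byte stays erased, so the slot is rejected
    flash[0] = image[0]                # final step: make the image bootable
    return True
```

Because the first byte is written last, a power loss or a failed hash check at any earlier point leaves the slot looking invalid to the boot loader, which then runs the updater again.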

I hope that - given the scope of this solution - this will be sufficient. Thanks for your feedback, HacK

HomeACcessoryKid commented 5 years ago

Please know that my code has reached 1.0.0 release under the name of https://github.com/HomeACcessoryKid/life-cycle-manager

It has been deployed over 4000 times already. Check it out!