HomeACcessoryKid / life-cycle-manager

Initial install, WiFi settings and over the air firmware upgrades for any esp-open-rtos repository on GitHub
Apache License 2.0

Life-Cycle-Manager (LCM)

(c) 2018-2024 HomeAccessoryKid

Update December 2023

It looks like GitHub has put a 10s timeout on their TLS stack.
Verifying the server certificate takes us >15s, so the server closes the connection.
Version 2.2.6 tries to fix this by overclocking during this phase.

Update season 16 April 2022

After 14 months, version 2.1.2 will get upgraded to version 2.2.5, so be aware that your own app update will take extra long.
It is advisable to also update devices that do not have a user app update.
Not that 2.1.2 is broken or in danger, but 2.2.5 is more future proof.

Version

Changelog
With version 2.0.0 LCM has reached a new stage with its own adaptation of rboot - rboot4lcm - which counts power cycles. These are used to check for updates, reset wifi, clear or set LCM_beta, or factory reset. It also gives access to the emergency mode.
It is possible to set a led_pin value for visual feedback. With the latest-pre-release concept introduced in version 1.0.0, users (and LCM itself) can test new software before exposing it to production devices. See the 'How to use it' section.

https://github.com/HomeACcessoryKid/ota-demo has been upgraded to offer system-parameter editing features, which allow for flexible testing of the LCM code.

Scope

This is a program that allows any simple repository based on esp-open-rtos on esp8266 to solve its life cycle tasks.

New features version 2

This new LCM code is able to load/update the bootloader from GitHub.
The new bootloader is able to count the number of short power cycles (<1.5s).
From the second cycle onward, the cycles must be shorter than 4 seconds. Also, an LED is lit if defined.
The bootloader conveys the count to the loaded code using the rtc.temp_rom value.
User code is allowed the values 1-4.

If count > 4, the bootloader launches LCM otamain.bin in rom[1].

For these values the behaviour is controlled through the sysparam string ota_count_step.
The default value of 3 reduces the chance that a user who miscounts, or playful children, will trigger something other than intended.
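The bootloader rules above can be sketched in a few lines of Python. This is only an illustration of the thresholds named in the text, not the rboot4lcm implementation: the function names are hypothetical, and treating the input as a list of power-on durations is an assumption.

```python
# Thresholds taken from the text; all names here are hypothetical.
FIRST_CYCLE_MAX_S = 1.5   # the first power cycle must be shorter than this
LATER_CYCLE_MAX_S = 4.0   # from the second cycle on, cycles must be shorter than this
MAX_USER_COUNT = 4        # counts 1-4 are left to user code


def count_short_cycles(on_times_s):
    """Count consecutive short power cycles from a list of on-times (seconds)."""
    count = 0
    for index, on_time in enumerate(on_times_s):
        limit = FIRST_CYCLE_MAX_S if index == 0 else LATER_CYCLE_MAX_S
        if on_time >= limit:
            break  # a long cycle ends the sequence
        count += 1
    return count


def select_rom(count):
    """Return the rom slot to boot: 1 (LCM otamain.bin) if count > 4, else 0."""
    return 1 if count > MAX_USER_COUNT else 0
```

For example, five quick cycles such as `[1.0, 3.0, 3.0, 3.0, 3.0]` give a count of 5, which selects rom 1.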

Note that with LCM_beta mode and wifi erased you can set any emergency fallback server to collect a new signed version of otaboot.bin. This is to prevent a lockout as witnessed when GitHub changed their webserver in 2020.
Tested with the macOS built-in Apache server.
By monitoring the output with the terminal command nc -kulnw0 45678 you have 10 seconds to see which action was chosen before it executes.
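If nc is not available, a small UDP listener does roughly the same job. The sketch below assumes the device sends its log lines as UDP datagrams to port 45678, which is what the nc flags (-u, -l) suggest; the function names are hypothetical.

```python
import socket


def open_listener(port=45678):
    """Bind a UDP socket on all interfaces, similar to `nc -ul <port>`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    return sock


def read_message(sock, timeout_s=None):
    """Wait for one datagram and return it as text, or None on timeout."""
    sock.settimeout(timeout_s)
    try:
        data, _sender = sock.recvfrom(2048)
    except socket.timeout:
        return None
    return data.decode("utf-8", errors="replace")


def monitor(port=45678):
    """Print datagrams until interrupted (the equivalent of the -k flag)."""
    listener = open_listener(port)
    while True:
        print(read_message(listener), end="", flush=True)
```

Calling `monitor()` from a terminal prints messages as they arrive until you press Ctrl-C.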

If ota_count_step=="3" (default)

If ota_count_step=="2"

If ota_count_step=="1"

Missing or other ota_count_step values will be interpreted as 3
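The fallback rule for ota_count_step can be written down as follows. This is a minimal sketch of the rule as stated (only the strings "1", "2" and "3" are accepted, anything else means 3); the function name is hypothetical and the real LCM code may parse the sysparam differently.

```python
DEFAULT_STEP = 3
VALID_STEPS = {"1": 1, "2": 2, "3": 3}


def parse_count_step(raw):
    """Return the step as an int; a missing or unknown value falls back to 3."""
    if raw is None:
        return DEFAULT_STEP
    return VALID_STEPS.get(raw, DEFAULT_STEP)
```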

Version 2.2.5 adds two new features. User apps that need configuration data specific to each device instance can now get it: one can set the ota_string parameter, which the user app can parse to set e.g. MQTT server, user and password, or whatever else you fancy. Since it is up to the user app to parse it, you can use whatever format works for you within the cgi transfer of parameters. Also, using the 'erase wifi' mode, new settings can be set again when needed.
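Since the format of ota_string is left to the user app, the following is purely an illustration. It assumes a hypothetical `key=value;key=value` layout and shows the parsing logic in Python; an actual user app would implement the same idea in its own firmware code.

```python
def parse_ota_string(ota_string):
    """Split a 'key=value;key=value' string into a dict (hypothetical format)."""
    settings = {}
    for field in ota_string.split(";"):
        if not field:
            continue  # tolerate empty input or a trailing ';'
        key, separator, value = field.partition("=")
        if separator:  # ignore fields without '='
            settings[key.strip()] = value.strip()
    return settings
```

With this layout, `parse_ota_string("host=10.0.0.2;user=me")` yields a dict with the keys `host` and `user`.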

The user app can also set the sysparam ota_count itself to activate 'erase wifi' etc.

Non-typical solution

This solution is dedicated to a particular set of repositories and devices, a case I consider worth solving.

If none of the above were an issue, the typical solution would be to

In my opinion, for the target group, the typical solution doesn't work and so LCM will handle it. It also turns out that there are no out-of-the-box solutions for the typical case out there, so if you are fine with the limitations of LCM, just enjoy it... or roll your own.
(PS. the balance is much less black and white but you get the gist)
*) This feature is not yet implemented (it is quite hard), so 'cross your fingers'.

Benefits

Can I trust you?

If you feel you need 100% control, you can fork this repository, create your own private key and do the life cycle of the LCM yourself. But since the code of LCM is public and can be audited, it is unlikely that anything malicious will happen. It is up to you. And if you have ideas on how to improve on this subject, please share them in issue #1, which is open for this reason.

How to use it

User code preparation part

Now test your new code by using a device that you enroll to the pre-release versions (a checkbox in the wifi-setup page).

User device setup part

Creating a user app DigitalSignature

From the directory where make is run, execute:

openssl sha384 -binary -out firmware/main.bin.sig firmware/main.bin
printf "%08x" `cat firmware/main.bin | wc -c`| xxd -r -p >>firmware/main.bin.sig
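As written, the two commands above produce a 52-byte file: the 48-byte SHA-384 digest of main.bin followed by the file length as a 4-byte big-endian integer. A Python sketch of the same steps, for those without openssl or xxd at hand (the function names are hypothetical):

```python
import hashlib
import struct


def build_sig(firmware: bytes) -> bytes:
    """Return the SHA-384 digest followed by the length as 4 big-endian bytes."""
    digest = hashlib.sha384(firmware).digest()   # 48 bytes, like `openssl sha384 -binary`
    length = struct.pack(">I", len(firmware))    # like `printf "%08x" ... | xxd -r -p`
    return digest + length


def write_sig(bin_path="firmware/main.bin", sig_path="firmware/main.bin.sig"):
    """Read the firmware image and write the matching .sig file next to it."""
    with open(bin_path, "rb") as source:
        firmware = source.read()
    with open(sig_path, "wb") as target:
        target.write(build_sig(firmware))
```

Running `write_sig()` from the directory where make is run should give the same file as the shell commands.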

How it works

This design serves as a guide for reading through the code base. The actual entry point of the process is the self-updater, which is called ota-boot and is flashed by serial cable.

Concepts

User app(0)
v.X triggers

The usercode Main app is running in bootslot 0 at version x. It can trigger a switch to bootslot 1.
Also the tuned bootloader rBoot4LCM can switch to bootslot 1.

powercycles select:

Based on the number of power cycles, we check for new versions, reset the wifi parameters or, with lcmbeta, allow the setting of an emergency server. Choosing factory reset erases all the user code and parameters so no sensitive data stays behind. After this the normal update cycle starts, unless an emergency server is defined.

use http://not.github.com/somewhere/

After resetting wifi and selecting lcmbeta mode (12 power cycles), the user can specify another base location from which the files otaboot.bin.sig and otaboot.bin will be collected. This enters emergency mode. If the signature is valid against the public key of LCM, the downloaded otaboot.bin replaces the content of bootslot 0 and continues to update otamain etc.
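To make the "base location" idea concrete, the sketch below derives the two download URLs from a base such as http://not.github.com/somewhere/. Joining the file names directly onto the base is an assumption about how LCM builds the URLs, and the function name is hypothetical.

```python
from urllib.parse import urljoin

# The two files named in the text, signature first.
EMERGENCY_FILES = ("otaboot.bin.sig", "otaboot.bin")


def emergency_urls(base_url):
    """Return the URLs of the emergency files below the given base location."""
    if not base_url.endswith("/"):
        base_url += "/"  # without this, urljoin would replace the last path segment
    return [urljoin(base_url, name) for name in EMERGENCY_FILES]
```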

(t)

This represents an exponential hold-off to prevent excessive hammering on the GitHub servers. It resets at a power cycle.
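An exponential hold-off simply multiplies the waiting time after every attempt. The sketch below shows the general technique only; the base delay, the factor and the cap are assumed values, not the ones LCM uses.

```python
from itertools import islice


def holdoff_delays(base_s=5.0, factor=2.0, cap_s=3600.0):
    """Yield ever longer waiting times in seconds, limited to cap_s."""
    delay = base_s
    while True:
        yield min(delay, cap_s)
        delay *= factor


def first_delays(count, **kwargs):
    """Return the first `count` delays as a list, for inspection."""
    return list(islice(holdoff_delays(**kwargs), count))
```

Creating a fresh generator corresponds to the reset at a power cycle: the sequence starts again from the base delay.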

download certificate signature
certificate update?
Download Certificates

This is a file that contains the checksum of the sector containing three certificates/keys.

Once downloaded, the signature is checked against the known public key and the sha384 checksum of the active sector is compared to the checksum in the signature file. If equal, we move on. If not, we download the updated sector file to the standby sector.
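The checksum comparison in this step can be sketched as follows. Verifying the signature against the public key is left out here, since it depends on the key format; the sketch only covers the decision between moving on and downloading a new sector, and all names are hypothetical.

```python
import hashlib
import hmac


def sector_is_current(active_sector: bytes, signed_digest: bytes) -> bool:
    """Compare the SHA-384 of the active sector with the digest from the signature file."""
    actual = hashlib.sha384(active_sector).digest()
    return hmac.compare_digest(actual, signed_digest)


def next_step(active_sector: bytes, signed_digest: bytes) -> str:
    """Return the action described in the text for this comparison."""
    if sector_is_current(active_sector, signed_digest):
        return "move on"
    return "download updated sector to standby"
```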

signature match?

The sha384 hash of the sector containing the up-to-date certificates has been signed with the private key of LCM. Using the available public key, its validity is verified. From here on, files are downloaded with server certificate verification activated. If this verification fails, the server is marked as invalid.

new boot version?

This will download the latest version of rboot4lcm

new OTA version?
download OTA-boot➔0
update OTA-main➔1
sig & checksum OK?

We check whether there is an update of this OTA repo itself. If so, we use ota-boot to self-update. After this we have the latest OTA code.

server valid?

If checking the certificates marks the server as invalid, we return to the main app in boot slot 0 and report by syslog to a server (to be determined), so we learn that GitHub has changed its certificate CA provider and HomeACcessoryKid can issue a new certificate sector.
Now that downloading from GitHub has been secured, we can trust whatever we download based on a checksum.

OTA-main(1) updates User app➔0
sig & checksum OK?

Using the baseURL info and the version as stored in sysparam area, the latest binary is found and downloaded if needed. If the checksum does not work out, we return to the OTA app start point considering we cannot run the old code anymore. But normally we boot the new code and the mission is done.
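The "sig & checksum OK?" decision for a user app can be illustrated like this. The sketch assumes the .sig layout produced by the commands in 'Creating a user app DigitalSignature' (48-byte SHA-384 digest followed by a 4-byte big-endian length); the function name is hypothetical and the on-device check may differ.

```python
import hashlib
import hmac
import struct

DIGEST_LEN = 48  # size of a SHA-384 digest in bytes


def verify_download(binary: bytes, sig: bytes) -> bool:
    """Check length and SHA-384 of a downloaded binary against its .sig content."""
    if len(sig) < DIGEST_LEN + 4:
        return False  # too short to hold digest plus length
    expected_digest = sig[:DIGEST_LEN]
    (expected_len,) = struct.unpack(">I", sig[DIGEST_LEN:DIGEST_LEN + 4])
    if len(binary) != expected_len:
        return False
    return hmac.compare_digest(hashlib.sha384(binary).digest(), expected_digest)
```

A result of False corresponds to the case in the text where we return to the OTA app start point instead of booting the new code.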

Note that switching from boot=slot1 to boot=slot0 does not require a reflash

AS-IS disclaimer and License

While I pride myself on making this software error free and backward compatible and otherwise perfect, this is the result of a hobby etc. etc. etc. So don't expect me to be responsible for anything...

See the LICENSE file for license information