aws-samples / aws-iot-create-ota-update-deconstructed

Demonstrates how to enrich AWS IoT OTA updates with advanced job configuration features such as retries, scheduling, recurring maintenance windows and Software Package Catalog.
MIT No Attribution

No self-test ? #1

Closed TheCrazyKing closed 7 months ago

TheCrazyKing commented 7 months ago

Hi,

Thank you for the example. As far as I know, you did not include any self-test procedure here. I would have used a self-test kind of job after the first OTA-like job completes, especially when using the software catalog. Can you point us to documentation about the software catalog update outside of a FUOTA job?

Cheers

gregbreen commented 7 months ago

Hi. This repository only implements the cloud side. Specifically, it shows how you can use other APIs to create an OTA job that is equivalent to the one created by the CreateOTAUpdate API. The device receiving the job and job document can't tell the difference, so it should process OTA in the normal way. Self-test is implemented on the device and is not in scope for this repo. You can see it baked into the OTA library: https://github.com/aws/ota-for-aws-iot-embedded-sdk/blob/main/source/include/ota.h#L171

Can you point us to documentation about the software catalog update outside of a FUOTA job?

Apologies, but I'm not sure I understand what you're asking for here. From the device perspective, the most important thing is to use the reserved named shadow to publish the new version (after it has been applied on the device): https://docs.aws.amazon.com/iot/latest/developerguide/preparing-to-use-software-package-catalog.html#reserved-named-shadow. That is what causes the catalog to update: https://docs.aws.amazon.com/iot/latest/developerguide/preparing-fleet-indexing.html#shadow-as-data-source. You can publish on this shadow as an extra step on top of the OTA implementation.
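
As a rough illustration of that extra step (a sketch, not code from this repo), the snippet below builds the topic and payload a device might publish once the new version is running. It assumes the reserved named shadow is `$package` and that the reported state is keyed by package name with a `version` field, as I understand the linked documentation; the thing name, package name and version are hypothetical placeholders, and the actual MQTT publish is left to whatever client your firmware uses.

```python
import json

# Assumption: "$package" is the reserved named shadow described in the linked docs.
RESERVED_SHADOW = "$package"


def package_shadow_update(thing_name, package_name, version):
    """Return (topic, payload) for reporting an installed package version."""
    topic = f"$aws/things/{thing_name}/shadow/name/{RESERVED_SHADOW}/update"
    payload = {
        "state": {
            "reported": {
                # Assumed layout: one entry per package, keyed by package name.
                package_name: {"version": version, "attributes": {}}
            }
        }
    }
    return topic, json.dumps(payload)


# Hypothetical names, for illustration only.
topic, payload = package_shadow_update("my-thing", "my-firmware", "1.2.0")
print(topic)
print(payload)
```

The device would publish this only after the new image has been applied and, if you use one, after self-test has passed.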

outside of a FUOTA job

Just on this, AWS IoT OTA is just a job with an opinionated job document, and with the OTA library available to help you implement that in the device firmware. You can implement your own FUOTA directly on top of jobs, and have your own job document structure. In that case, you wouldn't need this repo. This repo only helps if you would use our opinionated OTA, but want to access a few features that CreateJob supports and CreateOTAUpdate does not.
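
To make the "your own FUOTA directly on top of jobs" option a bit more concrete, here is a minimal sketch of the request parameters such a job might use. The job document fields (`operation`, `version`, `url`) are hypothetical, since with your own FUOTA the document structure is entirely up to you. The retry settings use the `jobExecutionsRetryConfig` parameter of CreateJob. Nothing is sent to AWS here; the dictionary is only built and printed.

```python
import json


def build_create_job_request(job_id, thing_arn, document, retries=3):
    """Build keyword arguments for a CreateJob call (no AWS call is made)."""
    return {
        "jobId": job_id,
        "targets": [thing_arn],
        # CreateJob takes the job document as a JSON string.
        "document": json.dumps(document),
        "targetSelection": "SNAPSHOT",
        # Retry failed executions, one of the features CreateJob supports.
        "jobExecutionsRetryConfig": {
            "criteriaList": [
                {"failureType": "FAILED", "numberOfRetries": retries}
            ]
        },
    }


# Hypothetical custom job document; your firmware defines what it means.
custom_document = {
    "operation": "firmware_update",
    "version": "1.2.0",
    "url": "https://example.com/firmware-1.2.0.bin",
}

request = build_create_job_request(
    "my-fuota-job-001",
    "arn:aws:iot:us-east-1:123456789012:thing/my-thing",  # placeholder ARN
    custom_document,
)
print(json.dumps(request, indent=2))
```

With boto3, a dictionary like this could be passed as `iot_client.create_job(**request)`, assuming credentials and permissions are in place.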

TheCrazyKing commented 7 months ago

Thanks for the answer. According to my research, it seems the self-test procedure is triggered by the cloud (via a StartSelfTest request, once the device is ready). More generally, as I understand it, the cloud initiates other events such as the command to close and verify the image, the image activation (OtaJobEventActivate) and the self-test. The FreeRTOS OTA Job performs more than just sending a file and reporting the result.

PS: It would be interesting if AWS could provide us with a state machine of how exactly a FUOTA Job triggers the events and so on depending on the device's response :)

gregbreen commented 7 months ago

as I understand it, the cloud initiates other events such as the command to close and verify the image, the image activation (OtaJobEventActivate) and the self-test

That's not the case. The only messages between the cloud and the device are:

  1. The job MQTT topics: https://docs.aws.amazon.com/iot/latest/developerguide/jobs-mqtt-api.html
  2. File transfer: MQTT file stream (https://docs.aws.amazon.com/iot/latest/developerguide/mqtt-based-file-delivery.html) or HTTP download from S3.

So that's the entire contract between the cloud and the device. And this repo delivers the cloud part of it (for the specific situation of wanting to use our opinionated OTA job document, but access some features not available through the CreateOTAUpdate API).
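
For reference, the job side of that contract comes down to a handful of MQTT topics. The sketch below just builds the topic strings a device typically uses; the thing name and job ID are hypothetical placeholders, and the linked jobs MQTT API page remains the authoritative list.

```python
def job_topics(thing_name, job_id):
    """Return the main AWS IoT Jobs MQTT topics for one thing and one job."""
    base = f"$aws/things/{thing_name}/jobs"
    return {
        # Cloud -> device: notification that the next pending job changed.
        "notify_next": f"{base}/notify-next",
        # Device -> cloud: start the next pending job execution.
        "start_next": f"{base}/start-next",
        # Device -> cloud: describe a specific job execution.
        "get": f"{base}/{job_id}/get",
        # Device -> cloud: report status (and statusDetails) for the job.
        "update": f"{base}/{job_id}/update",
    }


# Hypothetical names, for illustration only.
for name, topic in job_topics("my-thing", "AFR_OTA-my-job").items():
    print(f"{name}: {topic}")
```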

Everything else you see in the OTA library is interaction between the library and the application firmware that uses it. For example, the self-test events are just between the library and your firmware. Here's a reference implementation on an STM32: https://github.com/FreeRTOS/iot-reference-stm32u5/blob/d00576bce97c60f9a0e8c63bc92df981fb7fad78/Common/app/ota/ota_update_task.c#L759-L802. Self-test is optional, depending on the needs of your device, and here you can see that self-test is skipped.

I appreciate, however, that it's confusing, and some of the names in the firmware perhaps suggest something else. There have also been multiple incarnations over time, and some deprecated documentation is still public. Also, the OTA library in the C SDK was recently deprecated (as stated in the README). The latest advice now is "composable" OTA:

This essentially means: don't use the OTA library, but build your firmware on top of the jobs library instead, and use the MQTT file streams library above if you want to transfer the file using MQTT instead of HTTP. For the device-side implementation, I advise you to focus on these 4 links.

TheCrazyKing commented 7 months ago

Hi,

Thanks. If I may, while it's true that the only messages sent by the cloud for a FUOTA are Job documents and a stream, what appears to me is that the job document sent by the cloud is not stateless. As such, after the image is downloaded, the job document from the cloud has a state of self test pending. This can be quickly seen in the old OTA library version here, where the error says "The job is in the self-test state while the platform is not".

I would be happy to continue this discussion on the FreeRTOS forum directly if needed.

gregbreen commented 7 months ago

OK, I understand the confusion, and we need to dive a little deeper. The job document created when the job is created doesn't change. However, jobs has a concept of statusDetails that allows state to be saved. This is an additional field of key-value pairs that can exist in the job execution data.

https://docs.aws.amazon.com/iot/latest/developerguide/jobs-mqtt-https-api.html#jobs-mqtt-job-execution-data

The OTA library saves some state information there. The device (and OTA library) uses the UpdateJobExecution API (topic $aws/things/thingName/jobs/jobId/update) to do that.

https://github.com/aws/ota-for-aws-iot-embedded-sdk/blob/main/source/ota_mqtt.c#L667 https://github.com/aws/ota-for-aws-iot-embedded-sdk/blob/main/source/ota_mqtt.c#L1001
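
To show roughly what such an update looks like on the wire, here is a minimal sketch that builds the topic and payload for an UpdateJobExecution request carrying statusDetails. The `self_test` key and its `ready` value are illustrative of the kind of state the OTA library stores; check the linked source for the exact keys and values it uses. The thing name and job ID are hypothetical.

```python
import json

# Statuses a device is allowed to report through UpdateJobExecution.
DEVICE_STATUSES = {"IN_PROGRESS", "FAILED", "SUCCEEDED", "REJECTED"}


def update_job_execution(thing_name, job_id, status, status_details):
    """Return (topic, payload) for an UpdateJobExecution MQTT request."""
    if status not in DEVICE_STATUSES:
        raise ValueError(f"unsupported status: {status}")
    topic = f"$aws/things/{thing_name}/jobs/{job_id}/update"
    payload = {
        "status": status,
        # statusDetails is a flat map of string keys to string values.
        "statusDetails": {str(k): str(v) for k, v in status_details.items()},
    }
    return topic, json.dumps(payload)


# Illustrative only: the device records that it is waiting for self-test.
topic, payload = update_job_execution(
    "my-thing", "AFR_OTA-my-job", "IN_PROGRESS", {"self_test": "ready"}
)
print(topic)
print(payload)
```

The cloud simply stores these key-value pairs and returns them with the job execution data, which is how a rebooted device can find out that it left off in the middle of a self-test.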

With regard to the original question that sparked all this, this repo already does everything that's needed to support that and to support the OTA library more generally. That's because statusDetails is baked into the jobs feature. There's nothing additional this repo needs to do to support self-test.

TheCrazyKing commented 7 months ago

Aah, you mean only the device is responsible for updating the job's status details? I thought the OTA job itself updated this field according to the device status, but it's true that the device can perform this status update directly ...

In conclusion, the self-test is still triggered by the cloud, but this procedure is handled by the device. Thank you !!

gregbreen commented 7 months ago

Essentially, yes. Self test is not explicitly triggered by the cloud though. The device goes into self-test, if it wishes, after the file streaming has ended.