mcu-tools / mcuboot

Secure boot for 32-bit Microcontrollers!
Apache License 2.0
1.3k stars 663 forks source link

If the image header is corrupt, bootutil_img_hash() can take a VERY long time #2025

Open ironss-iotec opened 1 month ago

ironss-iotec commented 1 month ago

While testing repeated firmware updates, cycling between two versions, with a random power cycler, the system failed after about 3000 update operations.

The problem was in image_validate.c function bootutil_img_hash(), when the image header in flash was corrupt. The intention is that this will be detected by the checksum algorithm.

However, in this case, the field hdr->ih_img_size was a very large number, about 102 MB, rather than less than 1 MB (the size of the flash area). Instead of taking about 3 s to verify the checksum in this test system, it would have taken 5 minutes or more. In the worst case, if ih_img_size were all-1s (4 GB), it would take about 3.5 hours to verify the checksum.

A simple fix is to check that the size of the image being validated (and the protected TLVs, etc) is less than the size of the flash-area. This means that the checksum algorithm will take a reasonable duration (not more than the worst-case for that size flash area), rather than an arbitrary duration.

diff --git a/boot/bootutil/src/image_validate.c b/boot/bootutil/src/image_validate.c
index a697676b..e27f489e 100644
--- a/boot/bootutil/src/image_validate.c
+++ b/boot/bootutil/src/image_validate.c
@@ -117,6 +117,10 @@ bootutil_img_hash(struct enc_key_data *enc_state, int image_index,
     /* If protected TLVs are present they are also hashed. */
     size += hdr->ih_protect_tlv_size;

+    if (size > fap->fa_size) {
+        return BOOT_EBADIMAGE;
+    }
+
 #ifdef MCUBOOT_RAM_LOAD
     bootutil_sha_update(&sha_ctx,
                         (void*)(IMAGE_RAM_BASE + hdr->ih_load_addr),

MCUboot version: branch v2.1.0, git d4394c2f9b76e0a7b758441cea3a8ceb896f66c8

The image header became corrupt when the power-cycle happened while the application was writing the new version header to flash, but before the flash device completed the operation. The application does verify the new image version in flash before rebooting, but did not have time to do that before the power-cycle.

ironss-iotec commented 1 month ago

Rather than failing if size is too big, another option would be to set size to fap->fa_size (the size of the partition). This will still fail; it will take longer because it has to verify the checksum.

d3zd3z commented 1 month ago

I'm a little surprised this doesn't just fail with the first flash read past the end of the partition. Does the driver not reject the flash operations that are out of bound? I believe that was the assumption in how this was written.

ironss-iotec commented 1 month ago

It didn't even occur to me that the non-library flash-driver code should return a failure code for writing outside the flash-area. I assumed that the library code would do that.

It seems that I based my flash-area code on some of the examples where the out-of-bounds check is not explicit, done by some board-specific library code (cypress, mbed, zephyr), rather than those examples that do have explicit checks (esp32, nuttx)

It is easy enough to add the appropriate check to my implementation of flash_area_xxxx() operations and return an error for out-of-bounds access.

ironss-iotec commented 1 month ago

In fact, I am more surprised that boot_image_load_header() does not pick up the corrupt size: it has an explicit check for size being too big. I will need to do some more investigation.

UPDATE: Nothing in the library code calls boot_image_load_header().

I do like the idea of bootutil_img_validate() being the thing that looks for and finds for an invalid image, rather than relying on the flash driver, given that

Perhaps bootutil_img_validate() should call boot_is_header_valid()?

My provided patch is poor, in that it reads fap->fa_size directly, rather than calling flash_area_get_size().