InfiniTimeOrg / InfiniTime

Firmware for Pinetime smartwatch written in C++ and based on FreeRTOS
GNU General Public License v3.0
2.75k stars 941 forks source link

Bricking pinewatch using large size firmware for testing #1463

Closed AlexXZero closed 1 year ago

AlexXZero commented 1 year ago

Verification

What happened?

Brick pinewatch after testing feature (twice)

What should happen instead?

Linker error if build is not fit in available flash memory.

Reproduction steps

  1. Update code with adding feature so image size should use more than 95% of available internal flash.
  2. Flash image to the dev kit using DFU.
  3. Reboot the device to back into previous firmware

Then pinewatch start to be in constant reboot.

More details?

I tried to figure out what I'm doing wrong and found that InfinityTime uses magic internal flash addresses for saving some boot configuration, so I guess that rewriting this data is cause of this issue. I would suggest to do the following thing to avoid this issue for debug builds and for future InfinityTime releases which might start use more flash memory:

  1. Update linker script to have addresses of each magic variable in it.
  2. Update linker script with decreasing size of available internal flash memory size to avoid overlapping firmware and boot configuration flags.
  3. Update bootloader (not sure if it is possible to do for all users) to use last page for storing boot flags.
  4. Update documentation with flash memory map to let developers know which memory limitations they have.

I found at least one magic boot flags which is not present in the linker script and which might be overwritten by firmware:

FirmwareValidator::validBitAdress {0x7BFE8}; // Magic address!!!

So to avoid overlapping with this address available internal flash memory should be decreased to 0x5000 bytes!

 MEMORY
 {
-  FLASH (rx) : ORIGIN = 0x08020, LENGTH = 0x78000
+  FLASH (rx) : ORIGIN = 0x08000, LENGTH = 0x73000
   RAM (rwx) :  ORIGIN = 0x20000000, LENGTH = 64K
 }

Alternative option is somehow update bootloader to the latest page for this purposes, i.e. address 0x7FFFE8.

Note: I'm not sure if any other magic addresses are used somewhere else.

Version

https://github.com/AlexXZero/InfiniTime/releases/tag/1.11.0-mod3

Companion app

No response

AlexXZero commented 1 year ago

In a few days I'm going to add an instruction with photo's how I unbrick my dev kit. I remember that documentation of this procedure wasn't full enough for me (I did it using nrf52832 dev board) so I needed to find third party resources to make it clear for me.

AlexXZero commented 1 year ago

Just reviewed Zephyr flash memory map for the pinewatch and found that following configuration: Internal flash:

        boot_partition: partition@0 {
            label = "mcuboot";
            reg = <0x00000000 0x8000>;
        };
        slot0_partition: partition@8000 {
            label = "image-0";
            reg = <0x00008000 0x74000>;
        };
        scratch_partition: partition@7c000 {
            label = "image-scratch";
            reg = <0x0007c000 0xa000>;
        };

Spi flash:

        slot1_partition: partition@40000 {
            label = "image-1";
            reg = <0x00040000 0x74000>;
        };

So I guess we can't not use memory addresses 0x7c000-0x7fffff for firmware as they used by mcuboot as a scratch partition, but it should fine to use addresses up to 0x7BFE8 for the rest firmware code. (which also proved by size of partition on SPI flash) I.e. the proper linker script should have the following changes:

 MEMORY
 {
-  FLASH (rx) : ORIGIN = 0x08020, LENGTH = 0x78000
+  FLASH (rx) : ORIGIN = 0x08000, LENGTH = 0x74000
   RAM (rwx) :  ORIGIN = 0x20000000, LENGTH = 64K
 }
...
+    .boot_cfg : AT (0x7BFE8)
+    {
+      *(.boot*)
+    }
    } > FLASH

Also does anyone know what was the purpose of having 0x20 bytes offset from the begin of 0x8000 page?

Boteium commented 1 year ago

Wow, this sounds concerning. I've always fill up my firmware to more than 95% due to CJK glyph. I guess I'm lucky because I never tried to back to previous firmware.

AlexXZero commented 1 year ago

Some my findings for that last few days:

  1. a first 0x20 bytes placed on 0x8000 address are MCU image header (See mcuboot image format: https://github.com/mcu-tools/mcuboot/blob/v1.6.0/docs/design.md#image-format). So we should reserve space for them into linker script.
  2. Image trailer is placed at the end of image partition, it takes 1580 bytes (128*4*3=1536: swap status + 44 bytes: the rest fields).

Note: I used links to the MCU boot documentation corresponded to the version which is used by Infinitytime, i.e. 9015a5d404c2c688166cab81067be53c860d98f4 (v1.6.0 is closest):

For the bootloader to be able to determine the current state and what actions should be taken during the current boot operation, it uses metadata stored in the image flash areas. While swapping, some of this metadata is temporarily copied into and out of the scratch area.

This metadata is located at the end of the image flash areas, and is called an image trailer. An image trailer has the following structure:

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    ~                                                               ~
    ~    Swap status (BOOT_MAX_IMG_SECTORS * min-write-size * 3)    ~
    ~                                                               ~
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Encryption key 0 (16 octets) [*]              |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                 Encryption key 1 (16 octets) [*]              |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                      Swap size (4 octets)                     |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   Swap info   |           0xff padding (7 octets)             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   Copy done   |           0xff padding (7 octets)             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |   Image OK    |           0xff padding (7 octets)             |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                       MAGIC (16 octets)                       |
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

[*]: Only present if the encryption option is enabled (MCUBOOT_ENC_IMAGES).

The offset immediately following such a record represents the start of the next flash area.

Note: "min-write-size" is a property of the flash hardware. If the hardware allows individual bytes to be written at arbitrary addresses, then min-write-size is 1. If the hardware only allows writes at even addresses, then min-write-size is 2, and so on.

An image trailer contains the following fields:

  1. Swap status: A series of records which records the progress of an image swap. To swap entire images, data are swapped between the two image areas one or more sectors at a time, like this:
    • sector data in the primary slot is copied into scratch, then erased
    • sector data in the secondary slot is copied into the primary slot, then erased
    • sector data in scratch is copied into the secondary slot

As it swaps images, the bootloader updates the swap status field in a way that allows it to compute how far this swap operation has progressed for each sector. The swap status field can thus used to resume a swap operation if the bootloader is halted while a swap operation is ongoing and later reset. The BOOT_MAX_IMG_SECTORS value is the configurable maximum number of sectors mcuboot supports for each image; its value defaults to 128, but allows for either decreasing this size, to limit RAM usage, or to increase it in devices that have massive amounts of Flash or very small sized sectors and thus require a bigger configuration to allow for the handling of all slot's sectors.

  1. Encryption keys: key-encrypting keys (KEKs). These keys are needed for image encryption and decryption. See the encrypted images document for more information.

  2. Swap size: When beginning a new swap operation, the total size that needs to be swapped (based on the slot with largest image + TLVs) is written to this location for easier recovery in case of a reset while performing the swap.

  3. Swap info: A single byte which encodes the following information:

    • Swap type: Stored in bits 0-3. Indicating the type of swap operation in progress. When mcuboot resumes an interrupted swap, it uses this field to determine the type of operation to perform. This field contains one of the following values in the table below.
    • Image number: Stored in bits 4-7. It has always 0 value at single image boot. In case of multi image boot it indicates, which image was swapped when interrupt happened. The same scratch area is used during in case of all image swap operation. Therefore this field is used to determine which image the trailer belongs to if boot status is found on scratch area when the swap operation is resumed.
Name Value
BOOT_SWAP_TYPE_TEST 2
BOOT_SWAP_TYPE_PERM 3
BOOT_SWAP_TYPE_REVERT 4
  1. Copy done: A single byte indicating whether the image in this slot is complete (0x01=done; 0xff=not done).

  2. Image OK: A single byte indicating whether the image in this slot has been confirmed as good by the user (0x01=confirmed; 0xff=not confirmed).

  3. MAGIC: The following 16 bytes, written in host-byte-order:

    const uint32_t boot_img_magic[4] = {
        0xf395c277,
        0x7fefd260,
        0x0f505235,
        0x8079b62c,
    };

I'm not sure if I counted the Swap status correctly as I'm not sure which exactly values was used for the mcuboot which is placed in InfinityTime version. So I would suggest to add some spare place before, i.e we should limit the image size by address 0x73800, then the rest 0x800 bytes should be reserved for the mcuboot image trailer.

JF002 commented 1 year ago

Thanks for this bug report, @AlexXZero ! I think you've already figured most of the issue!

According to the memory map documented in the bootloader repo, the last 4KB cannot be used by the firmware because they are used by the bootloader as a scratch area (temporary data for the OTA).

The linker file must indeed take that into account!

The image trailer is also part of the firmware, so yes, I guess you're right : the linker cannot leave that space available for the code as it'll be overwritten by the trailer.

I'm not sure if I counted the Swap status correctly as I'm not sure which exactly values was used for the mcuboot which is placed in InfinityTime version. So I would suggest to add some spare place before, i.e we should limit the image size by address 0x73800, then the rest 0x800 bytes should be reserved for the mcuboot image trailer.

Additional bonus point if we can determine exactly how many bytes must be reserved for the image trailer. Mastering this to the byte will give us more confidence in our changes. Documenting this value will help future us when we'll need to change it ;-)

However, the valid bit @0x7BFE8 can (and must) be overwritten when you OTA a new firmware image. This bit is used to determine if the user has already validated the new version of the firmware. This bit is set when the user manually validate the firmware using the firmware validation feature in the setting. If this bit is preserved, the new firmware would not be invalidated and we would loose the automatic recovery in case of bad firmware.

Speaking of the recovery : when an invalid firmware is flashed to the pinetime, this firmware won't run properly, the valid bit won't be set and the bootloader should automatically revert to the previous version of the firmware. I'm a bit concerned by the fact that... it didn't seem to work in your case. I think it would be reaaaaaaaally interesting to understand what went wrong.

AlexXZero commented 1 year ago

@JF002 Thank you for your feedback. I agree that it would be the best to know which exactly memory is used, so I'm going to test it by the following test procedure:

  1. Erase whole internal flash memory
  2. Flash fresh bootloader at 0x0
  3. Flash app-mcuboot-v11.0 at 0x8000
  4. Make a flash dump using debugger
  5. Flash a little modified app-mcuboot-v11.0 via DFU (onto SPI flash)
  6. Attach the debugger during automatic reboot and swapping images, to stop it in the middle of the operation
  7. Make a flash dump to compare it with first one.

I will add test results to this ticket as soon as I do steps above.

In terms of bricking device, I believe that it was caused by overwriting image trailer, so MCU boot can't figure out what is going on (it might detect it as interrupted image swapping as "swap status" area was modified).

Update: After a quick review of memory map and it looks incorrect: 28+4+464+4=500KB instead of 512KB. I will check actual bootloader configuration.

Update2: I found the actually used by mcuboot memory map. So it looks like it match the following picture:

Internal nRF52 flash memory:
0x00000000: MCUBoot(28 kB)
0x00007000: Reboot logs (4 kB)
0x00008000: MCUBoot header (0x20 bytes)
0x00008020: Application (462+ kB)
0x0007bXXX: MCUBoot image trailer (1580 bytes)
0x0007c000: MCUBoot Scratch partition (4 kB)
0x0007d000: unused (12 kB)

SPI flash:
0x00000000: Bootloader Assets, like Boot Graphic (256 kB)
0x00040000: Application 2 (including MCUBoot header) (464 kB)
0x000b4000: User files - littlefs (3376 kB)

@JF002 I guess we need to update memory map to match actual addresses, I'm not sure if it is possible to update bootloader on all users devices to update scratch partition addresses and extending application partition on SPI memory. I guess it would be better to leave bootloader as it was initially done and mark last 12kB as reserved and we can reuse them later for something else.

AlexXZero commented 1 year ago

It was a little bit tricky to catch swapping operation in progress as it finish pretty quickly, but I managed to do it.

  1. I built two versions of firmware with minimal changes:
    5190,5192c5190,5192
    - 0x0014550 0fe8 1e43 4258 4158 4770 0000 f44f 23f6
    - 0x0014560 f8d3 3fe8 2b01 d003 4802 2101 f7fa be6a
    - 0x0014570 4770 bf00 bfe8 0007 f3bf 8f4f 4905 4b06
    ---
    + 0x0014550 0fe4 1e43 4258 4158 4770 0000 f44f 23f6
    + 0x0014560 f8d3 3fe4 2b01 d003 4802 2101 f7fa be6a
    + 0x0014570 4770 bf00 bfe4 0007 f3bf 8f4f 4905 4b06
    20659,20661c20659,20661
    - 0x0050c90 2372 2520 0073 3530 343a 3a33 3534 4400
    - 0x0050ca0 6365 2020 2031 3032 3232 6500 3238 6333
    - 0x0050cb0 6538 0064 3f3f 003f 3823 3830 3830 2030
    ---
    + 0x0050c90 2372 2520 0073 3530 353a 3a35 3634 4400
    + 0x0050ca0 6365 2020 2031 3032 3232 3500 3962 3366
    + 0x0050cb0 3333 0061 3f3f 003f 3823 3830 3830 2030
    25490,25492c25490,25492
    - 0x0066290 6907 0028 0010 0020 50a3 2e31 fc7c 57bd
    - 0x00662a0 ef39 786e 38c0 3af1 ddaa 3229 5bd6 410b
    - 0x00662b0 3d20 839e 57db 18aa                    
    ---
    + 0x0066290 6907 0028 0010 0020 6e73 0de2 2e29 e839
    + 0x00662a0 9156 6f63 510a 1ccf ce7d 05a6 0e0c 817f
    + 0x00662b0 cb88 2cb8 1ffc dd91           

After reviewing difference between original flash dump and flash memory during swapping I found three places which was changed:

  1. End of the firmware (it is expected as firmware have the same difference) image
  2. Image trailer (swap status part) image I managed to repeat the test to figure out in which direction trailer usage is growing up: image
  3. End of image trailer + scratch partition which is placed at 0x7c000 (not the last page) image

Flash dumps: pinetime_flash_dumps.tgz

AlexXZero commented 1 year ago

I'm not sure if I counted the Swap status correctly as I'm not sure which exactly values was used for the mcuboot which is placed in InfinityTime version. So I would suggest to add some spare place before, i.e we should limit the image size by address 0x73800, then the rest 0x800 bytes should be reserved for the mcuboot image trailer.

Additional bonus point if we can determine exactly how many bytes must be reserved for the image trailer. Mastering this to the byte will give us more confidence in our changes. Documenting this value will help future us when we'll need to change it ;-)

It looks like my previous math was wrong as I used min-write-size=4, but in fact mcuboot uses value 1 as it is possible to write the flash memory using separated bytes with leaving the rest bytes having 0xff value. So the correct size of swap status is 128*1*3=384 bytes. Then Image trailer size is 384 (swap status)+4 (swap size)+4 (padding to 8)+8 (swap info)+8 (copy done)+8 (image ok)+16 (MAGIC) = 432 bytes. So it should starts at 0x7c000-432=0x7BE50. Which matches the value I got in the tests.

Now when I figure out what exactly was wrong and how to fix it I will prepare a PR with the fix of the linker script.

Draft version:

/****************************************************************
 * Memory map configuration for using application with MCU-boot *
 ****************************************************************/
/*
 * Internal nRF52 flash memory:
 * 0x00000000: MCUBoot(28 kB)
 * 0x00007000: Reboot logs (4 kB)
 * 0x00008000: MCUBoot header (0x20 bytes)
 * 0x00008020: Application (463+ kB)
 * 0x0007be50: MCUBoot image trailer (432 bytes)
 * 0x0007c000: MCUBoot Scratch partition (4 kB)
 * 0x0007d000: unused (12 kB)
 *
 * SPI flash:
 * 0x00000000: Bootloader Assets, like Boot Graphic (256 kB)
 * 0x00040000: Application 2 (including MCUBoot header) (464 kB)
 * 0x000b4000: User files - littlefs (3376 kB)
 */

MCUBOOT_OFFSET = 0x0;
MCUBOOT_SIZE = 0x8000;
MCUBOOT_APP_IMAGE_HEADER_SIZE = 0x20;
MCUBOOT_APP_IMAGE_TRAILER_SIZE = 432;
APP_OFFSET = MCUBOOT_OFFSET + MCUBOOT_SIZE + MCUBOOT_APP_IMAGE_HEADER_SIZE;
APP_SIZE = 464K - MCUBOOT_APP_IMAGE_HEADER_SIZE - MCUBOOT_APP_IMAGE_TRAILER_SIZE;
SCRATCH_OFFSET = 0x7c000;
SCRATCH_SIZE = 4K;

MEMORY
{
  MCUBOOT (r) : ORIGIN = MCUBOOT_OFFSET, LENGTH = MCUBOOT_SIZE
  FLASH (rx) : ORIGIN = APP_OFFSET, LENGTH = APP_SIZE
  SCRATCH (r) : ORIGIN = SCRATCH_OFFSET, LENGTH = SCRATCH_SIZE
  UNUSED (r) : ORIGIN = SCRATCH_OFFSET + SCRATCH_SIZE, LENGTH = 12K
  RAM (rwx) :  ORIGIN = 0x20000000, LENGTH = 64K
}
Riksu9000 commented 1 year ago

@AlexXZero Is this fixed?

AlexXZero commented 1 year ago

@Riksu9000 Yes it was fixed, thanks