zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0

Sysbuild-configured project using `west flash --recover` will wrongly recover (and reset) the MCU each time it flashes an image #50421

Open nordicjm opened 2 years ago

nordicjm commented 2 years ago

Describe the bug Issue noticed whilst testing https://github.com/zephyrproject-rtos/zephyr/pull/49552. In a sysbuild project with multiple images (e.g. a sample application and mcuboot), `west flash --recover` wrongly makes the west flash runner recover the module on every invocation, i.e. once per image, which results in a non-working board. It also resets the board after each programming cycle, which can cause issues if e.g. readback protection is enabled or if an application needs something else to start, in which case the following programming cycles will fail. Output:

west flash --recover
WARNING: can't find the zephyr repository
 - no --zephyr-base given
 - ZEPHYR_BASE is unset
 - west config contains no zephyr.base setting
 - no manifest project has name or path "zephyr"

 If this isn't a Zephyr workspace, you can   silence this warning with something like this:
   west config zephyr.base not-using-zephyr
-- west flash: rebuilding
[0/9] Performing build step for 'test_mcuboot'
ninja: no work to do.
[1/9] Performing build step for 'swapped_app'
ninja: no work to do.
[3/9] Performing build step for 'mcuboot'
ninja: no work to do.
[9/9] Completed 'mcuboot'
-- west flash: using runner nrfjprog
-- runners.nrfjprog: mass erase requested
Using board 483128317
-- runners.nrfjprog: Recovering and erasing flash memory for both the network and application cores.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
-- runners.nrfjprog: Flashing file: /tmp/bb/zephyr/twister-out/bl5340_dvk_cpuapp/tests/boot/test_mcuboot/boot.mcuboot/mcuboot/zephyr/zephyr.hex
Parsing image file.
Verifying programming.
Verified OK.
Applying pin reset.
-- runners.nrfjprog: Board with serial number 483128317 flashed successfully.
-- west flash: using runner nrfjprog
Using board 483128317
-- runners.nrfjprog: Recovering and erasing flash memory for both the network and application cores.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
-- runners.nrfjprog: Flashing file: /tmp/bb/zephyr/twister-out/bl5340_dvk_cpuapp/tests/boot/test_mcuboot/boot.mcuboot/swapped_app/zephyr/zephyr.signed.hex
Parsing image file.
Verifying programming.
Verified OK.
Applying pin reset.
-- runners.nrfjprog: Board with serial number 483128317 flashed successfully.
-- west flash: using runner nrfjprog
Using board 483128317
-- runners.nrfjprog: Recovering and erasing flash memory for both the network and application cores.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
Recovering device. This operation might take 30s.
Writing image to disable ap protect.
Erasing user code and UICR flash areas.
-- runners.nrfjprog: Flashing file: /tmp/bb/zephyr/twister-out/bl5340_dvk_cpuapp/tests/boot/test_mcuboot/boot.mcuboot/test_mcuboot/zephyr/zephyr.signed.hex
Parsing image file.
Verifying programming.
Verified OK.
Applying pin reset.
-- runners.nrfjprog: Board with serial number 483128317 flashed successfully.

To Reproduce Build the sample application from the PR for an nRF5340-based board and manually run `west flash --recover` in the build directory.

Expected behavior One recovery - at the start of the programming process. One reset - after all images have been programmed.

Impact Showstopper


nordicjm commented 2 years ago

CC @tejlmand @mbolivar-nordic, we probably need to plan how this works in future.

tejlmand commented 2 years ago

This is not a bug; using runner-specific arguments on multiple domains is considered experimental at the current stage.

There was, however, a flaw in the warning: it was not printed unless 2 or more arguments were used.

See #50422

tejlmand commented 2 years ago

Closed, as this is an experimental feature, and the issue with the message not being printed correctly has been fixed in #50422.

nordicjm commented 1 year ago

Reopening this, as it is becoming a big problem with sysbuild. There are 2 issues here:

  1. Sysbuild resets the board after each image has been flashed. On the nRF5340, which has application readback protection, this can mean that the protection is engaged during the flashing of one image, after which no more images can be flashed and the whole flash process aborts with an error. One possible fix is to have sysbuild halt the CPU(s), flash all images that need to be flashed, and then reboot once all images have been flashed.
  2. Arguments passed to west flash in a sysbuild-configured project need a level of granularity. If a device ID is provided, should it be passed to all runners? This works if sysbuild is working with 1 SoC on 1 board, but if there are 2 SoCs/boards then it is wrong and will fail. The other issue is the --recover argument, which is used to erase the contents of the flash. Using the nRF5340 as an example: in a normal non-sysbuild project this works as expected, but in a sysbuild project this supplies the argument to every image that is flashed, which means if you flash 3 images, it will recover, flash image 1, recover, flash image 2, recover, flash image 3 - leaving the board in a non-working state. So if this argument is used in a sysbuild project, it should be applied from a "global" perspective: recover all boards/cores once when the script runs, rather than once per flash invocation.
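
To illustrate the "global" handling suggested in the two points above, here is a minimal Python sketch. It is not how west or sysbuild actually implement this; `FlashStep` and `plan_flash` are hypothetical names used only for illustration, and the sketch assumes that all images target a single board, so the multi-SoC/multi-board case from point 2 is not covered.

```python
from dataclasses import dataclass


@dataclass
class FlashStep:
    """One runner invocation for a single image (hypothetical structure)."""
    image: str
    recover: bool = False  # erase/unlock the device before programming
    reset: bool = False    # reset the target after programming


def plan_flash(images, recover_requested):
    """Return one FlashStep per image, in flashing order.

    Recovery is attached to the first image only, so later images do not
    erase what earlier ones programmed. Reset is attached to the last image
    only, so the device is not restarted between images.
    Assumption: every image is flashed to the same board.
    """
    last = len(images) - 1
    return [
        FlashStep(
            image=image,
            recover=recover_requested and index == 0,
            reset=index == last,
        )
        for index, image in enumerate(images)
    ]


if __name__ == "__main__":
    # Image names taken from the log output earlier in this issue.
    for step in plan_flash(["mcuboot", "swapped_app", "test_mcuboot"], True):
        print(step)
```

With the three images from the log, only the `mcuboot` step would carry the recover request and only the `test_mcuboot` step would carry the reset, which matches the expected behavior described in the original report.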

butok commented 4 months ago

> in a sysbuild project this supplies the argument to every image that is flashed, which means if you flash 3 images, it will recover, flash image 1, recover, flash image 2, recover, flash image 3 - leaving the board in a non-working state.

Two years have passed. This is a very annoying issue. Have you found a solution?

nordicjm commented 4 months ago

> > in a sysbuild project this supplies the argument to every image that is flashed, which means if you flash 3 images, it will recover, flash image 1, recover, flash image 2, recover, flash image 3 - leaving the board in a non-working state.
>
> Two years have passed. This is a very annoying issue. Have you found a solution?

It's been fixed for ages https://github.com/zephyrproject-rtos/zephyr/pull/69748

butok commented 4 months ago

> > > in a sysbuild project this supplies the argument to every image that is flashed, which means if you flash 3 images, it will recover, flash image 1, recover, flash image 2, recover, flash image 3 - leaving the board in a non-working state.
> >
> > Two years have passed. This is a very annoying issue. Have you found a solution?
>
> It's been fixed for ages #69748

Great, thank you for pointing me to this feature. I now have a west flash erase issue with an MCUboot sysbuild application, which still performs unwanted repeated erasing and resetting. I will try to find a solution using the provided information.