qca / boardfarm

Main repo moved to:
https://github.com/mattsm/boardfarm/
BSD 3-Clause Clear License

What is the recovery step if flashing fails? #36

Open iniesC opened 8 years ago

iniesC commented 8 years ago

I see in the code that we retry 3 times before we throw an exception during flashing. What will be the state of the router when flashing fails?

wwahammy commented 8 years ago

As I understand it, the router OS could be in an unusable state; I could be wrong though. As long as they didn't try to flash uBoot, they should be able to flash another image and get back to a working state. If they flashed uBoot though and that's where the problem is, it's possible the system won't be flashable via bft. In that case, they'd have to use something far more ambitious like JTAG.

The quick answer is as long as uBoot is still working, they should be able to flash new images. As a recommendation, I'd avoid flashing uBoot unless you're absolutely sure you know what you're doing.

@mattsm and @mbanders: anything else you can add to this?

iniesC commented 8 years ago

I am flashing a meta image and want this to be part of the automation framework that runs after every build. It should verify the image being flashed and be able to recover to a golden image if flashing fails.

wwahammy commented 8 years ago

I think that's feasibly something we could look into adding as an option.
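A rough sketch of what such an option might look like, in Python. Nothing here is existing boardfarm API: `FlashError`, `flash_with_fallback`, and the `flash_fn` callable are hypothetical names used only for illustration, and the default of 3 retries simply mirrors the retry count mentioned in the original question. It also assumes the bootloader still works after a failed flash, since otherwise the golden image cannot be written either.

```python
class FlashError(Exception):
    """Hypothetical error raised by the flash callable when an attempt fails."""


def flash_with_fallback(flash_fn, image, golden_image, retries=3):
    """Try to flash `image` up to `retries` times, then fall back to `golden_image`.

    `flash_fn` is an assumed callable that flashes one image and raises
    FlashError on failure. Returns the image that was flashed successfully.
    If the golden image also fails, the FlashError propagates to the caller.
    """
    for _ in range(retries):
        try:
            flash_fn(image)
            return image
        except FlashError:
            # Failed attempt: loop around and try the same image again.
            continue
    # All retries failed. This only helps while the bootloader is intact.
    flash_fn(golden_image)
    return golden_image
```

A caller could compare the return value with the image it asked for to tell a successful flash apart from a recovery to the golden image.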

iniesC commented 8 years ago

But the question is: will a software solution work if the board is toasted by the newly flashed build? Or do we require a JTAG solution?

wwahammy commented 8 years ago

The question is really whether the image is flashing the bootloader, which is usually uBoot. bft flashes through uBoot. If uBoot isn't being flashed and overwritten, then a bad flash or bad image shouldn't make the board unflashable. After all, you can run bft again and can still get back into uBoot and flash again. JTAG should only be required if uBoot has been flashed and no longer works for some reason.
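The reasoning above can be written down as a small decision helper. This is only a sketch of the logic in this comment, not anything bft provides: `recovery_path` and its two boolean inputs are hypothetical, and the last branch covers a case this thread does not discuss.

```python
def recovery_path(bootloader_was_flashed, bootloader_responds):
    """Return a suggested recovery route after a failed or bad flash.

    bootloader_was_flashed: True if the image overwrote uBoot.
    bootloader_responds:    True if the board still reaches the uBoot prompt.
    """
    if bootloader_responds:
        # bft flashes through uBoot, so a working uBoot means we can reflash.
        return "reflash"
    if bootloader_was_flashed:
        # uBoot was overwritten and no longer works: software recovery is out.
        return "jtag"
    # uBoot untouched but unreachable: not covered here, needs manual checks.
    return "investigate"
```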

mattsm commented 8 years ago

The META you refer to usually contains u-boot and other pre-loader stuff, which could in theory brick your board. If you extract and flash just the APSS (kernel + rootfs in META terminology), you should never brick the board.

I would just flash the output from the OpenWrt/QSDK build and use that instead of a full META for testing.

iniesC commented 8 years ago

Our build infrastructure only builds one whole meta image as of now. So that will potentially brick the board. Any support for JTAG in BFT?

mattsm commented 8 years ago

We might have some group that has done it, but it's heavily Windows-based and was never fully supported. Plus, the JTAG devices are expensive for large-scale automation.

You can extract the kernel and rootfs from a META though.

-M

iniesC commented 8 years ago

What if we make changes to uboot in the new build? That will not be tested with the new kernel and rootfs when we extract and flash only certain parts of the image. I believe that will impact the credibility of the test being run after partial flash. That is the reason I am skeptical about this approach.

mattsm commented 8 years ago

I suggest testing u-boot changes on a single station manually until you have a big reason to do otherwise.

wwahammy commented 8 years ago

@iniesC people are more likely to change their kernel and rootfs than uBoot, and recovering from a bad uBoot flash is difficult, so we don't really have a mechanism to recover automatically. I'm sure we'd accept it as a feature request if someone wanted it badly and was willing to work through the difficulties though.