Open SundaresanN opened 6 months ago
Hey, this is pretty interesting!
I wasn't aware that this exploit also applies to later generations of the STM32F series. I will buy a couple of F2, F3 and F4 devboards and test it out. I've only skimmed through the repo, but by the looks of it the target exploit firmware code appears to be virtually identical to that of the F1, leading me to believe the current attack board firmware could work with little to no modification.
As a general direction, I recommend starting out by running the current dump.py script and seeing how far you can already get with that.
I've attempted to execute the exploit on an STM32F401 Blackpill with no success. The exploit does get to stage 1; however, after the reset is applied, the F4 still refuses flash access. I suspect that the F4's readout protection triggered by SRAM execution can no longer simply be removed by a soft reset. Instead, it likely also requires a power cycle (which unfortunately would reset the FPB). With RDP=0, I did get it to work, indicating that my code is likely not the culprit.
I'm also not sure how the person in the mentioned repo achieved executing the vulnerability on their F4. The SRAM execution entry point of the F1 is still being used in their attack. That's rather odd, since the F4's SRAM execution no longer uses these quirky SRAM entry points. My F4 also throws a hard fault interrupt after the power glitch has been applied, which serves as my entry point for stage 1. Their code does not seem to take this into account at all.
Perhaps my F4 is a newer revision, or maybe it's a clone?
I'm going to try removing all power caps from my Blackpill board as a last desperate attempt at getting this to work. If that doesn't do it, I'll try it on an F4 Discovery board. I'll also try to see if this works for an F2 I have lying around.
Any progress related to this can be found under the f4-testing branch.
Hello, I recently discovered this project and conducted some further in-depth research on it. Here's what I tested on my stm32f412 development board:
This indicates that the stm32f4 has some level of defense against this vulnerability. Interestingly, if step 2 had not entered SRAM startup mode but system bootloader startup mode instead, then resetting the CPU via rst would allow normal execution from user code. It seems that the stm32f4 has specifically guarded against issues with SRAM startup shellcode. The same exploit does not apply to the stm32f4.
However, approaching from the bootloader might be a direction to consider. I haven't conducted further tests for now, but from the analysis of the bootloader code (0x1fff0000 - 0x1fff5fff), the bootloader is filled with numerous checks and defenses, making it extremely difficult to construct arbitrary code execution within the bootloader.
I see, that's interesting.
So what you're saying is if we find a way to patch the FPB from within the bootloader, the reset would be able to jump into the shell code!
I know there have been a couple of projects that reverse engineered the bootloader code for vulnerabilities. Namely, for glitching attacks such as https://blog.kraken.com/product/security/kraken-identifies-critical-flaw-in-trezor-hardware-wallets, which glitch the flash read command of the USART bootloader to bypass the RDP check. Their hardware setup to pull this off uses a pretty fancy FPGA, and even with that, the vulnerability needs to be brute forced for a pretty long time. I've considered attempting such glitching attacks with the Pi Pico, but whether it is even possible is yet to be determined.
I'll also try to take a closer look at the bootloader ROM once I have the time, and see if I can spot anything interesting.
Feel free to keep us updated with your progress!
Yes, that's exactly what I'm getting at - aside from booting from SRAM, there might be another path to success. At least, that's how the hardware seems to behave.
I'm actually trying to see if it's possible to cause a fault in the bootloader's RDP (Read-out Protection) check through voltage glitching.
In fact, I have successfully induced a fault and bypassed the RDP check for a single command. However, because the bootloader checks the RDP with every command execution and has strict memory address restrictions (only reading/writing/executing from flash 0x08000000 to flash end, SRAM 0x20003000 - 0x20040000, the bootloader's own code range 0x1fff0000-0x1fffffff, and option bytes), constructing a reliable and stable Arbitrary Code Execution (ACE) is challenging. From my observations, it would take at least two successful glitch injections to bypass the RDP check and construct an ACE.
Worse, when RDP1 is enabled, the bootloader has a few lines of code that clear everything from 0x20003000 - 0x20040000 to 0 on every reset. In other words, even with glitch injection, we have no chance of persisting in memory - any attempt must succeed in one go, without allowing the bootloader code to be executed again.
I'm trying my best to pursue this approach. As crazy as it sounds, on my Xilinx Zynq-7000 FPGA platform implementation, inducing STM32 faults through precise timing has about a 2% - 3% success rate, which means it might be possible to achieve a pwn within a reasonable time frame. Of course, we might still end up with nothing - who knows if ST will introduce any other obstacles along the way.
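As a rough back-of-the-envelope check of what a 2% - 3% per-glitch rate means when two successful injections are needed: the sketch below assumes each injection succeeds independently with the same probability, which is an assumption (real glitch outcomes may well be correlated). The function name `combo_stats` is purely illustrative.

```python
import math

def combo_stats(p_single: float, glitches_needed: int = 2, confidence: float = 0.95):
    """Return (combo probability, mean attempts, attempts needed for `confidence`).

    Assumes independent injections with identical success probability.
    """
    p_combo = p_single ** glitches_needed
    mean_attempts = 1.0 / p_combo  # mean of a geometric distribution
    # Smallest n such that 1 - (1 - p_combo)^n >= confidence
    attempts = math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_combo))
    return p_combo, mean_attempts, attempts

for p in (0.02, 0.03):
    p_combo, mean_attempts, attempts = combo_stats(p)
    print(f"p={p:.2f}: combo={p_combo:.4%}, "
          f"mean attempts={mean_attempts:.0f}, 95% within {attempts} attempts")
```

Under that independence assumption, a two-glitch combo lands on average once every 1,100 to 2,500 resets or so, which is slow but not hopeless if each attempt only takes a fraction of a second.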
Regarding voltage fault injection, from what I know: It's not strictly necessary to use an FPGA, even though it is a very suitable solution.
Picofly managed to carry out an attack on the Switch's Tegra X1 chip using just a Raspberry Pi Pico; ESP32_nRF52_SWD even went as far as to pwn the nRF52's SWD disable protection using only an ESP32.
While FPGAs offer higher trigger precision and jitter reliability, these conditions are often "sufficient but not necessary." In fact, a fast enough MCU can achieve the same effect.
Ultimately, my attempt was unsuccessful.
Here is the story that unfolded over these past few weeks:
As mentioned before, I discovered that the STM32F412 I'm working with completely locks access to the entire flash when started in RAM mode, a measure as drastic as when a debugger is attached — it's impossible to unlock flash access without performing a power cycle.
However, I found that starting in system bootloader mode does not enforce a complete lock on the flash.
According to my analysis of the bootloader,
it is indeed possible to construct arbitrary code execution, but this is highly challenging.
Firstly, the bootloader contains a go address command that allows jumping to any address.
Its restriction is that this command can only jump to 0x08000000 (flash) or between 0x20003000-0x20040000.
Any other address will be rejected.
Additionally, the go command isn't a true jump & go: it loads the initial stack pointer (estack) from address + 0x0 and jumps to the reset handler pointer stored at address + 0x4.
To make the go address usable, we must also find a way to place a valid attack payload at 0x20003000.
This isn't difficult, as the bootloader also has a write memory command that can write data to 0x20003000 in RAM.
(Writing to 0x20000000-0x20002FFF is not allowed because this is the memory used by bootloader, and ST has specific checks for this.)
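To illustrate the layout the go command expects, here is a minimal sketch that prepends the two words (stack pointer at +0x0, entry pointer at +0x4) to a payload. The helper names, the stack-top value `0x20040000`, treating the upper bound of the RAM window as exclusive, and setting the Thumb bit on the entry address are assumptions for illustration; none of them are taken from the bootloader disassembly.

```python
import struct

FLASH_BASE = 0x08000000
GO_RAM_START = 0x20003000  # lower bound quoted in the thread
GO_RAM_END = 0x20040000    # upper bound quoted in the thread (treated as exclusive here)

def go_target_allowed(addr: int) -> bool:
    """Mirror the go-command address restriction as described in the thread."""
    return addr == FLASH_BASE or GO_RAM_START <= addr < GO_RAM_END

def build_go_image(load_addr: int, stack_top: int, code: bytes) -> bytes:
    """Prefix `code` with the two words read by go: SP at +0x0, entry at +0x4."""
    if not go_target_allowed(load_addr):
        raise ValueError(f"go would reject 0x{load_addr:08X}")
    # Code starts right after the two header words; bit 0 marks Thumb state.
    entry = (load_addr + 8) | 1
    return struct.pack("<II", stack_top, entry) + code

# Hypothetical payload: a single Thumb `b .` instruction (0xE7FE), i.e. an endless loop.
image = build_go_image(0x20003000, 0x20040000, b"\xfe\xe7")
print(image.hex())
```

The 8-byte header comes out of the same 256-byte budget as the code itself if everything has to fit into a single write memory command.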
Notably, when the option bytes are set to RDP level 1,
a piece of code in the bootloader's initialization section zeroes out the entire range from 0x20003000 to 0x20040000. Well done, ST.
Therefore, to construct this attack, we must land a combo of two successful glitches after a single reset, or it's a no-go.
The write memory command can write a maximum of 256 bytes at a time.
In this scenario, we definitely don't want to glitch the write memory multiple times,
as it would significantly lower the success rate of the combo.
Thus, the entire attack payload must fit within 256 bytes. Some serious assembly wizardry is necessary.
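For reference, a sketch of how a single write memory transfer could be framed, assuming the USART bootloader protocol as documented in ST's AN3155 (command byte plus its complement, big-endian address with XOR checksum, then length-minus-one, data and XOR checksum). The thread does not say which bootloader interface was used, so the choice of USART framing is an assumption, and the function names are illustrative. The host must wait for an ACK (0x79) from the device between the three frames; that serial handling is left out here.

```python
from functools import reduce

CMD_WRITE_MEMORY = 0x31
MAX_WRITE = 256  # one transfer carries at most 256 data bytes

def xor_bytes(data: bytes) -> int:
    """XOR of all bytes, used as the checksum by the USART bootloader protocol."""
    return reduce(lambda a, b: a ^ b, data, 0)

def write_memory_frames(addr: int, data: bytes) -> list:
    """Return the three frames of one write memory transfer (ACK waits omitted)."""
    if not 1 <= len(data) <= MAX_WRITE:
        raise ValueError("payload must be 1..256 bytes to fit a single command")
    cmd = bytes([CMD_WRITE_MEMORY, CMD_WRITE_MEMORY ^ 0xFF])
    addr_bytes = addr.to_bytes(4, "big")
    addr_frame = addr_bytes + bytes([xor_bytes(addr_bytes)])
    n = len(data) - 1  # the length field holds (byte count - 1)
    body = bytes([n]) + data
    data_frame = body + bytes([xor_bytes(body)])
    return [cmd, addr_frame, data_frame]

frames = write_memory_frames(0x20003000, bytes(256))
for frame in frames:
    print(len(frame), frame[:8].hex())
```

Because the length field is a single byte holding the count minus one, 256 bytes is the hard ceiling per command, which is where the single-write constraint comes from.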
Hence, we can immediately come up with the following attack plan:
Indeed, this plan seemed effective, and in fact, I was successful in executing the combo and running the payload in RAM. However, this was futile—I was met with a massive surprise; the data in the flash was directly destroyed.
The data returned by the payload showed that only 0x08000000 - 0x0800000f was correct;
all other addresses just repeated the contents here.
Furthermore, after powering on the STM32F412 normally again, it was no longer able to run the program in the flash.
I have absolutely no idea why this happened.
I tried many times, but the result was always the same.
After about 50 to 300 consecutive glitches, the flash data would be found corrupted.
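The symptom described above (only the first 16 bytes are correct and everything after them repeats those bytes) is easy to detect automatically in a dump, so a glitching run can be stopped as soon as it appears. A minimal sketch, with a hypothetical function name and synthetic data standing in for a real dump:

```python
def repeats_first_block(dump: bytes, block: int = 16) -> bool:
    """True if every later chunk of `block` bytes equals the first chunk."""
    if len(dump) <= block:
        return False  # nothing to compare against
    head = dump[:block]
    return all(
        dump[off:off + block] == head[:len(dump) - off]
        for off in range(block, len(dump), block)
    )

# Synthetic examples, not real flash contents.
healthy = bytes(range(256))
corrupted = healthy[:16] * 16

print("healthy repeats:  ", repeats_first_block(healthy))
print("corrupted repeats:", repeats_first_block(corrupted))
```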
So, I conducted the following test:
This means that the flash of the STM32F412 was indeed destroyed,
even corrupting the RDP option byte (now neither 0xAA nor 0xCC), which makes the chip believe that it is in RDP level 1.
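The reason a corrupted option byte ends up as level 1 is that level 1 is the catch-all: 0xAA means level 0, 0xCC means level 2, and every other value is treated as level 1. A small sketch of that decoding (the function name and the sample values are illustrative):

```python
def rdp_level(rdp_byte: int) -> int:
    """Decode the STM32 RDP option byte: 0xAA -> 0, 0xCC -> 2, anything else -> 1."""
    if rdp_byte == 0xAA:
        return 0
    if rdp_byte == 0xCC:
        return 2
    return 1

# 0x55 and 0xFF stand in for arbitrary corrupted values.
for value in (0xAA, 0xCC, 0x55, 0xFF):
    print(f"RDP byte 0x{value:02X} -> level {rdp_level(value)}")
```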
To ensure my reasoning was correct, I also conducted the following test:
This test result proves that my idea was correct.
However, regarding the destruction of the flash, I currently have no solution.
Perhaps the glitching process damaged the logic of the flash controller? I'm really at a loss...
If the flash controller was damaged, perhaps using EMFI (Electromagnetic Fault Injection) or laser fault injection could circumvent this issue.
However, this would no longer be feasible with just an RP2040; the cost would significantly increase, perhaps even to an unacceptable level.
Also, I don't know if other STM32F4 series devices would have the same issue; I haven't tested them yet.
These are all the results I have so far, for your reference.
This is pretty amazing and tragic at the same time but may just provide a foundation for a successful exploit. Thanks for the update! I may join the effort once I can find some spare time :)
Nice work! I'm interested in changing the script so it can dump the F4. Please check this implementation for the F4: https://github.com/lolwheel/stm32f4-rdp-workaround.git. It does the same as your work but uses an ESP8266 instead of a Pico.
Could you point me in the general direction of how to add the same functionality to your project? Thanks!