probe-rs / probe-rs

A debugging toolset and library for debugging embedded ARM and RISC-V targets on a separate host
https://probe.rs
Apache License 2.0
1.71k stars 355 forks source link

Debugging and flashing are unstable if using wfi on some STM32 chips #350

Open mvirkkunen opened 4 years ago

mvirkkunen commented 4 years ago

Describe the bug Memory access and flashing have issues on some STM32 chips if the wfi instruction is used. This may apply to other debugging operations too. wfe likely has the same issue.

To Reproduce Steps to reproduce the behavior:

  1. Flash a program that uses wfi onto your chip (any RTIC program with the default idle task will do)
  2. Try to read some memory with probe-rs
  3. Incorrect values (all zeros, whatnot) are returned

Expected behavior Correct values should be returned.

Additional context STM32 chips apparently have super special platform-specific registers that need to be manipulated to make wfi power slightly fewer things down and enable debugging in sleep mode. To fix this, some STM32 specific code will be needed.

Workaround Until probe-rs adds support for this, wfi must be avoided if using something that uses the memory interface (e.g. RTT). In RTIC this can be done by defining your own idle task without wfi. This makes your application much more power hungry though, and shouldn't be used in production.

eb08a167 commented 4 years ago

I can confirm this. Running cargo flash --release --chip STM32F446RETx with this line commented out works fine. However, when uncommented, then cargo flash bricks my nucleo board. st-info --probe outputs

serial:     303636364646333233383331353234313537313834353437
hla-serial: "\x30\x36\x36\x36\x46\x46\x33\x32\x33\x38\x33\x31\x35\x32\x34\x31\x35\x37\x31\x38\x34\x35\x34\x37"
flash:      0 (pagesize: 0)
sram:       0
chipid:     0x0004

I'm no more able to interact with the board. stlink utils complain about unknown chipid and cargo flash fails with SwdDpWait status. The way to recover turned out to be shorting BOOT0 and VDD pins (this restores the chipid) and running st-flash erase.

Also, wfe causes the same issue for me.

David-OConnor commented 4 years ago

I can confirm as well. probe-run will go into SwdApWait once, then SwdDpWait every subsequent run if it hits a WFI. Had to switch back to openOcd.

almusil commented 4 years ago

Temporary workaround at least on L0 device was to enable it from software (which is possible). E.g.

unsafe { (*RCC::ptr()).apb2enr.modify(|_, w| w.dbgen().enabled()) };
unsafe { (*RCC::ptr()).apb2smenr.modify(|_, w| w.dbgsmen().enabled()) };
unsafe {
    (*DBG::ptr()).cr.modify(|_, w| {
         w.dbg_sleep().enabled();
         w.dbg_standby().enabled();
         w.dbg_stop().enabled()
     })
};

It works to some extent, but the channel freezes after couple of interrupts so it still probably missing something.

David-OConnor commented 4 years ago

Does anyone know how to recover the device once this happens? I accidentally added WFE back without switching to openOcd, and now the device can't be flashed.

Edit: Solved, unsatisfactorily. A certain combination of power cycles, pulling boot0 high, and plugging/unplugging the st-link did the job. Non-reproducable.

cavokz commented 3 years ago

Do not forget that st-link provides an usb storage, copy your firmware binary there to have it programmed.

David-OConnor commented 3 years ago

Workaround, from @adamgreig:

In your driver code, clear the DBG_SLEEP, DBG_STANDBY, and DBG_STOP bits of the DBGMCU_CR register. Probe run will then work. For example, if using a hal crate, add this line after setting up the STM32 peripherals, and before hitting a WFI/WFE:

    dp.DBGMCU.cr.modify(|_, w| w.dbg_sleep().clear_bit());
    dp.DBGMCU.cr.modify(|_, w| w.dbg_stop().clear_bit());
    dp.DBGMCU.cr.modify(|_, w| w.dbg_standby().clear_bit());
adamgreig commented 3 years ago

I think those bits need to be set, not cleared, by the way - clearing them is the (POR) reset behaviour, and leaves debug disabled in sleep.

David-OConnor commented 3 years ago

Thanks for the correction!

Here's one way to get the MCU to flash again after experiencing this bug: (@adamgreig pointed me in the right direction)

Download the STM32 ST-LINK utility. (official). Run it. Menu -> Target -> Settings. Under Mode, select Connect Under Reset. Ok. Target -> Connect. Hit the button that looks like a red and white candy bar, with tooltip 'Full Chip erase'. Ok.

Disasm commented 3 years ago

For STM32F411 this workaround works:

let dp = Peripherals::take().unwrap();

dp.DBGMCU.cr.modify(|_, w| {
    w.dbg_sleep().set_bit();
    w.dbg_standby().set_bit();
    w.dbg_stop().set_bit()
});
dp.RCC.ahb1enr.modify(|_, w| w.dma1en().enabled());

This DMA1EN is required to keep at least one AHB bus master active, and prevent reading SRAM as zeros during sleep.

Yatekii commented 3 years ago

Ohhh thanks for figuring this out! I wasn't able to make it work myself!!

burrbull commented 3 years ago

https://www.st.com/content/ccc/resource/technical/document/errata_sheet/c3/6b/f8/32/fc/01/48/6e/DM00155929.pdf/files/DM00155929.pdf/jcr:content/translations/en.DM00155929.pdf#%5B%7B%22num%22%3A37%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C67%2C724%2Cnull%5D Found by @Disasm

sjoerdsimons commented 3 years ago

For STM32F411 this workaround works: ... This DMA1EN is required to keep at least one AHB bus master active, and prevent reading SRAM as zeros during sleep.

This also work-around also seems to work on bluepill boards (stm32f103) ; using

dp.RCC.ahbenr.modify (|_, w|  w.dma1en().enabled());                        
eupn commented 3 years ago

For google search results. If anyone got the following error message while using RTT with probe-rs or probe-run:

RTT error: Control block corrupted: write pointer is 4294967273 while buffer size is 4096 for "up" channel 0 (defmt)

Then the reason is the same (there's nothing wrong with RTT) and the workaround for this is kindly provided by @Disasm here. Don't forget to update the workaround code for your chip, it may differ a bit.

sjmackenzie commented 3 years ago

This worked for me:

You should see output something along the lines of:

...
Info : device id = 0x20036410
Info : flash size = 128kbytes
stm32x mass erase complete

shutdown command invoked

Thereafter include these lines

    ...
    let dp = stm32f1xx_hal::pac::Peripherals::take().unwrap();
    dp.DBGMCU.cr.modify(|_, w| {
        w.dbg_sleep().set_bit();
        w.dbg_standby().set_bit();
        w.dbg_stop().set_bit()
    });
    dp.RCC.ahbenr.modify(|_, w| w.dma1en().enabled());
   ...

and you can go back to using probe-run again.

EDIT:

seems even with the rust code it'll eventually become unusable again. My solution is to revert back to openocd and stop using probe-run

rmsc commented 3 years ago

Although @Disasm 's workaround works for me, I don't like setting the bits in firmware. It seems too easy to leave them on by mistake and silently ruining power consumption.

Would it be feasible/desirable to allow probe-rs to do the heavy-lifting?

I can see probe-rs is already configuring the TRACE pins in DBGMCU_CR, but otherwise knows nothing about RTT. On the other hand, cargo-embed knows nothing about the specific MCU details.

I think the following might work:

burrbull commented 3 years ago
  • perhaps also include device-specific workarounds (such as the DMA1EN above)

And break DMA. Very "good" idea.

rmsc commented 3 years ago

And break DMA. Very "good" idea.

Yeah, I can see how doing that without warning would be stupid.

Yatekii commented 3 years ago

I think in general randomly toggling flags is bad. But we have base sequences merged now, which allow us to do exactly this. The user would need to set a flag tho to make it work or apps that use RTT could automatically do it as you suggested. This has been planned for a long time but needs to be done properly.

Disasm commented 3 years ago

Yeah, I can see how doing that without warning would be stupid.

@rmsc could you elaborate? I don't see anything bad in enabling DMA clock.

rmsc commented 3 years ago

I think in general randomly toggling flags is bad. But we have base sequences merged now, which allow us to do exactly this. The user would need to set a flag tho to make it work or apps that use RTT could automatically do it as you suggested. This has been planned for a long time but needs to be done properly.

I agree, but in the specific case of the DBG bits it's clearly stated in the reference manual that these bits are meant to be toggled by the debugger. Only if that's not possible/desirable should it be done in software.

Yeah, I can see how doing that without warning would be stupid.

@rmsc could you elaborate? I don't see anything bad in enabling DMA clock.

That bit is meant to be managed by application code, and application code usually doesn't expect bits to change randomly. I may be wrong, but I don't think enabling it would generally break anything. But it could, in some specific cases. On the other hand, if application code then disables the clock for any reason (say, cleaning up after using DMA before sleep), then RTT would definitely break on the next wfi() or wfe().

rmsc commented 3 years ago

After reading the errata carefully and doing a few tests, I don't think this is being caused by a hardware bug. This is probably the documentation being imprecise.

From what I understood, enabling the DBG bits just keeps the HCLK running and nothing more. According to the datasheet for my device, that is specifically so that a debugger connection can be kept while in a low power mode. So while accessing the MCU core registers works fine, accessing anything else that's not being clocked just returns bogus data (0x400 in my case).

The datasheet for my device says that the SRAM clock may be gated on or off in Sleep mode, but doesn't specify how a user can control it. I suspect that unless some internal device (like the DMA controller) requires SRAM access in sleep mode, it is always gated off.

If we enable an AHB bus master while sleeping, I suspect we are also forcing the SRAM clock to be gated on as a side-effect. I don't think this is intended operation.

If instead we enter Debug Mode while sleeping (core halted) then we get full access to the SRAM. I suspect this is because the AHB-AP gates the SRAM clock on. I also believe this is what we should do for RTT, as it doesn't mess with application code state.

As an example, this works on my device:

match core.status() {
    Ok(CoreStatus::Sleeping) => (),
    _ => panic!("not sleeping"),
}

let _ = core.halt(std::time::Duration::from_millis(10));
let word = core.read_word_32(0x20000010).unwrap();
let _ = core.run();

// inverted SEGG, from "SEGGER RTT"
assert_eq!(word, 0x47474553);

This does require DBG_SLEEP to be set before the device enters Sleep mode, as stated in the datasheet.

EDIT: simplified the example

rmsc commented 3 years ago

I've hacked together a proof of concept here. Works reliably, but has the serious drawback of taking the device out of sleep on every RTT poll.

Perhaps a better approach would be to exploit the fact that RTT buffers won't change while sleeping. So somehow automatically halting just before going to sleep to exchange RTT data, and leaving the device alone when it's sleeping. Not sure how to do the first part, but this later part is pretty easy.

jamesmunns commented 3 years ago

Mentioning in case people find this issue elsewhere (before it is totally fixed):

We were able to "rescue" an STM32F303 that had the offending wfi idle using cargo-flash directly, using something like this (after changing the wfi to a nop:

cargo flash --chip $YOUR_CHIP --connect-under-reset --elf ./target/thumbv.../debug/use-nop-idle
David-OConnor commented 3 years ago

To add to that: I've always been able to resolve this issue (stm32 L4, WB, F3, and H7) by resetting/powering up with boot0 high, then erasing using Stm32cube Programmer.

mvirkkunen commented 3 years ago

Using Boot0 to get to a safe state should work with any debugger and software, since it doesn't even run the application code with a wfi in it. The built-in STMicro bootloaders don't seem to use wfi. It can even be used to recover a chip where you (accidentally or not) disabled SWD in software.

noppej commented 2 years ago

I think in general randomly toggling flags is bad. But we have base sequences merged now, which allow us to do exactly this. The user would need to set a flag tho to make it work or apps that use RTT could automatically do it as you suggested. This has been planned for a long time but needs to be done properly.

@Yatekii can you please give some guidance on how we could use sequences to fix this?

Yatekii commented 2 years ago

So the idea is that there is a sequence hook which enables debugging utilities on a chip. For the ST CMSIS-Packs you can see in what hook they do it:

<sequences>
...
        <sequence name="DebugCoreStart">
          <block>
            // Replication of Standard Functionality
            Write32(0xE000EDF0, 0xA05F0001);                                        // Enable Core Debug via DHCSR

            // Device Specific Debug Setup
            Write32(0xE0042004, DbgMCU_CR);                                         // DBGMCU_CR: Configure MCU Debug
            Write32(0xE0042008, DbgMCU_APB1_Fz);                                    // DBGMCU_APB1_FZ: Configure APB1 Peripheral Freeze Behavior
            Write32(0xE004200C, DbgMCU_APB2_Fz);                                    // DBGMCU_APB1_FZ: Configure APB2 Peripheral Freeze Behavior
          </block>
        </sequence>
...
</sequences>

We should have this hook on the sequence trait already. If we do not we have to introduce it and use it at the right place in probe-rs. The spec here: https://www.keil.com/pack/doc/CMSIS/Pack/html/debug_description.html#pdsc_SequenceNameEnum_pg should tell us what the right place is.

noppej commented 2 years ago

@Yatekii thank you, this is very helpful. I will mess around with it and ask if I have more questions.

noppej commented 2 years ago

@mvirkkunen With the #1118 implementation, I am able to debug quite successfully when my H7 is in wfi. Can you please validate your use case again?

noppej commented 2 years ago

Closing this with the expectation that this is now fixed. Feel free to re-open if your use-case still has challenges.

Dirbaio commented 2 years ago

AIUI #1118 only fixes it for STM32H7, so I think this is still an issue for other STM32 chips.

noppej commented 2 years ago

Good catch. For future readers, it is probably worth pointing out that the fix requires a custom sequence, and will have to be expanded to specific chip lines using the #1118 implementation as a starting point.

gh2o commented 1 year ago

For STM32F411 this workaround works: ... This DMA1EN is required to keep at least one AHB bus master active, and prevent reading SRAM as zeros during sleep.

For Google visibility, this also works for the STM32G4 (more specifically, STM32G474). Thanks @Disasm !

cbiffle commented 1 year ago

Hi folks!

Just wanted to warn you and any readers that enabling those bits in the DBGMCU_CR appears to trigger misbehavior on certain STM32 cores when WFI is used. Placing an ISB immediately after the WFI fixes this; there's no obvious way to fix it from the probe (other than not setting those bits, which as you've noticed breaks debug).

So, make sure users have a way of opting out of this -- in openocd there are the overrideable reset-init/examine events, for instance.

Analysis: https://cliffle.com/blog/stm32-wfi-bug/

Yatekii commented 1 year ago

Maybe we need to reconsider whether automatically setting this in sequences i the right play then ...

adamgreig commented 1 year ago

@cbiffle fun catch! any idea if an isb or dsb before the wfi fixes it too? ARM have this nice guide to when isb/dsb "should" be required in general, though as it only talks about before wfi I don't think it interacts with whatever perhaps-ST-specific bug you've found.

gh2o commented 1 year ago

FWIW FreeRTOS enters sleep by executing a dsb + wfi + isb sequence.

noppej commented 1 year ago

Maybe we need to reconsider whether automatically setting this in sequences i the right play then ...

An alternative is to keep the sequences in play (because they work for all the chips where the wfi bug is not present), and then have different chip/family specific sequences for those chips that are affected by the issue. ???

Yatekii commented 1 year ago

Well, we only do this for broken chips in the sequence :) This is an ST specific thing and we only need to enable this for those chips.

edit: I am not saying disabling all sequences with similar features. I just meant for the ST line :)

lure23 commented 2 months ago

The issue only mentions STM32, but likely also ESP32 chips are affected. As shown by the linked issue by @Yatekii .