embassy-rs / embassy

Modern embedded framework, using Rust and async.
https://embassy.dev
Apache License 2.0
5.48k stars 765 forks source link

low-power: embassy-executor sets alarm with Instant::MAX #2381

Open chrenderle opened 10 months ago

chrenderle commented 10 months ago

I found a problem with the embassy-executor and the low-power feature in embassy-stm32. In my example the executor sets an alarm for Instant::MAX when I spawn my (only) task. https://github.com/embassy-rs/embassy/blob/430696802e31701f23d2c8e9d5fc8c796cf277e1/embassy-executor/src/raw/mod.rs#L421 The low-power executor then goes to sleep with the maximum rtc wakeup time, which is 32s because of the Instant::MAX alarm set. After this it checks the alarm again which isn't Instant::MAX anymore. From then on it behaves correctly. But because of this initial Instant::MAX alarm the MCU always sleeps 32s before executing the program correctly. I assume this is because it polls the raw executor once again where my task calls Timer::after, which then sets the alarm properly.

The question now is what is the best way to fix this. You could fix this by only calling set_alarm in the executor when the timestamp is not Instant::MAX. There might also be some other fix in the executor. But in any case I don't want to make any changes there since I don't know all requirements and don't want to break anything.

The other way to fix this would be to have the time driver discard any set_alarm calls which pass Instant::MAX as a timestamp.

In any case it would be good if someone more experienced in the executor and the time driver of stm32 could have a look into this.

I uploaded a fix to my embassy fork (https://github.com/chrenderle/embassy/commit/7d6b7eab566e0e043648843ef7974123320e4cc9). It is only a demonstration how you could fix it in the executor. I tested it on my NUCLEO-L552ZE-Q. But this does currently not compile with STM32L5, because https://github.com/embassy-rs/stm32-data/pull/325 is not yet merged. I worked around this by building stm32-metapac locally.

My fork also includes another minor change (https://github.com/chrenderle/embassy/commit/96edbcc2665df26ed1c65765d41c237e5c5aca96). Without it the low-power executor does not always suspend execution properly. This is later necessary for STOP suspension. Without this change the error I'm describing is also not active, since the low-power executor doesn't suspend properly.

Dirbaio commented 10 months ago

Setting the alarm to u64::MAX is expected, it's essentially signaling "no tasks are currently waiting for a timer, just disable the alarm" to the driver. (the RTC driver could even handle it by going to sleep with the WUT completely disabled).

What should be happening is:

My fork also includes another minor change (https://github.com/chrenderle/embassy/commit/96edbcc2665df26ed1c65765d41c237e5c5aca96). Without it the low-power executor does not always suspend execution properly. This is later necessary for STOP suspension. Without this change the error I'm describing is also not active, since the low-power executor doesn't suspend properly.

I think this might be the what has introduced the issue. As detailed above, the design relies on the event flag to prevent suspending when there's more work to do. It's normal if it sometimes doesn't suspend and does a 2nd loop, it happens if there's more work to do.

chrenderle commented 10 months ago

Okay thanks, I understand it now. I thought poll would only return when all work was done. But when pending is called (by spawning a task or an interrupt) it needs to be executed again. This is signaled by sev inside pending.

Also I chased some very weird behaviours which confused me and I went in the wrong direction. That was basically the motivation for these patches.

For example when changing LPMS inside the PWR_CR1 register. You need to do this to specify which Low-power mode the MCU should use when WFE is called (and some more bits are set). I initially wrote to this register with crate::pac::PWR.cr1().write(|w| w.set_lpms(...)); This caused some weird behaviour in the WUTWF flag of the RTC and in an endless loop waiting for it to become false (which it didn't). But somehow when writing to it with modify instead of write it works as expected.

I will create a PR for the STOP support in the next days. I can confirm it works on my Nucleo-L552, but I need to test it some more.