zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.44k stars 6.4k forks source link

Some STM32 series require CONFIG_PM_DEVICE if CONFIG_PM=y #38452

Closed gmarull closed 2 years ago

gmarull commented 2 years ago

Describe the bug

Some STM32 series (g0, l0, l4, l5, wb, wl) require CONFIG_PM_DEVICE if CONFIG_PM=y. However, one should be able to have system PM enabled (CONFIG_PM=y) without the need of CONFIG_PM_DEVICE. It turns out that the device PM hooks are used in some cases, e.g. UART, to block system suspension until the peripheral is done. For more context, see https://github.com/zephyrproject-rtos/zephyr/pull/38383#discussion_r705402367. https://github.com/zephyrproject-rtos/zephyr/pull/38432 proposes an alternative solution that should allow disabling CONFIG_PM_DEVICE even when CONFIG_PM=y.

To Reproduce

Look at:

soc/arm/st_stm32/stm32g0/Kconfig.defconfig.series
soc/arm/st_stm32/stm32l0/Kconfig.defconfig.series
soc/arm/st_stm32/stm32l4/Kconfig.defconfig.series
soc/arm/st_stm32/stm32l5/Kconfig.defconfig.series
soc/arm/st_stm32/stm32wb/Kconfig.defconfig.series
soc/arm/st_stm32/stm32wl/Kconfig.defconfig.series

Expected behavior

CONFIG_PM_DEVICE should not be a requirement to enable CONFIG_PM.

Impact

Logs and console output

N/A

Environment (please complete the following information):

Additional context

https://github.com/zephyrproject-rtos/zephyr/issues/38382 https://github.com/zephyrproject-rtos/zephyr/pull/38383 https://github.com/zephyrproject-rtos/zephyr/pull/38432

erwango commented 2 years ago

Note that no STM32 drivers implement actual device PM as of today.

Well, gpio driver does... That's PM_DEVICE_RUNTIME actually. But I agree on the point. We should be able to enable only PM w/o PM_DEVICE

gmarull commented 2 years ago

Note that no STM32 drivers implement actual device PM as of today.

~Well, gpio driver does...~ That's PM_DEVICE_RUNTIME actually. But I agree on the point. We should be able to enable only PM w/o PM_DEVICE

You're right, I missed that one.

gmarull commented 2 years ago

Note that no STM32 drivers implement actual device PM as of today.

~Well, gpio driver does...~ That's PM_DEVICE_RUNTIME actually. But I agree on the point. We should be able to enable only PM w/o PM_DEVICE

Actually, the code will not do PM when doing system PM (because low power is used for the PM state STM32 handles) but will do PM when using the runtime API. I think this illustrates quite well the problem of having LOW_POWER and SUSPEND that in theory should do different things, but the way they are handled forces drivers to start using the "fall through pattern" just in case. @ceolin

ceolin commented 2 years ago

It is clear a problem these drivers need to enable CONFIG_DEVICE_PM to workaround the system PM. These options can be selected independently if needed.

Regarding the low power, I don't see a problem it as a major problem, from the subsystem perspective it is a hint to the device to save energy. It is up to the device decide what to do with it. That is something that will always be device specific, even if we had only SUSPEND devices would still may interpret it in different manners (they may have multiple saving states).

The subsystem uses LOW_POWER when the sleep state is shallow and it needs to wake up fast, drivers can use it to set their equivalent state collaborating with the overall system.

This discussion is not particular to this issue, but I see pros/cons in have multiple device states. The policy manager for devices can solve this problem. The proposed solution (https://github.com/zephyrproject-rtos/zephyr/pull/37823) is just replicating the current behavior, but we can device power states information per device in DT and make the policy consume it.

gmarull commented 2 years ago

The problem of having both LOW_POWER and SUSPEND and treating them independently can cause problems. For example, a device can decide to just implement LOW_POWER, SUSPEND or both. Let's take a system with N devices, each implementing a random combination of the previous list. Now the system goes to a low power state where SUSPEND is used to put devices into a low power state. There will be a subset of devices, those implementing LOW_POWER only, that will just be left ON even though they can be put into a state that would save some power. The same would happen if LOW_POWER is used. The policy proposal in #37823 doesn't help to solve such cases. If there are multiple low power states, as there are now, the PM subsystem must be smart enough to take advantage of all cases. We can't move this problem to devices forcing to handle all cases, since then there's virtually no difference between LOW_POWER and SUSPEND states.

ceolin commented 2 years ago

The problem of having both LOW_POWER and SUSPEND and treating them independently can cause problems. For example, a device can decide to just implement LOW_POWER, SUSPEND or both. Let's take a system with N devices, each implementing a random combination of the previous list. Now the system goes to a low power state where SUSPEND is used to put devices into a low power state. There will be a subset of devices, those implementing LOW_POWER only, that will just be left ON even though they can be put into a state that would save some power. The same would happen if LOW_POWER is used.

Note that the system power states documentation says what is more or less expected from a device when the system goes to state.

    /**
     * @brief Runtime idle state
     *
     * Runtime idle is a system sleep state in which all of the cores
     * enter deepest possible idle state and wait for interrupts, no
     * requirements for the devices, leaving them at the states where
     * they are.
     *
     * @note This state is correlated with ACPI S0ix state
     */
    PM_STATE_RUNTIME_IDLE,
    /**
     * @brief Suspend to idle state
     *
     * The system goes through a normal platform suspend where it puts
     * all of the cores in deepest possible idle state and *may* puts peripherals
     * into low-power states. No operating state is lost (ie. the cpu core
     * does not lose execution context), so the system can go back to where
     * it left off easily enough.
     *
     * @note This state is correlated with ACPI S1 state
     */
    PM_STATE_SUSPEND_TO_IDLE,

The policy proposal in #37823 doesn't help to solve such cases. If there are multiple low power states, as there are now, the PM

It does not solve the problem as I stated there, but it is one step further, if the the information about which states a device implements were available the policy could be smarter.

subsystem must be smart enough to take advantage of all cases. We can't move this problem to devices forcing to handle all

How exactly the subsystem must be smart ? Removing LOW_POWER the only thing that the subsystem will be able to do is suspend or not suspend. All the hard work will still needed to be implemented by the device.

cases, since then there's virtually no difference between LOW_POWER and SUSPEND states.

What you mean here ? That devices are only active or suspended ? Or that most drivers were handling both states are the same ?

gmarull commented 2 years ago

The problem of having both LOW_POWER and SUSPEND and treating them independently can cause problems. For example, a device can decide to just implement LOW_POWER, SUSPEND or both. Let's take a system with N devices, each implementing a random combination of the previous list. Now the system goes to a low power state where SUSPEND is used to put devices into a low power state. There will be a subset of devices, those implementing LOW_POWER only, that will just be left ON even though they can be put into a state that would save some power. The same would happen if LOW_POWER is used.

Note that the system power states documentation says what is more or less expected from a device when the system goes to state.

  /**
   * @brief Runtime idle state
   *
   * Runtime idle is a system sleep state in which all of the cores
   * enter deepest possible idle state and wait for interrupts, no
   * requirements for the devices, leaving them at the states where
   * they are.
   *
   * @note This state is correlated with ACPI S0ix state
   */
  PM_STATE_RUNTIME_IDLE,
  /**
   * @brief Suspend to idle state
   *
   * The system goes through a normal platform suspend where it puts
   * all of the cores in deepest possible idle state and *may* puts peripherals
   * into low-power states. No operating state is lost (ie. the cpu core
   * does not lose execution context), so the system can go back to where
   * it left off easily enough.
   *
   * @note This state is correlated with ACPI S1 state
   */
  PM_STATE_SUSPEND_TO_IDLE,

The system PM code doesn't seem to follow these specs:

There is also some lack of clarity:

The policy proposal in #37823 doesn't help to solve such cases. If there are multiple low power states, as there are now, the PM

It does not solve the problem as I stated there, but it is one step further, if the the information about which states a device implements were available the policy could be smarter.

subsystem must be smart enough to take advantage of all cases. We can't move this problem to devices forcing to handle all

How exactly the subsystem must be smart ? Removing LOW_POWER the only thing that the subsystem will be able to do is suspend or not suspend. All the hard work will still needed to be implemented by the device.

System can decide to fall back to, e.g. SUSPEND if TURN_OFF is not supported when turning the system off. Or use LOW_POWER if SUSPEND is not implemented, etc. As of today, this is effectively deferred to the drivers, which led to "handle all just in case" pattern. This has many associated problems, e.g.:

cases, since then there's virtually no difference between LOW_POWER and SUSPEND states.

What you mean here ? That devices are only active or suspended ? Or that most drivers were handling both states are the same ?

A driver that just supports one of the actions, LOW_POWER or SUSPEND is basically forced to handle both the same way in practice. For example, STM32 only makes use of PM_STATE_SUSPEND_TO_IDLE, which in turn only uses PM_DEVICE_ACTION_LOW_POWER. This means that all devices that want to have a chance of entering low power mode in such system, need to handle LOW_POWER. Things just get worse if we consider the runtime API, where only SUSPEND is considered.

So to summarize, I'd prefer a system that just has SUSPEND handled properly than a system that in theory offers one more level, but that in practice, it is not handled consistently.