device power management irq lock

wentongwu commented 4 years ago

When the system goes into suspend, devices also need to be put into a low-power state. Currently this runs with IRQs locked, which is not correct for some use cases: consider a device driver that needs some software sync with the device firmware, which requires the device interrupt. To overcome this, sys_suspend will run with the scheduler locked (k_sched_lock), and before the actual power state setting, an xxx_state_prepare/notify hook will be called first so that devices can do that sync with IRQs enabled. The actual device power state setting is then done with IRQs locked.

However, during the xxx_state_prepare/notify call of each device, an interrupt from a wakeup source may arrive first. To handle that, a global wakeup count will be defined to record whether a wakeup has happened. It will be checked with IRQs locked, before the actual device power state setting: if a wakeup happened, the ongoing suspend is aborted; if not, the suspend continues.

The wakeup source devices should be marked in DTS and parsed from there.
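
A minimal sketch of the wakeup-count check described above, in plain C. Every name here (`wakeup_count`, `wakeup_event_notify`, `system_suspend`, the stub hooks) is hypothetical and not an existing Zephyr API; scheduler and IRQ locking are only marked in comments.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical global counter, incremented by wakeup-source ISRs. */
static atomic_uint wakeup_count;

/* Counts how often the (stubbed) power state setting actually ran. */
static unsigned int states_set;

/* Would be called from the interrupt handler of a wakeup source. */
void wakeup_event_notify(void)
{
    atomic_fetch_add(&wakeup_count, 1u);
}

/* Stub prepare hooks: one stays quiet, one sees a wakeup interrupt. */
void prepare_quiet(void) { }
void prepare_with_wakeup(void) { wakeup_event_notify(); }

/* Stub for the actual device power state setting. */
void set_device_states(void) { states_set++; }

/* Returns true if the suspend went through, false if it was aborted. */
bool system_suspend(void (*prepare_devices)(void), void (*set_states)(void))
{
    unsigned int snapshot = atomic_load(&wakeup_count);

    /* Phase 1: scheduler locked, IRQs enabled (locking not modelled). */
    prepare_devices();

    /* Phase 2: IRQs would be locked from here on. */
    if (atomic_load(&wakeup_count) != snapshot) {
        return false; /* a wakeup arrived during prepare: abort */
    }
    set_states();
    return true;
}
```

With these stubs, `system_suspend(prepare_quiet, set_device_states)` completes, while `system_suspend(prepare_with_wakeup, set_device_states)` aborts before any device state is changed.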

wentongwu commented 4 years ago

@pabigot @vanti @erwango @mnkp @fulong82

pabigot commented 4 years ago

I'm skeptical of a model that requires everything be done at the point the system is trying to suspend. Some device PM operations require a thread to be involved: that's why there's an onoff service (updated in #23898), and that won't work from _sys_suspend() where transitions must be synchronous. The whole prepare/notify coordination solution seems fragile, and may well work only because it runs when interrupts are locked preventing new demands from being registered.

Can we do a device power management solution that's more consistent with TI's SimpleLink architecture where we have constraints that block entry to lower-power modes, dependencies (from devicetree mostly) for things that have to be available for a device to work, and overall an architecture where devices automatically go to low power mode when they're not in use? There might be a use case for doing simple things in _sys_suspend(), but it might not be necessary.

I do find useful the distinction in #24228 of runtime idle being basically what we do in the idle thread when system power management is not enabled. It would be nice if device power management were compatible with that (in that nothing would need to be done because idle devices are already powered down when there are no constraints).

aurel32 commented 4 years ago

Can we do a device power management solution that's more consistent with TI's SimpleLink architecture where we have constraints that block entry to lower-power modes, dependencies (from devicetree mostly) for things that have to be available for a device to work, and overall an architecture where devices automatically go to low power mode when they're not in use? There might be a use case for doing simple things in _sys_suspend(), but it might not be necessary.

In addition to that, it would be a nice improvement if active devices could impact the choice of the next power saving level. For example on STM32, it is possible to do I2C transfers in STOP1, but not in STOP2. With the current framework the only option for a device is to simply reject going into power saving mode.

I do find useful the distinction in #24228 of runtime idle being basically what we do in the idle thread when system power management is not enabled. It would be nice if device power management were compatible with that (in that nothing would need to be done because idle devices are already powered down when there are no constraints).

:+1: :+1: :+1:

It would also mean some power saving even when the CPU is running, which is nice to have. I think that's the idea behind the DEVICE_IDLE_PM option, but I am not sure it actually works. I was planning to look at that in the next few days.

wentongwu commented 4 years ago

I'm skeptical of a model that requires everything be done at the point the system is trying to suspend. Some device PM operations require a thread to be involved: that's why there's an onoff service (updated in #23898), and that won't work from _sys_suspend() where transitions must be synchronous. The whole prepare/notify coordination solution seems fragile, and may well work only because it runs when interrupts are locked preventing new demands from being registered.

Haven't had a chance to look at the onoff service yet, but will do. Maybe it's the same idea as the current device runtime PM, where an individual device is turned off if no one owns it. BTW, the current device runtime PM also needs the device dependencies to decide on/off, which hasn't been considered so far. And the synchronous state transitions are trying to follow the definition in #24228, where the system makes the state transition decision; in my current idea they will exist alongside device runtime PM.

wentongwu commented 4 years ago

Can we do a device power management solution that's more consistent with TI's SimpleLink architecture where we have constraints that block entry to lower-power modes, dependencies (from devicetree mostly) for things that have to be available for a device to work, and overall an architecture where devices automatically go to low power mode when they're not in use? There might be a use case for doing simple things in _sys_suspend(), but it might not be necessary.

In addition to that, it would be a nice improvement if active devices could impact the choice of the next power saving level. For example on STM32, it is possible to do I2C transfers in STOP1, but not in STOP2. With the current framework the only option for a device is to simply reject going into power saving mode.

We have that API (sys_pm_ctrl_disable_state/sys_pm_ctrl_enable_state) to influence the state decision in the current code, but it should belong to the power policy layer's API.

wentongwu commented 4 years ago

Can we do a device power management solution that's more consistent with TI's SimpleLink architecture where we have constraints that block entry to lower-power modes, dependencies (from devicetree mostly) for things that have to be available for a device to work, and overall an architecture where devices automatically go to low power mode when they're not in use? There might be a use case for doing simple things in _sys_suspend(), but it might not be necessary.

Thanks for sharing this doc. Besides the dependencies a device needs to work, we also need dependencies for turning devices off. I agree with the overall architecture proposal, and will review the existing onoff service code first.

vanti commented 4 years ago

Good discussion. Adding a few observations:

I'm skeptical of a model that requires everything be done at the point the system is trying to suspend.

I concur with this skepticism. It is better to turn off peripheral devices as soon as they are not needed for better power savings.

There might be a use case for doing simple things in _sys_suspend(), but it might not be necessary.

Such a use-case arises when a device is configured and ready to go, but is not yet actively performing a transfer when the system is suspended. It is good to have the flexibility to still suspend the device along with the system, and simply notify the driver to restore the state of the device after resuming (assuming the device loses its state during system suspend, as is the case on the TI CC1352). This is in line with the notification feature in the TI SimpleLink power manager.

That said, a device that is not actively used nor configured should definitely be turned off asap.

I do find useful the distinction in #24228 of runtime idle being basically what we do in the idle thread when system power management is not enabled.

Interestingly, on the TI CC13x2 I currently have it such that "runtime idle" is only available when system power management is turned on. When system power management is off there is truly no power management. It's good to see we are working to formalize/clarify this, so that we could be consistent across platforms.

It would be nice if device power management were compatible with that (in that nothing would need to be done because idle devices are already powered down when there are no constraints).

👍 👍 👍

It would also mean some power saving even when the CPU is running, which is nice to have. I think that's the idea behind the DEVICE_IDLE_PM option, but I am not sure it actually works. I was planning to look at that in the next few days.

From my (brief) experience with it, it is not currently well integrated with the "central method" (see e.g. #22391). It would also benefit from an ability to manage power domains (a partitioning of the SoC that needs to be power-gated together as a whole, which can probably be defined in DT). The TI SimpleLink power manager manages dependencies on both peripheral devices and power domains.

In addition to that, it would be a nice improvement if active devices could impact the choice of the next power saving level. For example on STM32, it is possible to do I2C transfers in STOP1, but not in STOP2. With the current framework the only option for a device is to simply reject going into power saving mode.

We have that API (sys_pm_ctrl_disable_state/sys_pm_ctrl_enable_state) to influence the state decision in the current code, but it should belong to the power policy layer's API.

Yes, state locking via sys_pm_ctrl_disable_state/sys_pm_ctrl_enable_state is what I used on the CC13x2 as a substitute for setting and releasing constraints.

pabigot commented 4 years ago

We have that API (sys_pm_ctrl_disable_state/sys_pm_ctrl_enable_state) to influence the state decision in the current code, but it should belong to the power policy layer's API.

Yes, state locking via sys_pm_ctrl_disable_state/sys_pm_ctrl_enable_state is what I used on the CC13x2 as a substitute for setting and releasing constraints.

This is the sort of thing the on-off manager was intended to handle. With the current sys_pm_ctrl API states are enabled and disabled without any regard for whether somebody else agrees, which leads to questions about how to use it.

On-off only supports a two-state configuration, though. The most general approach involves defining a lattice for the values and selecting the best option consistent with all the requested configurations.
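
As a rough illustration of the "select the best option consistent with all requests" idea, here is a sketch for the simplest possible lattice: a totally ordered set of power levels. The level names are borrowed loosely from the STM32 example earlier in the thread, and all identifiers (`pm_level`, `pm_limit_request`, `pm_limit_release`, `pm_deepest_allowed`) are hypothetical, not an existing Zephyr API. Each user states the deepest level it can tolerate, and the policy may go no deeper than the shallowest of those limits.

```c
/* Power levels ordered from shallowest to deepest. */
enum pm_level { PM_RUN, PM_STOP1, PM_STOP2, PM_LEVEL_COUNT };

/* holders[l] counts the users whose deepest tolerable level is l. */
static unsigned int holders[PM_LEVEL_COUNT];

/* A user declares that it cannot tolerate anything deeper than this. */
void pm_limit_request(enum pm_level deepest_ok)
{
    holders[deepest_ok]++;
}

/* The user drops a limit it requested earlier. */
void pm_limit_release(enum pm_level deepest_ok)
{
    if (holders[deepest_ok] > 0) {
        holders[deepest_ok]--;
    }
}

/* Deepest level that satisfies every outstanding limit. */
enum pm_level pm_deepest_allowed(void)
{
    for (int l = PM_RUN; l < PM_STOP2; l++) {
        if (holders[l] > 0) {
            return (enum pm_level)l; /* shallowest limit wins */
        }
    }
    return PM_STOP2; /* nobody objects: deepest level is allowed */
}
```

Because the limits are reference counted, one user releasing its limit cannot re-enable a state that another user still needs to keep blocked. An I2C driver with a transfer in flight would call `pm_limit_request(PM_STOP1)` and release it when the transfer ends.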

vanti commented 4 years ago

This is the sort of thing the on-off manager was intended to handle. With the current sys_pm_ctrl API states are enabled and disabled without any regard for whether somebody else agrees, which leads to questions about how to use it.

On-off only supports a two-state configuration, though. The most general approach involves defining a lattice for the values and selecting the best option consistent with all the requested configurations.

The on-off manager certainly looks like an interesting approach to managing on-off states of resources/services. For managing system power states, I haven't personally found the sys_pm_ctrl API to be too restrictive (assuming the states are left enabled by default, a state is only enabled after having been disabled first in a given context, and the API is used to protect code that cannot accommodate sleep interruptions). I do think the API should probably have been better documented though.

wentongwu commented 3 years ago

@pabigot do we require all devices' power state transitions to be asynchronous, or should we cover both asynchronous and synchronous? That's important not only for device runtime PM, but also for system PM.

pabigot commented 3 years ago

There certainly will be instantaneous or synchronous transitions along with ones that are not instantaneous so should be asynchronous. Toggling a GPIO to control a switch seems simple enough to not need any overhead.

But the question becomes whether it's practical to support two APIs for a single operation, especially since the component initiating the operation (the policy engine) may not know for a given transition whether a delay will occur. It's pretty easy to initiate an operation through an asynchronous API then check to see whether it completed (or just rely on the infrastructure set up to process that completion whenever it occurs, which may be before the call to initiate it completes).

We also haven't determined what a transition means, specifically for a device power state. Sticking to the simple case of "on" and "off", does a device become "on" when it's receiving power (a regulator has been enabled), or when its documented RESET hold-off time has completed, or when its driver has reinitialized it and it's available for use? For the on-off service it's generally expected to mean "ready for use" (though any other condition could be used instead).

It's those "off to ready-to-use" cases where it's pretty clear a transition can take a long time and an asynchronous implementation is useful.
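
To make the "initiate through an asynchronous API, then check whether it completed" pattern concrete, here is a small sketch. All types and functions (`pm_transition`, `pm_transition_complete`, `gpio_switch_suspend`, `i2c_sensor_suspend`) are invented for illustration and are not the on-off service API. The GPIO-style driver completes before the initiating call returns; the I2C-style driver leaves the transition pending until a later completion.

```c
#include <stdbool.h>
#include <stddef.h>

struct pm_transition;
typedef void (*pm_done_cb)(struct pm_transition *t, int result);

/* Tracks one in-flight power state transition. */
struct pm_transition {
    pm_done_cb done;  /* optional completion callback */
    bool completed;   /* set once the transition has finished */
    int result;       /* 0 on success, negative on failure */
};

/* Called by a driver when its transition finishes, possibly from
 * inside the call that initiated the transition. */
void pm_transition_complete(struct pm_transition *t, int result)
{
    t->result = result;
    t->completed = true;
    if (t->done != NULL) {
        t->done(t, result);
    }
}

/* Hypothetical driver: toggling a GPIO switch finishes immediately. */
int gpio_switch_suspend(struct pm_transition *t, pm_done_cb done)
{
    t->done = done;
    t->completed = false;
    pm_transition_complete(t, 0);
    return 0;
}

/* Hypothetical driver: needs a bus transaction, so it only starts the
 * work here; pm_transition_complete() is called later. */
int i2c_sensor_suspend(struct pm_transition *t, pm_done_cb done)
{
    t->done = done;
    t->completed = false;
    return 0;
}
```

The caller uses the same API in both cases and simply inspects `completed` after the call returns, or relies on the callback, so the policy engine does not need to know in advance whether a given transition involves a delay.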

wentongwu commented 3 years ago

But the question becomes whether it's practical to support two APIs for a single operation,

@pabigot a single API makes more sense. Let's base the implementation on asynchronous transitions for both device runtime PM and system PM.

We also haven't determined what a transition means, specifically for a device power state. Sticking to the simple case of "on" and "off", does a device become "on" when it's receiving power (a regulator has been enabled), or when its documented RESET hold-off time has completed, or when its driver has reinitialized it and it's available for use? For the on-off service it's generally expected to mean "ready for use" (though any other condition could be used instead).

It should mean "ready for use", which is also the understanding in #26366, though that one is synchronous. We can initiate the transition in an interrupt, but we can't make the device ready for use in an interrupt, where we might access a device that is off.

wentongwu commented 3 years ago

@pabigot I'm considering the low-level API of device PM. Could you please give more detail about the asynchronous power transition use case? Thanks

pabigot commented 3 years ago

An example follows from #27360: if the Thingy52 enters a low-power mode where the CCS811 is to be turned off, its power source should be removed, which requires an I2C transaction over the SX1509B.

Also when the SX1509B is turned on it has to be configured, which involves I2C operations. While right now there isn't an asynchronous I2C API, there will be, and it would be more efficient to allow other devices to be turned off while the I2C bus transactions are being completed.

wentongwu commented 3 years ago

it would be more efficient to allow other devices to be turned off while the I2C bus transactions are being completed.

That's fine from an individual device driver's perspective. But consider the case of suspending all devices, which the system PM framework requires, and assume we get the device dependency list somehow. When queuing device suspend requests one by one, we may need to maintain a list of devices whose request has been queued but whose suspend operation has not finished, so that the parents of these devices are not suspended until the operation finishes. That means that when checking whether a new device can be suspended, we have to go over the parents of the listed devices. We should also maintain a list of devices that have finished, so that if one fails its suspend operation we can resume the others. My point is: do we really want to implement something this complex in an embedded system? @pabigot

I'm also considering how to use the on-off service to implement the device runtime PM framework. #27360 is a good example of how to use it, but we also need to consider embedding the device dependency info into it.

pabigot commented 3 years ago

I don't think it's that complex. Each device has its current state: active, suspending, suspended. On system shutdown you walk them in a loop looking at each one:

If it's suspended, it's done.
If it's suspending, nothing to do yet.
If it's active and everything that depends on it is suspended[*], initiate a transition to suspended.

As suspend completion notifications come in, update the device state and schedule another walk.

[*] This check isn't directly available with the as-designed dependency relations which go the other direction, but it's trivial to do if the device maintains a counter of things that depend on it, which is incremented when a subordinate device begins a transition to active and decremented when that device completes a transition to suspended.
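
The walk and the dependents counter described above could be sketched roughly as follows. The structure and function names (`pm_dev`, `pm_suspend_walk`, `pm_suspend_done`) are made up for illustration, each device is assumed to depend on at most one parent, and starting the real asynchronous suspend operation is reduced to a state change.

```c
#include <stdbool.h>
#include <stddef.h>

enum dev_state { DEV_ACTIVE, DEV_SUSPENDING, DEV_SUSPENDED };

struct pm_dev {
    enum dev_state state;
    struct pm_dev *parent;          /* device this one depends on, or NULL */
    unsigned int active_dependents; /* subordinates not yet suspended */
};

/* One pass over all devices. Returns true once every device is
 * suspended; otherwise another walk is needed after a completion. */
bool pm_suspend_walk(struct pm_dev *devs[], size_t n)
{
    bool all_done = true;

    for (size_t i = 0; i < n; i++) {
        struct pm_dev *d = devs[i];

        if (d->state == DEV_SUSPENDED) {
            continue; /* done */
        }
        all_done = false;
        if (d->state == DEV_ACTIVE && d->active_dependents == 0) {
            /* A real driver would start its asynchronous suspend here. */
            d->state = DEV_SUSPENDING;
        }
        /* DEV_SUSPENDING: nothing to do yet. */
    }
    return all_done;
}

/* Completion notification for one device; the caller then schedules
 * another walk. The matching increment would happen when a subordinate
 * begins a transition to active (not shown). */
void pm_suspend_done(struct pm_dev *d)
{
    d->state = DEV_SUSPENDED;
    if (d->parent != NULL && d->parent->active_dependents > 0) {
        d->parent->active_dependents--;
    }
}
```

For a sensor powered through an I/O expander, as in the Thingy52 example, the first walk starts suspending only the sensor; the expander is picked up by the walk that follows the sensor's completion notification.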

zephyrproject-rtos / zephyr

device power management irq lock #24230