zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
9.92k stars 6.11k forks source link

pm: Declare pm state constraints for a device #70111

Closed ceolin closed 1 week ago

ceolin commented 3 months ago

Declare in devicetree which power states would cause power loss in a device. This information can be used by devices to set/release power state constraints to avoid being suspended when they are in the middle of an operation.

ceolin commented 3 months ago

@JordanYates @erwango @hubertmis please a take a look if you can. Any feedback is welcome, especially around naming convention.

erwango commented 3 months ago

On my side, I don't think it is a good approach to declare pm states in device nodes. Interest of dts is when you need to be able to distinguish various behaviors of different instances in the same device driver (either due to devices specifics (IRQ number) or application specifics (speed, pins, ...). I don't think this could happen in current case as all instances are supposed to behave the same way (ie PM state support implemented or not) and have the same constraints. I some rare occasions (see https://github.com/zephyrproject-rtos/zephyr/pull/60369), this might be different, but it doesn't justify the need to have to populate supported PM states in each and every node implementing PM. Proposed approach is overkill to me.

ceolin commented 3 months ago

On my side, I don't think it is a good approach to declare pm states in device nodes. Interest of dts is when you need to be able to distinguish various behaviors of different instances in the same device driver (either due to devices specifics (IRQ number) or application specifics (speed, pins, ...). I don't think this could happen in current case as all instances are supposed to behave the same way (ie PM state support implemented or not) and have the same constraints. I some rare occasions (see #60369), this might be different, but it doesn't justify the need to have to populate supported PM states in each and every node implementing PM. Proposed approach is overkill to me.

It may be indeed. The problem is that is hard to make generic drivers without this information, but to be honest that is not being a source of complains. I was already thinking about making it protected under a build option to avoid waste memory.

But yeah, this increases the complexity and the number of customization for power management. The problem is that we have really different use cases to support. I don't see a way now to a driver, that can be used in different targets. to set constraints to avoid being power off without it. At the moment the driver needs to exactly know which power states the soc uses.

hubertmis commented 2 months ago

Interest of dts is when you need to be able to distinguish various behaviors of different instances in the same device driver

That's exactly the reason to define power states in .dts. An example is nRF54H20 in which one instance of MVDMA gets power cut off in 2 sleep states while the remaining instances lose power in 3 sleep states. So the MVDMA driver needs to distinguish behaviors of different instances regarding power state locking.

erwango commented 2 months ago

An example is nRF54H20 in which one instance of MVDMA gets power cut off in 2 sleep states while the remaining instances lose power in 3 sleep states. So the MVDMA driver needs to distinguish behaviors of different instances regarding power state locking.

Then maybe instead of defining all states in all nodes, we can define only the specific states in specific nodes. See https://github.com/zephyrproject-rtos/zephyr/pull/59200/files#diff-5bd998735a6cb86347623d54854157ee8e578ecee7507873c953c18fe4d01d2c

ceolin commented 2 months ago

An example is nRF54H20 in which one instance of MVDMA gets power cut off in 2 sleep states while the remaining instances lose power in 3 sleep states. So the MVDMA driver needs to distinguish behaviors of different instances regarding power state locking.

Then maybe instead of defining all states in all nodes, we can define only the specific states in specific nodes. See https://github.com/zephyrproject-rtos/zephyr/pull/59200/files#diff-5bd998735a6cb86347623d54854157ee8e578ecee7507873c953c18fe4d01d2c

I don't think it is the same thing (unless I am missing something). These notifications don't allow you to stop the SoC to sleep, they are sent after the policy has chosen the power state. What this change provides is a way to devices set constraints to forbid the SoC to use specific power states.

erwango commented 2 months ago

I don't think it is the same thing (unless I am missing something). These notifications don't allow you to stop the SoC to sleep, they are sent after the policy has chosen the power state. What this change provides is a way to devices set constraints to forbid the SoC to use specific power states.

I'm not commenting about the notifications, but about what to document in device tree. In this regard, this is close: The point is to be able to document in which states a peripheral supporting PM will loose context. For peripheral where PM is not supported, device PM is not coming into play anyway. Note: we don't need to document states where context is lost by default (standby for instance)

Also one question for @hubertmis: what action do you want to take for the states where MVDMA gets cut off ?

hubertmis commented 2 months ago

Also one question for @hubertmis: what action do you want to take for the states where MVDMA gets cut off ?

While MVDMA operation is active, the CPU shall not enter any of the power states cutting power off the active MVDMA instance. Otherwise, interrupted DMA operations could result in data corruption.

So I expect code like this in the MVDMA driver:

pm_handling_function() {
  switch (action) {
    case PM_DEVICE_ACTION_SUSPEND:
        /* suspend the device */
        pm_policy_device_power_lock_put(dev);
        break;
    case PM_DEVICE_ACTION_RESUME:
        /* resume the device */
        pm_policy_device_power_lock_get(dev);
        reinit_mvdma_instance(dev); // because power of this instance might have been disabled
        break;
}

dma_start() {
#ifdef CONFIG_PM_DEVICE_RUNTIME
  pm_device_runtime_get(dev);
#else
  pm_policy_device_power_lock_get(dev);
  reinit_mvdma_instance(dev);
#endif
  ...
}

dma_end_handler() {
  ...
#ifdef CONFIG_PM_DEVICE_RUNTIME
  pm_device_runtime_put(dev);
#else
  pm_policy_device_power_lock_put(dev);
#endif
}
ceolin commented 2 months ago

I don't think it is the same thing (unless I am missing something). These notifications don't allow you to stop the SoC to sleep, they are sent after the policy has chosen the power state. What this change provides is a way to devices set constraints to forbid the SoC to use specific power states.

I'm not commenting about the notifications, but about what to document in device tree. In this regard, this is close: The point is to be able to document in which states a peripheral supporting PM will loose context. For peripheral where PM is not supported, device PM is not coming into play anyway.

That is not true, even if the device does not implement "device pm" it still may need to prevent the system to sleep when it is doing something.

Also one question for @hubertmis: what action do you want to take for the states where MVDMA gets cut off ?

  • prevent the system to enter these states ?
  • be able to re-initialize MVDMA when you detect that this state is exited ?
ceolin commented 1 month ago

why is this not something that can be handled somehow through power domains?

Because power domains don't control the SoC state, power domains are (in a really abstract and superficial way) a cluster of devices. What we want't here is basically to influence the pm policy to not use specific power states that could cause power loss (or something else that compromises the device behavior) when the device is in use.

ceolin commented 4 weeks ago

@hubertmis would mind take another look ?

mmahadevan108 commented 3 weeks ago

@ceolin, thanks for the PR. Overall the changes look fine. Can you please update the documentation at the below link as well. https://docs.zephyrproject.org/latest/services/pm/system.html

ceolin commented 3 weeks ago

@ceolin, thanks for the PR. Overall the changes look fine. Can you please update the documentation at the below link as well. docs.zephyrproject.org/latest/services/pm/system.html

Absolutely, thanks for reviewing it.

erwango commented 3 weeks ago

@ceolin As already mentioned, what bothers me with this PR is the need to document this property in all devices nodes, while in most cases, the behavior is just "by definition" of the power state. What about adding a property in power states nodes, which will provide the default behavior applied in general and then this behavior could be amended upon use of the use of zephyr,disabling-power-states device property on devices that have a specific behavior vs the default power state definition ?

ceolin commented 3 weeks ago

@ceolin As already mentioned, what bothers me with this PR is the need to document this property in all devices nodes, while in most cases, the behavior is just "by definition" of the power state. What about adding a property in power states nodes, which will provide the default behavior applied in general and then this behavior could be amended upon use of the use of zephyr,disabling-power-states device property on devices that have a specific behavior vs the default power state definition ?

@erwango that is optional. Only devices that need to set constraints need to use it. I see what you mean, and that is true for most scenarios. But depending on the topology, a devices may be in different power rails and different power states can turn off different power rails. Keeping it close to the device provides a fine grain that covers all cases (at least it should).

If your concern is due resources needed, we can optimize it, keeping this information out of the device struct and only devices that need it will create it. Similar to what we do with the pm struct. But honestly, this can be addressed later. For now that is optional, as you can see there is no change in any device and consequently no changes in resources consumption.

So far, I got feedback from different targets that this feature is useful.

erwango commented 2 weeks ago

@erwango that is optional. Only devices that need to set constraints need to use it. I see what you mean, and that is true for most scenarios. But depending on the topology, a devices may be in different power rails and different power states can turn off different power rails. Keeping it close to the device provides a fine grain that covers all cases (at least it should).

Ok for this concern.

Would it then be possible to address following other comments:

ceolin commented 2 weeks ago

@erwango that is optional. Only devices that need to set constraints need to use it. I see what you mean, and that is true for most scenarios. But depending on the topology, a devices may be in different power rails and different power states can turn off different power rails. Keeping it close to the device provides a fine grain that covers all cases (at least it should).

Ok for this concern.

Would it then be possible to address following other comments:

I don't exactly how to answer it generically since we don't have this information anywhere today, I mean no power state tells you it. From device perspective (when the soc goes to listed states) it is the same of turn off action. Each device may interpret it as it needs. That is one of the reason it is optional.

It can be done, but the cost is making it less flexible and transfer some logic to the policy. I rather to have it generic and then based on more usage see if we can generalize it.

ceolin commented 1 week ago

@hubertmis thanks for the review. I've to rebase it. Would you mind to refresh your approval ?

ceolin commented 1 week ago

@erwango can you take another look please ?

erwango commented 1 week ago

Removing my blocker, but not approving as I stand on my last comments. See https://github.com/zephyrproject-rtos/zephyr/pull/70111#issuecomment-2141565699

ceolin commented 1 week ago

Removing my blocker, but not approving as I stand on my last comments. See #70111 (comment)

Fair enough, we can revisit it later, it is not written in stone. It has valid use cases now and if we find a simpler way to do it after using it we can always change :)