Open MerlijnWajer opened 3 years ago
I can currently hit RET with our 5.15 kernel and this script:
# cat setup-idle.sh
#!/bin/bash
mount -t proc none /proc
mount -t sysfs none /sys
mount -t debugfs none /sys/kernel/debug
mount -o rw,remount /
consoles=$(find /sys/bus/platform/devices/4*.serial/ -name console)
for console in ${consoles}; do
    echo N > ${console}
done
# Enable autosuspend
uarts=$(find /sys/bus/platform/devices/4*.serial/power/ -type d)
for uart in ${uarts}; do
    echo 2000 > ${uart}/autosuspend_delay_ms
    echo enabled > ${uart}/wakeup
    echo auto > ${uart}/control
done
# Configure wake-up from suspend
uarts=$(find /sys/class/tty/tty[SO]*/power/ -type d)
for uart in ${uarts}; do
    echo enabled > ${uart}/wakeup
done
echo 1 > /sys/kernel/debug/pm_debug/enable_off_mode
And current instructions (the idle.sh output here is wrong, but the script here is fixed):
./setup-idle.sh
Wait a bit, and run sleep 5 ; ./idle.sh
And check if OFF or RET are increasing
# sleep 5; ./idle.sh
ST_MCSPI1
OFF:0,RET:458
Then, to at least turn off the backlight:
# modprobe panel-sony-acx565akm
# echo 0 > /sys/class/backlight/acx565akm/brightness
This adds UART2 as a blocker, but power management behaves better:
# sleep 5; ./idle.sh
ST_UART2,ST_MCSPI1
OFF:0,RET:806
to get power measurements:
# modprobe bq27xxx_battery
# modprobe bq27xxx_battery_i2c
# modprobe bq2415x_charger
# cat /sys/class/power_supply/bq27200-0/power_avg
89060
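To make the reading above easier to compare with the milliwatt figures used elsewhere in this thread, here is a minimal conversion sketch. It assumes the value is in microwatts, which is the usual convention for the Linux power_supply class but is not confirmed in this thread; the helper name `power_avg_to_mw` is made up for illustration.

```python
def power_avg_to_mw(raw: str) -> float:
    """Convert a power_avg sysfs reading to milliwatts.

    Assumption: the raw value is in microwatts (power_supply class
    convention); this is not stated in the thread itself.
    """
    return int(raw.strip()) / 1000.0


# Value taken from the reading above.
print(power_avg_to_mw("89060\n"), "mW")
```

Under that assumption the reading corresponds to roughly 89 mW.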
idle.sh:
blocker_bits=$(cat /sys/kernel/debug/pm_debug/count | grep idlest1 | awk '{print $7}')
#blocker_bits=$(cat /sys/kernel/debug/pm_debug/count | grep idlest | awk '{print $7}')
blockers=`python3 - $blocker_bits << EOF
import sys
# 31 to 0
cm_idlest1_core_bits = [ 'RESERVED', 'ST_MMC3', 'ST_ICR',
'RESERVED', 'RESERVED', 'RESERVED', 'ST_MMC2', 'ST_MMC1',
'RESERVED', 'ST_HDQ', 'ST_MCSPI4', 'ST_MCSPI3', 'ST_MCSPI2',
'ST_MCSPI1', 'ST_I2C3', 'ST_I2C2', 'ST_I2C1', 'ST_UART2',
'ST_UART1', 'ST_GPT11', 'ST_GPT10', 'ST_MCBSP5', 'ST_MCBSP1',
'RESERVED', 'ST_MAILBOXES', 'ST_OMAPCTRL', 'ST_HSOTGUSB_IDLE',
'ST_HSOTGUSB_STDBY', 'RESERVED', 'ST_SDMA', 'ST_SDRC', 'RESERVED',
]
# Reverse the list so that the list index equals the bit number
cm_idlest1_core_bits = list(reversed(cm_idlest1_core_bits))
inp = sys.argv[1]
v = int(inp, 16)
blockers = []
for i in range(0, 32):
    is_set = (v & (1 << i)) >> i
    if is_set:
        blockers.append(cm_idlest1_core_bits[i])
print(','.join(blockers),end='')
EOF`
echo $blockers;
idle=$(grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,3)
echo $idle;
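As a worked example of the bit decoding in idle.sh: counting down from bit 31 in the table, ST_MCSPI1 lands on bit 18 and ST_UART2 on bit 14. The sketch below uses only those two entries, and the register value `0x00044000` is a hypothetical input chosen so that exactly those bits are set; the real register contents are not shown in this thread.

```python
# Bit positions derived from the cm_idlest1_core_bits table in idle.sh
# (the table is written from bit 31 down to bit 0). Only a subset is
# listed here, purely for illustration.
KNOWN_BITS = {14: "ST_UART2", 18: "ST_MCSPI1"}


def decode_blockers(hex_value: str) -> list:
    """Return the names of the set bits, lowest bit first, as idle.sh does."""
    value = int(hex_value, 16)
    return [name for bit, name in sorted(KNOWN_BITS.items())
            if value & (1 << bit)]


# Hypothetical register value with bits 14 and 18 set.
print(",".join(decode_blockers("0x00044000")))  # ST_UART2,ST_MCSPI1
```

The result has the same order as the second idle.sh output above, because the loop walks from bit 0 upwards.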
I cannot get drm to disable the display currently, using https://github.com/IMbackK/drm_blankscreen - it reports success it seems, but it doesn't work. This is with panel driver and omapdrm loaded.
I will try lowering the target_residency values now as Tony suggests, see if OFF mode is hit. After that, I guess we'll need some way to load each and all of the modules that we normally use, and then see which ones block idle.
EDIT: Lowering the values helped, but it does not make a big difference since the kernel wakes up quite often
Blocks idle (most of the time?):
* tsc2005 (ST_MCSPI1)
* omap3_isp: not sure about the blocker, but blocks idle for sure (ST_I2C2? ST_I2C1?)
Does not block idle (but can still increase power usage) with no process using it / keeping it open:
* drm, omapdrm, panel_sony_acx565akm
* gpio_keys
* wl1251, wl1251_spi (with wlan0 interface up)
* pvrsrvkm_omap3_sgx530_121 (with /usr/bin/pvrsrvinit called, drm and omapdrm loaded)
* bq2415x_charger / bq27xxx_battery_i2c / bq27xxx_battery
* musb_hdrc / omap2430 / phy_twl4030_usb
* ir_rx51
* twl4030_charger, twl4030_pwrbutton, twl4030_keypad, twl4030_madc
* pwm_twl_led, pwm_twl
* st_sensors / st_accel_i2c / st_accel
For the record I can hit OFF mode without any patches in Linux 5.7, at 0.011A @ 3.8V. So there's also a regression since then that prevents it from entering OFF mode without changing the timings.
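As a rough cross-check of that figure, the current and voltage multiply out to about 41.8 mW, which is in the same range as the roughly 42 mW quoted for v5.9 further down in this thread. A minimal sketch of the arithmetic, assuming both values were read at the supply:

```python
current_a = 0.011  # amps, as reported above
voltage_v = 3.8    # volts, as reported above

# P = I * V, converted from watts to milliwatts
power_mw = current_a * voltage_v * 1000
print(f"{power_mw:.1f} mW")
```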
5.9 seems to no longer hit OFF mode for me, I'd have to re-check 5.8 to see if it still works there.
With commit fb2c599f056640d289b2147fbe6d9eaee689f1b2 reverted on 5.15.y at least the instability problems are gone.
Another thing, when n900-powermanagement is started, gpio_keys starts acting up and reports spurious events for all gpio keys (1 and then 0).
With 5.8.y (stable patches) I can still hit off mode, although it's a bit less stable and I have to do it this way:
mount -t proc none /proc
mount -t sysfs none /sys
mount -t debugfs none /sys/kernel/debug
mount -o rw,remount /
echo 1 > /sys/kernel/debug/pm_debug/enable_off_mode
modprobe panel-sony-acx565akm
echo 0 > /sys/class/backlight/acx565akm/brightness
consoles=$(find /sys/bus/platform/devices/4*.serial/ -name console)
for console in ${consoles}; do
    echo N > ${console}
done
# Enable autosuspend
uarts=$(find /sys/bus/platform/devices/4*.serial/power/ -type d)
for uart in ${uarts}; do
    echo 2000 > ${uart}/autosuspend_delay_ms
    echo enabled > ${uart}/wakeup
    echo auto > ${uart}/control
done
# Configure wake-up from suspend
uarts=$(find /sys/class/tty/tty[SO]*/power/ -type d)
for uart in ${uarts}; do
    echo enabled > ${uart}/wakeup
done
During bisect to find out when off mode stopped working between v5.8 and v5.9 I found that the first commit I hit (g47ec5303d73e) works extremely well when it comes to idle behaviour -- in the sense that it can stay in OFF mode for basically a minute or longer.
root@(none):/# grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,3
OFF:8,RET:3
root@(none):/# grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,3
OFF:10,RET:3
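To compare two such samples without eyeballing them, the `cut` output can be parsed and subtracted. This is only a sketch: the function names are made up for illustration, and it assumes the `NAME:count` comma-separated format shown above.

```python
def parse_counters(text: str) -> dict:
    """Parse a line such as 'OFF:8,RET:3' into {'OFF': 8, 'RET': 3}."""
    counters = {}
    for field in text.strip().split(","):
        name, _, count = field.partition(":")
        counters[name] = int(count)
    return counters


def delta(before: str, after: str) -> dict:
    """Return the per-state increase between two samples."""
    first, second = parse_counters(before), parse_counters(after)
    return {name: second[name] - first[name] for name in first}


# The two samples shown above.
print(delta("OFF:8,RET:3", "OFF:10,RET:3"))  # {'OFF': 2, 'RET': 0}
```

For the two samples above, OFF mode was entered twice more and RET not at all between the reads.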
Sent to the lists:
Hi,
I've spent the day bisecting what exact commit prevented the Nokia N900
from entering the OFF sleep state (between v5.8 and v5.9), and it is this
commit:
> # first bad commit: [facdaa917c4d5a376d09d25865f5a863f906234a] mm: proactive compaction
The git tree prior to that commit can idle at about ~27mW in OFF mode,
and it will often remain in that mode for prolonged amounts of time
(easily 30 seconds, depending on running userspace). With the above
commit applied, the Nokia N900 almost never hits OFF mode any more. This
would suggest at least to disable CONFIG_COMPACTION, perhaps in
omap2plus_defconfig? I suspect this might cause idle problems beyond the
Nokia N900, too.
Maybe nothing needs to be done here other than disable the config option
-- but I wanted to share this in case others are trying to figure out
what happened to their battery life.
There seem to be more power regressions since then (at least on 5.15 there
is more blocking proper idle), so I'll try to find those as well, but if
this commit is reverted (or CONFIG_COMPACTION=n is in .config - probably
easier) on top of v5.9 the system seems to idle fine.
> # grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,
> OFF:16,RET:2
Hope this helps someone...
Regards,
Merlijn
PS: v5.10 seems to use another 19mW if panel_sony_acx565akm is loaded
even when display is not active (maybe it doesn't suspend or something?
- could be fixed later, just noticed it for v5.10). I load it initially
to idle the display, but until I rmmod the modules, the module uses
quite a bit more power. This problem is not present in v5.9, so that is
another thing to chase down I guess... And then v5.15 uses another 12mW,
for reasons not yet uncovered.
Hi Sebastian,
I don't know if this is something that requires any action currently,
but I wanted to report that I'm seeing some increased power draw on a
Nokia N900 with minimal userspace on Linux 5.10 (and the same happens on
5.15 it seems, so it doesn't seem to be resolved since). I tried to
bisect the problem but my initial attempt failed, because the problem
seems a bit racy or unpredictable.
Basically I boot a system to init=/bin/bash and run the following:
> modprobe panel-sony-acx565akm
>
> mount -t proc none /proc
> mount -t sysfs none /sys
> mount -t debugfs none /sys/kernel/debug
> mount -o rw,remount /
>
> echo 1 > /sys/kernel/debug/pm_debug/enable_off_mode
> echo 0 > /sys/class/backlight/acx565akm/brightness
>
>
> consoles=$(find /sys/bus/platform/devices/4*.serial/ -name console)
> for console in ${consoles}; do
>     echo N > ${console}
> done
>
> # Enable autosuspend
> uarts=$(find /sys/bus/platform/devices/4*.serial/power/ -type d)
> for uart in ${uarts}; do
>     echo 2000 > ${uart}/autosuspend_delay_ms
>     echo enabled > ${uart}/wakeup
>     echo auto > ${uart}/control
> done
>
> # Configure wake-up from suspend
> uarts=$(find /sys/class/tty/tty[SO]*/power/ -type d)
> for uart in ${uarts}; do
>     echo enabled > ${uart}/wakeup
> done
This loads the panel and then sets the brightness to zero, enables off
mode and idles the kernel console/serial.
Then run the following to check idle states:
grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,3
And also check the power usage on the lab power supply that I have here.
With the above, Linux v5.9 (no patches applied) idles at around 42mW
(15mW goes to the serial device, so it's more like 27mW, anyway...).
Linux v5.10 with the following two commits reverted (otherwise the
system is not stable):
* fb2c599f056640d289b2147fbe6d9eaee689f1b2 (ARM: omap3: enable off mode
automatically)
* 21b2cec61c04bd175f0860d9411a472d5a0e7ba1 (mmc: Set
PROBE_PREFER_ASYNCHRONOUS for drivers that existed in v4.4)
And the following config change on top of omap2plus_defconfig (to make
OFF mode work on v5.10 as detailed in "Nokia N900 not hitting OFF mode
since 5.9 is caused by proactive memory compaction"):
> sed -i 's/CONFIG_COMPACTION=y/CONFIG_COMPACTION=n/' .config
Idles at much more -- 60mW (compared to 42mW). Executing "rmmod
panel-sony-acx565akm" makes the power draw return to v5.9 levels.
I don't really understand why this would happen, and as stated before
wasn't able to really bisect the problem. However, some simple guesswork
led me to find that reverting 7c4bada12d320d8648ba3ede6f9b6f9e10f1126a
("drm/panel: sony-acx565akm: Fix race condition in probe") makes v5.10
idle at 42mW again. I don't know if this is because v5.9 never properly
initialised the panel, or because the race condition fix introduced
another problem that leaves the hardware in an abnormal state.
Any hints on what could cause this extra power draw? Maybe the panel is
waiting for something? I suppose it's potentially feasible that with
more modules and userspace loaded the panel idles properly, but I
currently don't have a way to measure that.
Regards,
Merlijn
PS: For both v5.9 and v5.10 kernels the only other change to
omap2plus_defconfig is to make the watchdog(s) built-in.
For the record, for my bisect tests I ran this at every step for v5.9..v5.10:
v5.10:
git merge-base --is-ancestor 21b2cec61c04bd175f0860d9411a472d5a0e7ba1 HEAD && git cherry-pick f1e1be898042aff9be3e17c6c1e77513b52e4c4d --no-commit
git merge-base --is-ancestor fb2c599f056640d289b2147fbe6d9eaee689f1b2 HEAD && git cherry-pick 3992aa31bffa73683089d86b5fad3315e3c17fcd --no-commit
The commits that get cherry-picked are simply reverts.
For v5.10..v5.11:
git merge-base --is-ancestor 7c4bada12d320d8648ba3ede6f9b6f9e10f1126a HEAD && git cherry-pick 56a6732102e847a3c3b6f40f8594c69a226fd709 --no-commit
git merge-base --is-ancestor fb2c599f056640d289b2147fbe6d9eaee689f1b2 HEAD && git cherry-pick 3992aa31bffa73683089d86b5fad3315e3c17fcd --no-commit
git merge-base --is-ancestor 21b2cec61c04bd175f0860d9411a472d5a0e7ba1 HEAD && git revert 21b2cec61c04bd175f0860d9411a472d5a0e7ba1 --no-commit
All three are reverts.
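The pattern in the commands above (apply a revert only when the offending commit is already part of the tree being tested) can be wrapped in a small helper so it is not retyped at every bisect step. This is a sketch, not the method actually used for the bisects: the function names and the injectable `run` parameter are made up for illustration, and it covers only the `cherry-pick --no-commit` form, not the direct `git revert` used in the last command.

```python
import subprocess


def is_ancestor(commit: str, run=subprocess.run) -> bool:
    """True if `commit` is an ancestor of HEAD.

    `git merge-base --is-ancestor` exits with status 0 in that case.
    """
    result = run(["git", "merge-base", "--is-ancestor", commit, "HEAD"])
    return result.returncode == 0


def apply_workarounds(pairs, run=subprocess.run) -> list:
    """For each (bad_commit, revert_commit) pair, cherry-pick the revert
    without committing, but only if bad_commit is contained in HEAD.

    Returns the list of revert commits that were applied.
    """
    applied = []
    for bad_commit, revert_commit in pairs:
        if is_ancestor(bad_commit, run):
            run(["git", "cherry-pick", "--no-commit", revert_commit],
                check=True)
            applied.append(revert_commit)
    return applied


# Commit pairs taken from the v5.9..v5.10 commands above.
PAIRS_V5_9_TO_V5_10 = [
    ("21b2cec61c04bd175f0860d9411a472d5a0e7ba1",
     "f1e1be898042aff9be3e17c6c1e77513b52e4c4d"),
    ("fb2c599f056640d289b2147fbe6d9eaee689f1b2",
     "3992aa31bffa73683089d86b5fad3315e3c17fcd"),
]
```

Calling `apply_workarounds(PAIRS_V5_9_TO_V5_10)` inside the kernel tree at each bisect step would then mirror the two shell one-liners.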
Hi Tony, Adam,
I noticed that, after I fixed the OFF mode regression between v5.9 and
v5.10, there is another one between v5.10 and v5.11. Fortunately,
much like the other change it can be worked around with a config change,
and in fact it looks like the commit identified by git bisect is indeed
just a commit to change omap2plus_defconfig.
a82820fcd079e38309403f595f005a8cc318a13c ("ARM: omap2plus_defconfig:
Enable OMAP3_THERMAL") prevents the N900 from entering OFF mode pretty
much all the time (I've seen scenarios with OFF:2,RET:500), but with the
config change reverted, stuff like this is more common: OFF:13,RET:2
We will probably want to keep the thermal features enabled, but maybe we can
figure out why it causes the SoC to not enter sleep modes?
The good news is that this seems to be one of the last regressions with
regards to OFF mode (there might be smaller ones that cause slightly
more wakeups, but those will be harder to find). With this config option
(CONFIG_OMAP3_THERMAL) disabled, as well as the fixes from my other
recent emails, I can get my 5.15 branch to enter OFF mode again:
> # uname -a
> Linux (none) 5.15.2-00597-g68be8fac7cbd #48 SMP PREEMPT Sat Dec 11 00:14:05 CET 2021 armv7l GNU/Linux
> # grep ^core_pwrdm /sys/kernel/debug/pm_debug/count | cut -d',' -f2,
> OFF:13,RET:10
Regards,
Merlijn
So we can hit OFF mode now on 5.15 with some patches. Not in full GUI mode, yet. There are also some stability problems when idling. I think once the stability problems are fixed, we can start looking at idling individual subsystems?
Something that would be interesting to do regarding OFF mode tests is to parse lsmod on a running GUI system and then boot to init=/bin/bash and insert the modules, one at a time, sleeping in between, and figuring out which ones block off/ret.
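A possible shape for that test loop is sketched below. Nothing here exists yet: `load` and `read_counters` are hypothetical callbacks (the first would wrap modprobe, the second would return the OFF/RET counts from pm_debug/count as a dict), and the 5 second settle time simply mirrors the `sleep 5` used earlier in this ticket.

```python
import time


def find_blockers(modules, load, read_counters, settle_s=5.0,
                  sleep=time.sleep):
    """Load modules one at a time and return those after which neither
    the OFF nor the RET counter increased during the settle period.

    `load(name)` and `read_counters()` are hypothetical callbacks;
    read_counters() is expected to return e.g. {"OFF": 8, "RET": 3}.
    """
    blockers = []
    for name in modules:
        load(name)
        before = read_counters()
        sleep(settle_s)
        after = read_counters()
        if after["OFF"] <= before["OFF"] and after["RET"] <= before["RET"]:
            blockers.append(name)
    return blockers
```

One caveat with this simple version: once a blocking module is loaded, every module after it will also look like a blocker, so a real run would need to unload the module again or reboot before continuing.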
Any updates?
Creating this ticket by popular demand, also with some description from IRC