linrunner / TLP

TLP - Optimize Linux Laptop Battery Life
https://linrunner.de/tlp
GNU General Public License v2.0

Stop threshold not working any more on LG Gram [Debian Sid, kernel 6.9.7] #747

Closed iacchi closed 2 months ago

iacchi commented 4 months ago

[x] I've read and accepted the Bug Reporting Howto
[x] I've provided all required tlp-stat outputs via Gist (see below)

Describe the bug

I'm not quite sure if this is more of a TLP bug or a support request for something else in the system, but here I go. My laptop is running Debian Sid with kernel 6.9.7 and with secure boot enabled.

After a recent apt upgrade, I'm getting a message to disable secure boot to make third party drivers work in my system (I never had an issue so far for the year that I've had this laptop, and the dkms signing key is registered in the system so that drivers signed by dkms are accepted), which I'm refusing. More or less at the same time (not 100% sure), the battery charge limit to 80% stopped working. I used to set this threshold in KDE's system settings (which I think relies on TLP), but now my 80% value gets overridden to 50% in the KDE battery settings and ignored anyway as the battery gets charged to 100%. I've manually enabled the tlp service with systemctl and it's working correctly, however tlp-stat tells me that my system is not supported (but it was until a few days ago), here is the output of tlp-stat: https://pastecode.io/s/18fx72dk

If I try to manually edit /sys/devices/platform/lg-laptop/battery_care_limit with nano, I get this warning message in the text editor (as root of course):

[ Errore durante la scrittura del file di lock /sys/devices/platform/lg-laptop/.battery_care_limit.swp: Permesso negato ]

I'm sorry it's in Italian, it translates to something like: Error during writing lock file /sys/[...]: Permission denied.

If I try to input 80 and save the file, when I quit nano the value inside the file is still 0.

At this point I'm not quite sure if something has changed tlp-wise, kernel-wise, or what else. I'm happy to help debug to figure this out. I also have slimbook battery installed, I'm not sure if it's relevant or not.

linrunner commented 4 months ago

> I used to set this threshold in KDE's system settings (which I think relies on TLP)

No, it doesn't. It directly writes to charge_control_end_threshold (see below).

> I also have slimbook battery installed, I'm not sure if it's relevant or not.

The FAQ says

> Slimbook Battery: uses TLP as a backend to apply power saving measures. However, it continuously overwrites your TLP configuration. If you wish to configure TLP individually, you must first uninstall Slimbook Battery.

Normally there would be no support here for a setup like yours.

However, I am interested in why the stop threshold cannot be written. With kernel 6.9, /sys/devices/platform/lg-laptop/battery_care_limit is no longer used for this; /sys/class/power_supply/BAT0/charge_control_end_threshold is used instead.

Please show the output of

grep . /sys/class/power_supply/BAT0/charge_control*

echo "80" | sudo tee /sys/class/power_supply/BAT0/charge_control_end_threshold

echo "80" | sudo tee /sys/devices/platform/lg-laptop/battery_care_limit

Btw: editing with nano is not suitable for changing sysfs nodes.
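
Because `tee` echoes its input regardless of what the kernel actually stores, a write followed by a read-back shows more directly whether a sysfs node accepted a value. A minimal sketch, assuming a POSIX shell run as root; `write_and_verify` is a hypothetical helper name for illustration and not part of TLP:

```shell
#!/bin/sh
# write_and_verify NODE VALUE
# Hypothetical helper (not part of TLP): writes VALUE to NODE, reads the
# node back, and reports whether the stored value matches.
write_and_verify() {
    node=$1
    value=$2

    # The write itself may fail, e.g. with "Permission denied".
    if ! printf '%s\n' "$value" > "$node"; then
        echo "write to $node failed" >&2
        return 1
    fi

    # A write can succeed and still be ignored, so read the node back.
    actual=$(cat "$node")
    if [ "$actual" = "$value" ]; then
        echo "$node now holds $actual"
    else
        echo "$node holds $actual, expected $value" >&2
        return 1
    fi
}

# Example, as root (node path taken from the comment above):
# write_and_verify /sys/class/power_supply/BAT0/charge_control_end_threshold 80
```

A non-zero exit status together with the "expected" message would point to the kernel or firmware rejecting the value rather than to a permissions problem.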

iacchi commented 4 months ago

Hello, and thank you for your reply. I turned off slimbook battery for the moment while we try to solve this. Also good to know that nano is not the way to go in this case - I didn't know that. Anyway, here's the output you asked for:

root@hactar:/usr/lib/modules# grep . /sys/class/power_supply/BAT0/charge_control*
0
root@hactar:/usr/lib/modules# echo "80" | tee /sys/class/power_supply/BAT0/charge_control_end_threshold
80
root@hactar:/usr/lib/modules# cat /sys/class/power_supply/BAT0/charge_control_end_threshold
0
root@hactar:/usr/lib/modules# echo "80" | tee /sys/devices/platform/lg-laptop/battery_care_limit
80
root@hactar:/usr/lib/modules# cat /sys/devices/platform/lg-laptop/battery_care_limit
0

It looks like the files are not being written... I've also tried to restart tlp.service and to input 80 in KDE's systemsettings, but both files are still at 0. Should I try to boot with kernel 6.8.12 (I still have it installed) to see if it goes back to working as usual? I understand that there was a change between the 6.8 and 6.9 kernel series?

linrunner commented 4 months ago

Of course. I assumed right from the start that this is a kernel issue. The commands were meant to work that out. Whether something has changed, and what, I have no idea.

The other possibility would be a recent BIOS update.

iacchi commented 4 months ago

It's definitely not a BIOS update, as there haven't been any for this laptop. I rebooted back to kernel 6.8.12 and now everything works again straight after reboot, without even needing to re-set the values:

root@hactar:/home/iacopo# cat /sys/devices/platform/lg-laptop/battery_care_limit
80
root@hactar:/home/iacopo# cat /sys/class/power_supply/BAT0/charge_control_end_threshold
80

So definitely there's something in kernel 6.9.7 that is messing this up. I'm unsure if I should continue debugging with you or if I should open a bug with Debian or with Linux directly?

linrunner commented 4 months ago

> I'm unsure if I should continue debugging with you or if I should open a bug with Debian or with Linux directly?

There is nothing more to debug on the kernel level that I could help with. You may continue with a Debian bug report.

Just for the record, I would like to have the output of

sudo tlp-stat -s -b

with the 6.8 kernel.

iacchi commented 4 months ago

Ok then, I'll keep you posted (and link the Debian bug report) in case there's anything you need to know for the future! In the meantime, here's the output you asked for: https://pastecode.io/s/1mvfpyjk

iacchi commented 4 months ago

This is the bug opened with the Debian team, in case someone wants to follow it: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076110

tixwho commented 4 months ago

Hi there! I'm on Arch with power-profiles-daemon, using a 2023 LG Gram 17, and I have the same issue after a recent kernel update (I can't tell which one exactly, but the most recent kernel, Linux 6.9.9-arch-1-1, does not fix it). It seems the issue is caused by a kernel problem independent of the distribution. If more testing is needed, I'll be happy to help!

Emily511511 commented 3 months ago

@linrunner

I am on Debian Sid KDE 5.27 too. I don't have power-profiles-daemon.

I don't understand why the threshold parameter is randomly not honored, but waking up from sleep or even rebooting my laptop (Lenovo IdeaPad) sometimes seems to fix it. It's not uncommon for me to find it not working, go AFK, come back to the lock screen and finally see it working properly. It's really strange.

And sometimes it works without a fuss for several days in a row, with no updates in-between.

I suspect there is a conflict with KDE, because when it doesn't work and I hover over the battery icon, I can see a message akin to "not charging" for less than a second, and then right away it says "charging".

It's either this or something broke in the kernel. Kernel 6.10 doesn't seem to fix it.

The strange thing is that when it doesn't work, systemctl tells me that tlp was running and applied the battery threshold.

computer tlp[1561]: Applying power save settings...done.
computer tlp[1561]: Setting battery charge thresholds...done.

(I am commenting on this issue instead of opening a new one, because it seems relevant, please tell me if I should open a new issue instead).

iacchi commented 3 months ago

@Emily511511 this bug is specifically for issues related to the LG Gram laptop. Since you have a Lenovo, maybe better open a new one?

In the meantime, I have some updates: I've contacted the kernel module maintainer and he said everything works for him on Fedora and a 2017 laptop. He gave me the source of the lg_laptop module present in kernel version 6.8.12 and told me to compile it and use it instead of the one already present in the Debian kernel. It took me a while to compile it (compiling issues) and sign it, but I managed. So far, the issue remains with this version of the module as well on kernel 6.9.12 (Sid has updated the kernel in the meantime). I'll keep you posted. @tixwho if you want to give me your email address I can add you to the email chain of my conversation with the kernel module maintainer now that I'm going to reply to him.

linrunner commented 3 months ago

@Emily511511 : your post is offtopic here, please file your own issue and remember to show the complete output of

sudo tlp-stat -s -b -c

linrunner commented 2 months ago

@iacchi any news on this?

iacchi commented 2 months ago

I've managed to convince the kernel module author that it's a general problem, because I could reproduce the issue on Fedora as well. The problem is that everything works on his older laptop, so now it's up to me to do the debugging by running git bisect on the kernel repo and a looooot of compiling to find the offending commit, so that he can address the issue. This will take a while because my time now is limited due to work reasons for the next couple of months, and as you can imagine compiling and testing X number of kernels is not quick. I'll keep you all posted. Since it's definitely not a TLP issue, it's up to you to decide if you want to keep this issue open just to receive updates or if you want to close it.

linrunner commented 2 months ago

A tough piece of work for you. Good luck.

I'll keep this open for visibility.

alexyao2015 commented 2 months ago

For what it's worth, I'm running stock Ubuntu 24.04 on 6.8.0 and am seeing this issue as well. I'll try upgrading and seeing if it fixes the issue.

iacchi commented 2 months ago

> For what it's worth, I'm running stock Ubuntu 24.04 on 6.8.0 and am seeing this issue as well. I'll try upgrading and seeing if it fixes the issue.

That's weird, I have no issue with the 6.8 kernel series. This may be worth investigating. All more recent kernels are broken afaik.

linrunner commented 2 months ago

@iacchi Ubuntu backports fixes from 6.9 ff. to their 6.8.

Mylinde commented 2 months ago

One short question: 6.8 and 6.9 are EOL, so why does anything need to be fixed?

iacchi commented 2 months ago

> One short question: 6.8 and 6.9 are EOL, so why does anything need to be fixed?

The regression for me happened between 6.8.12 and 6.9.7, and it persists also in current kernels, so it's important to figure out where the regression started.

iacchi commented 2 months ago

> @iacchi Ubuntu backports fixes from 6.9 ff. to their 6.8.

I'm not sure I understood this sentence. What happened in Ubuntu again?

EDIT: never mind, I think I understood. Ubuntu implemented some stuff from 6.9 in the 6.8 line, did I get it right? Hence breaking it for 6.8 as well.

Mylinde commented 2 months ago

> > One short question: 6.8 and 6.9 are EOL, so why does anything need to be fixed?
>
> The regression for me happened between 6.8.12 and 6.9.7, and it persists also in current kernels, so it's important to figure out where the regression started.

Have you tried out 6.10.7?

linrunner commented 2 months ago

@iacchi correct. Kernel 6.8 is part of their 24.04 LTS, so they have to maintain it for 5 years which means backporting patches from newer stable kernels.

iacchi commented 2 months ago

> Have you tried out 6.10.7?

Not yet, but 6.10.4 was still broken.

Mylinde commented 2 months ago

@linrunner

The ACPI device ID has changed. Could this be causing the issue?

https://lore.kernel.org/lkml/99d78b65-2257-ea3d-3368-4e794f68296e@linux.intel.com/

iacchi commented 2 months ago

> @linrunner
>
> The ACPI device ID has changed. Could this be causing the issue?
>
> https://lore.kernel.org/lkml/99d78b65-2257-ea3d-3368-4e794f68296e@linux.intel.com/

So the driver maintainer knew all along about the issue from other users as well and knew that a possible patch existed and didn't tell me anything. Fun.

iacchi commented 2 months ago

Ok, so, I had a chance to reboot to 6.10.7 and the issue is now fixed, likely thanks to the work in https://bugzilla.kernel.org/show_bug.cgi?id=219075 and maybe https://bugzilla.kernel.org/show_bug.cgi?id=218901

I guess good thing I didn't have the time to start bisecting? But it would have been nice if the driver maintainer could have pointed me to those couple of bugs...

alexyao2015 commented 2 months ago

Well that's great! I guess it's time to upgrade kernel.

Side note, did you experience any issues with the brightness media keys taking forever to register after pressing them? Volume media keys seem to work fine but the brightness ones specifically appear to take 10-15 seconds after pressing to register.

iacchi commented 2 months ago

> Side note, did you experience any issues with the brightness media keys taking forever to register after pressing them? Volume media keys seem to work fine but the brightness ones specifically appear to take 10-15 seconds after pressing to register.

No, mine are immediate. I've read in the various discussions something along those lines, though. Check your dmesg and ACPI errors - people on reddit were suggesting to disable a kernel parameter. I think the post was this one: https://www.reddit.com/r/linuxhardware/comments/x97m6l/comment/j2r7irr/

alexyao2015 commented 2 months ago

Figured out the issue was caused by acpi_mask_gpe=0x6E in my grub cmdline. I had initially added it there after reading on reddit about ACPI errors, but it seems like it causes problems.
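
For anyone wanting to check whether the running kernel was booted with such a parameter, `/proc/cmdline` can be inspected. A small sketch, assuming a POSIX shell; `cmdline_has` is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a test file:

```shell
#!/bin/sh
# cmdline_has PARAM [FILE]
# Hypothetical helper: succeeds if PARAM (with or without "=value") is on
# the kernel command line. FILE defaults to /proc/cmdline. PARAM is used
# as a basic regular expression, so keep it to plain parameter names.
cmdline_has() {
    param=$1
    file=${2:-/proc/cmdline}
    [ -r "$file" ] || return 1
    # One word per line, then match "param" or "param=...".
    tr -s ' ' '\n' < "$file" | grep -q -e "^$param=" -e "^$param\$"
}

if cmdline_has acpi_mask_gpe; then
    echo "acpi_mask_gpe is set on the running kernel's command line"
fi
```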

linrunner commented 2 months ago

@iacchi

> Ok, so, I had a chance to reboot to 6.10.7 and the issue is now fixed,

Note to self: add 6.10.7 to the FAQ and BCVS.

> But it would have been nice if the driver maintainer could have pointed me to those couple of bugs...

Oh yeah.

alexyao2015 commented 2 months ago

Adding that I installed the Ubuntu mainline kernel 6.10.9 which also resolved the issue. Seems likely it was that linked patch which fixed it.

linrunner commented 2 months ago

FAQ updated. https://linrunner.de/tlp/faq/battery.html#lg-laptop-not-supported
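
To check whether a machine already runs a kernel at or above 6.10.7, the version reported as fixed in this thread, the release string can be compared with `sort -V`. A rough sketch, assuming GNU coreutils; `kernel_at_least` is a hypothetical helper name, and the version number alone is only a hint because distributions backport patches (as discussed above for Ubuntu's 6.8 kernel):

```shell
#!/bin/sh
# kernel_at_least CURRENT REQUIRED
# Hypothetical helper: succeeds if CURRENT is the same as or newer than
# REQUIRED. A distribution suffix such as "-48-generic" is dropped first.
kernel_at_least() {
    current=${1%%-*}
    required=$2
    # sort -V orders version strings; the first line is the older one.
    oldest=$(printf '%s\n%s\n' "$current" "$required" | sort -V | head -n 1)
    [ "$oldest" = "$required" ]
}

if kernel_at_least "$(uname -r)" 6.10.7; then
    echo "kernel $(uname -r) is 6.10.7 or newer"
else
    echo "kernel $(uname -r) is older than 6.10.7"
fi
```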

iacchi commented 2 months ago

Thank you! Maybe it's time to close this bug at this point?

linrunner commented 2 months ago

Gladly.

arbilgin commented 1 week ago

I have the same issue with LG Gram Ubuntu 24.04 and kernel 6.8.0-48.

iacchi commented 1 week ago

maybe get a more up-to-date kernel from backports or something like that? In newer kernels it's fixed.

arbilgin commented 1 week ago

Would it be a good practice to upgrade the kernel this way?

iacchi commented 1 week ago

I don't use ubuntu, but I'm sure they provide official backports packages for the kernel or something like that; maybe google it. If you get the kernel from official repos it'll be fine.

iacchi commented 1 week ago

we're getting fairly OT, but I had a look for you: Ubuntu 24.04 will probably get a kernel >6.8 with the 24.04.2 version next year. I don't know if the backports repo has the kernel you need (somehow the repo packages website doesn't show any kernel package at all...), but in all cases you'll want to look for the hwe version of the kernel (search in Synaptic). Alternatively, there's an official ppa for the upstream kernel (so, without Canonical's modifications to it) that already runs 6.9 or newer. In case you're ok with that, look for it, install the ppa and the package.

arbilgin commented 1 week ago

My concern is that such an update might cause other problems. I don't understand why Ubuntu 24 comes with an old kernel.

alexyao2015 commented 1 week ago

It comes with the latest LTS kernel which is missing support for the new laptop.

linrunner commented 1 week ago

@arbilgin Regarding the question of how to get a newer kernel for Ubuntu, please ask in an Ubuntu forum. You will find specific help there. This is not a bug in TLP, and the TLP issue tracker is not the right place for it. It is not a general discussion and help forum.

https://ubuntu.com/community/support

Thanks for your understanding.