linrunner / TLP

TLP - Optimize Linux Laptop Battery Life
https://linrunner.de/tlp
GNU General Public License v2.0

Race between `DEVICES_TO_ENABLE_ON_LAN_DISCONNECT` and `DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE` #667

Closed real-or-random closed 1 year ago

real-or-random commented 1 year ago

- [x] I've read and accepted the Bug Reporting Howto
- [x] I've provided all required tlp-stat outputs via Gist (see below)

Describe the bug

I had a setup where both DEVICES_TO_ENABLE_ON_LAN_DISCONNECT and DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE contained wifi. When I unplug my Thunderbolt dock, LAN (on the dock!) is disconnected and the system switches to BAT at the same time.

What happens then is that wifi gets enabled by DEVICES_TO_ENABLE_ON_LAN_DISCONNECT and tries to connect. But before it manages to establish a connection, it is disabled again by DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE.

Expected behavior

DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE does not disable wifi when it has just been enabled.

Maybe this could be done via a small delay. (But maybe this is ugly and/or creates races?)
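A minimal sketch of what the delay idea could look like, as a grace period rather than a sleep. This is not TLP code: the function names, the stamp file path, and the 30 second default are all hypothetical and only illustrate the idea.

```shell
#!/bin/sh
# Hypothetical helpers, not part of TLP. Path and timeout are assumptions.
STAMP_FILE="${STAMP_FILE:-/tmp/wifi-enabled.stamp}"
GRACE_SECONDS="${GRACE_SECONDS:-30}"

# Call right after wifi has been switched on (the LAN disconnect path).
mark_wifi_enabled() {
    date +%s > "$STAMP_FILE"
}

# Succeeds (exit 0) while wifi was switched on less than GRACE_SECONDS ago.
# The "disable when not in use" path would skip wifi in that case.
wifi_recently_enabled() {
    [ -r "$STAMP_FILE" ] || return 1
    enabled_at=$(cat "$STAMP_FILE")
    now=$(date +%s)
    [ $((now - enabled_at)) -lt "$GRACE_SECONDS" ]
}
```

This only helps if the enable event is handled before the disable event. If the two run in the opposite order or truly in parallel, the stamp may not exist yet when it is checked.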

Or it could possibly be done by having a nicer way of checking the "IN USE" status of wifi. For example, when checking via nmcli, it should be possible to figure out that the device is trying to establish a connection. (But that requires NetworkManager.)
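As a rough sketch of such a check, assuming NetworkManager is running: the terse output of `nmcli -t -f TYPE,STATE device status` has one `type:state` line per device, and a wifi device that is still associating reports a state starting with `connecting`. The function name below is hypothetical and this is not how TLP decides "in use".

```shell
#!/bin/sh
# Hypothetical helper, not part of TLP.
# Reads "type:state" lines on stdin, as printed by:
#   nmcli -t -f TYPE,STATE device status
# Succeeds (exit 0) if any wifi device is connected or still connecting.
wifi_connected_or_connecting() {
    while IFS=: read -r type state; do
        [ "$type" = "wifi" ] || continue
        case "$state" in
            connected*|connecting*) return 0 ;;
        esac
    done
    return 1
}
```

It would be used as `nmcli -t -f TYPE,STATE device status | wifi_connected_or_connecting`. Right after the radio is switched on the device may still report `disconnected` for a moment, so this check alone does not rule out the race.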

For now, I simply removed wifi from DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE...
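For illustration, the workaround looks roughly like this in the TLP configuration. The exact device lists are an assumption reconstructed from the debug log below, not a copy of my config file.

```shell
# TLP configuration fragment; the device lists are assumed, not verbatim.

# Switch wifi on when the LAN cable (here: the dock) is disconnected.
DEVICES_TO_ENABLE_ON_LAN_DISCONNECT="wifi"

# wifi removed from this list, so it is no longer switched off
# when the system goes on battery and the radio is not yet in use.
DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE="bluetooth wwan"
```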

Additional context

Debug log of issue

```
Dec 30 10:08:01 tlp[23555]: +++ rdw_nm(enp165s0).down
Dec 30 10:08:01 tlp[23555]: rdw_nm(enp165s0).down: type=ethernet [saved]
Dec 30 10:08:01 tlp[23555]: device_switch(wifi, on, rdw_nm_wifi, 2)
Dec 30 10:08:01 tlp[23555]: get_devc(wifi) = /sys/class/rfkill/rfkill0/state
Dec 30 10:08:01 tlp[23555]: get_devs(wifi) = 0
Dec 30 10:08:01 tlp[23555]: device_switch(wifi, on).rfkill
Dec 30 10:08:01 tlp[23555]: get_devs(wifi) = 1
Dec 30 10:08:01 tlp[23555]: invoke_nmcli(wifi, on).radio: rc=0
Dec 30 10:08:01 tlp[23555]: get_devs(wifi) = 1
Dec 30 10:08:01 tlp[23555]: device_switch(wifi, on).ok: rc=0
Dec 30 10:08:01 tlp[23585]: set_radio_device_states(1): enable= disable=wwan wifi bluetooth
Dec 30 10:08:01 tlp[23585]: device_switch(wwan, off, , )
Dec 30 10:08:01 tlp[23585]: get_devc(wwan).not_present
Dec 30 10:08:01 tlp[23585]: device_switch(wwan, off).no_device: rc=1
Dec 30 10:08:01 tlp[23585]: device_switch(wifi, off, , )
Dec 30 10:08:01 tlp[23585]: get_devc(wifi) = /sys/class/rfkill/rfkill0/state
Dec 30 10:08:01 tlp[23585]: get_devs(wifi) = 1
Dec 30 10:08:01 tlp[23585]: device_switch(wifi, off).rfkill
Dec 30 10:08:01 tlp[23585]: get_devs(wifi) = 0
Dec 30 10:08:01 tlp[23585]: invoke_nmcli(wifi, off).radio: rc=0
Dec 30 10:08:01 tlp[23585]: get_devs(wifi) = 0
Dec 30 10:08:01 tlp[23585]: device_switch(wifi, off).ok: rc=0
Dec 30 10:08:01 tlp[23585]: device_switch(bluetooth, off, , )
Dec 30 10:08:01 tlp[23585]: get_devc(bluetooth) = /sys/class/rfkill/rfkill1/state
Dec 30 10:08:01 tlp[23585]: get_devs(bluetooth) = 0
Dec 30 10:08:01 tlp[23585]: device_switch(bluetooth, off).desired_state
Dec 30 10:08:01 tlp[23585]: device_switch(bluetooth, off).no_change: rc=4
```

linrunner commented 1 year ago

Hi,

the scenario here is by no means clear or simple. The race is caused by the fact that your settings contain contradictory instructions for TLP.

Which setting - DEVICES_TO_ENABLE_ON_LAN_DISCONNECT or DEVICES_TO_DISABLE_ON_BAT_NOT_IN_USE - has priority? How is TLP supposed to decide what the user genuinely wants? Do all users have the same expectation about priority?

Delays, locking, or querying NM might produce new races, and they do not answer the question of what the user wanted. It is also not certain that the chronological order of the two events is always the same.

real-or-random commented 1 year ago

> the scenario here is by no means clear or simple.

I agree.

Closing because of your "won't fix" decision.