kdave / btrfsmaintenance

Scripts for btrfs maintenance tasks like periodic scrub, balance, trim or defrag on selected mountpoints or directories.
GNU General Public License v2.0

Better support for scheduling wrt AC power #42

Open kdave opened 6 years ago

kdave commented 6 years ago

This is forked from discussion in issue #29, cc @sten0

comio commented 6 years ago
  • ConditionACPower=yes in the timer units, because maintenance operations will needlessly deplete battery life on laptops, and servers that are on emergency UPS power should probably defer as well
  • Combined with anachron-like behaviour so that these tasks will run when AC power is restored

From my understanding, systemd silently skips the run when the condition is not met. One trick is to create a power-ac target and a battery target and inject a dependency on power-ac. This is not obvious, and we would also need udev rules to activate the targets on demand.

I will check for deferred task scheduling... "Persistent=true" could be useful, I think, and it is already used.

Persistent=

Takes a boolean argument. If true, the time when the service unit was last triggered is stored on disk. When the timer is activated, the service unit is triggered immediately if it would have been triggered at least once during the time when the timer was inactive. This is useful to catch up on missed runs of the service when the machine was off. Note that this setting only has an effect on timers configured with OnCalendar=. Defaults to false.
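A minimal sketch of how the two settings discussed so far could be applied as systemd drop-ins. The unit names (btrfs-scrub.timer, btrfs-scrub.service), the drop-in file names, and the helper name write_power_dropins are assumptions for illustration. ConditionACPower= is placed on the service here because conditions are checked when a unit is started, so that is where an individual run gets skipped.

```shell
#!/bin/sh
# Sketch only: write systemd drop-ins for one maintenance task.
# Assumed (not from the thread): unit names and drop-in file names.

# write_power_dropins ROOT TASK
#   ROOT: directory holding unit drop-ins (normally /etc/systemd/system)
#   TASK: unit base name, e.g. btrfs-scrub
write_power_dropins() {
    root=$1
    task=$2

    mkdir -p "$root/$task.timer.d" "$root/$task.service.d" || return 1

    # Catch up on runs that were missed while the timer was inactive.
    cat > "$root/$task.timer.d/persistent.conf" <<EOF
[Timer]
Persistent=true
EOF

    # Skip (not defer) a run that would start while on battery.
    cat > "$root/$task.service.d/ac-power.conf" <<EOF
[Unit]
ConditionACPower=true
EOF
}
```

It could be called as `write_power_dropins /etc/systemd/system btrfs-scrub` followed by `systemctl daemon-reload`. As discussed in this thread, the condition only skips a run; it does not queue it until AC power returns.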

sten0 commented 6 years ago

Hi Comio,

Thank you for all your work on this, and thank you for looking into deferred task scheduling! I had been planning to submit a naïve trivial patch adding ConditionACPower=yes. Do you know if it's silently skipped for all distributions, or if some already support the power-ac and battery targets?

Cheers, Nicholas


comio commented 6 years ago

Hi @sten0 ,

ConditionACPower=yes will just skip the trigger if you are on battery. I think that is not good for laptops because there is a high probability of skipping the job indefinitely. Regarding the power-ac and battery targets... at this time these are distro-specific. :/

comio commented 6 years ago

I had this idea:

Caveat: how can we check AC power in a portable way?

cat /sys/class/power_supply/*/status

Something like this could help:

while [ "$(cat "${STATUS_FILE}")" != "${AC_POWER_STATUS}" ]; do
   sleep "${SLEEP_TIMEOUT}"
done

sten0 commented 6 years ago

That looks ok to me, and your method also accommodates those who don't use systemd ;-) That said, the user who submitted a Debian bug report I received titled "Blindly assumes systemd" already has his own scripts, so I wonder whether systemd and openrc users are not the target audience...

Is there already a queuing mechanism that will prevent defrag, balance, and scrub from running simultaneously when the laptop is plugged in? One thing I like about the systemd approach is that it is supposed to allow stuff like defrag.service:Before=balance, balance.service:Before=scrub, which ought to prevent three IO-intensive jobs from being run at once. Assuming these actually work, it seems like an elegant approach. It's a shame there isn't a ConditionACPower=yes||wait/sleep...

Also "cat /sys/class/power_supply/*/status" might be better as: cat /sys/class/power_supply/AC/online (prints 1 when there is AC and 0 when unplugged) because I've seen cat /sys/class/power_supply/BAT?/status print Charging, Discharging, and Unknown.
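A rough sketch of the sysfs check described above, generalised so it does not depend on the supply being named AC. It assumes the kernel exposes a `type` file containing `Mains` next to the `online` file; the function name ac_state_sysfs and the three result words are made up for illustration.

```shell
#!/bin/sh
# Sketch only: derive the AC state from sysfs without hard-coding "AC".
# Assumed: each mains supply has a "type" file containing "Mains" and an
# "online" file containing 1 (plugged in) or 0 (unplugged).

# ac_state_sysfs [ROOT]
#   ROOT defaults to /sys/class/power_supply
#   prints: ac | battery | unknown
ac_state_sysfs() {
    ps_root=${1:-/sys/class/power_supply}
    ps_seen=no
    for ps_dir in "$ps_root"/*; do
        [ -r "$ps_dir/type" ] && [ -r "$ps_dir/online" ] || continue
        [ "$(cat "$ps_dir/type")" = "Mains" ] || continue
        ps_seen=yes
        if [ "$(cat "$ps_dir/online")" = "1" ]; then
            echo ac
            return 0
        fi
    done
    # A mains supply exists but none is online: on battery.
    # No mains supply found at all: state cannot be determined.
    if [ "$ps_seen" = yes ]; then echo battery; else echo unknown; fi
}
```

Reporting "unknown" separately leaves it to the caller to decide what to do on machines with no mains supply entry, such as many desktops and servers.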

comio commented 6 years ago

ConditionACPower is just a "skip-only" condition and cannot help. systemd will never add a "ConditionACPower=wait" or similar. To avoid multiple runs at once (but this is another question), I usually create a lock file/directory for the resource and wait on it.
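The lock-directory idea might look roughly like this. It is a sketch, not the code from any PR: the name with_lock, its argument order, and the one-second retry interval are assumptions. It relies on mkdir being atomic, so only one caller can create the directory.

```shell
#!/bin/sh
# Sketch only: serialise jobs with a lock directory.

# with_lock LOCKDIR TIMEOUT COMMAND [ARGS...]
#   Waits up to TIMEOUT seconds for LOCKDIR, runs COMMAND, removes the lock.
#   Returns 1 if the lock could not be taken, else COMMAND's exit status.
with_lock() {
    lock_dir=$1
    lock_timeout=$2
    shift 2
    lock_waited=0
    # mkdir fails if the directory already exists, i.e. someone holds the lock.
    until mkdir "$lock_dir" 2>/dev/null; do
        [ "$lock_waited" -ge "$lock_timeout" ] && return 1
        sleep 1
        lock_waited=$((lock_waited + 1))
    done
    "$@"
    lock_status=$?
    rmdir "$lock_dir"
    return "$lock_status"
}
```

For example, `with_lock /run/lock/btrfs-maint 3600 btrfs-scrub.sh` (path and script name are placeholders). The sketch does not clean up a stale lock left behind by a killed job.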

comio commented 6 years ago

Can we add this helper function to btrfsmaintenance-functions:

# function: wait_ac_power
# parameter: timeout in seconds (optional; 0 or unset waits indefinitely)
#
# Wait until AC power goes online.
# Returns 0 when AC is online or its state cannot be determined,
# 1 when the timeout expires while still on battery.
wait_ac_power() {
        local timecount=0
        local timeout=0
        [ -n "$1" ] && timeout=$1

        if [ -f /sys/class/power_supply/AC/online ]; then
                while [ "$(cat /sys/class/power_supply/AC/online)" -ne 1 ]; do
                        # AC is not online
                        [ "$timeout" -gt 0 ] && [ "$timecount" -ge "$timeout" ] && return 1
                        sleep 1
                        timecount=$((timecount+1))
                done
        fi
        return 0
}

I will prepare a PR soon

comio commented 6 years ago

See PR #45

Please review the code with your contributions.

sten0 commented 6 years ago

Continuing discussion from PR #45 :

@kdave wrote:

I'm not sure if the on_ac_power is generally available, so we'll need some fallback anyway, but systemd has /usr/lib/systemd/systemd-ac-power so we can cover most cases I think.

Systemd 232 (Debian 9/stable/stretch) doesn't seem to have systemd-ac-power, but 237 (available in backports) does. Also Ubuntu's 18.04 LTS release ships with systemd 234 which also provides it. Fedora is of course new enough, and RHEL disavowed btrfs support (does this mean CentOS too?). How is the support for systemd-ac-power on SLED and Leap?

So the way you implement it in 14c44e9 it will just wait for AC and if it does not show up, the task continues. Is this desired from the user's POV? Eg. should we make it more configurable:

  • if AC is not up after timeout, cancel the task (eg. skip balance)
  • if AC is not up, continue anyway (eg. run scrub, which is read-only and typically not as hungry as balance)

To reduce the number of configurable keys, but provide operation-specific granularity, maybe something with these?

SCRUB_ON_BATTERY_OR_UNKNOWN_AC_STATE=(yes || no || poll || wait)
BALANCE_ON_BATTERY_OR_UNKNOWN_AC_STATE=(yes || no || poll || wait)
DEFRAG_ON_BATTERY_OR_UNKNOWN_AC_STATE=(yes || no || poll || wait)
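One way the proposed per-operation keys could be interpreted, as a sketch. The function name ac_policy_action and the state words (ac, battery, unknown) are assumptions; anything other than an explicit yes, poll, or wait is treated as "no", so an unset key skips the task.

```shell
#!/bin/sh
# Sketch only: turn a *_ON_BATTERY_OR_UNKNOWN_AC_STATE value into an action.

# ac_policy_action POLICY STATE
#   POLICY: yes | no | poll | wait   (value of the configuration key)
#   STATE:  ac | battery | unknown   (result of whatever AC check is used)
#   prints: run | skip | poll | wait
ac_policy_action() {
    policy=${1:-no}
    state=$2
    # On AC power the policy does not matter.
    if [ "$state" = ac ]; then
        echo run
        return 0
    fi
    case $policy in
        yes)  echo run ;;
        poll) echo poll ;;
        wait) echo wait ;;
        *)    echo skip ;;   # "no", unset, or unrecognised values
    esac
}
```

A caller might use it as `action=$(ac_policy_action "$SCRUB_ON_BATTERY_OR_UNKNOWN_AC_STATE" "$state")` and branch on the result.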

"Wait" can be implemented later (should be event-driven), and I believe it's important that a "poll" option is explicit (eg: if btrfsmaintenance will poll, the key value should be called "poll"), so that users can make an informed decision whether or not they want something waking up their CPU until AC is restored. IMHO, in 2018, polling is poor design... Given the available solutions, yes || no should be sufficient. "Wait" might not be necessary either. I think the operation should default to not continuing for ON_BATTERY or UNKNOWN_AC_STATE, because a scrub, balance, or defrag might not complete before the battery runs out, resulting in one of 1) hibernation, 2) shutdown, or 3) unlucky catastrophic loss of power (eg: a battery pack with one bad cell); an unknown AC state carries the same risk.

@comio wrote

ConditionACPower is just a "skip-only" condition and cannot help. systemd will never add a "ConditionACPower=wait" or similar. To avoid multiple runs at once (but this is another question), I usually create a lock file/directory for the resource and wait on it.

Agreed, let's eliminate ConditionACPower=wait as a possibility. I just found an upstream issue, which may or may not still be current (https://github.com/systemd/systemd/issues/5969). If /usr/lib/systemd/systemd-ac-power is also not an option, here are four others that don't require reimplementing existing interfaces:

  1. anacron (should be configured to not run anything when on battery), but this is neither cron nor systemd... IMHO this would be preferable to existing cron support, but systemd systems typically don't have it installed. If it is configured by the distribution to use option 3, then it might not support UPSs.
  2. use a udev rule. @comio proposed this Jan 19th (whoever is packaging btrfsmaintenance for his/her distribution can coordinate with the appropriate team to provide this, and users who install from the github repository can copy a suggested example into place). This seems like a nice distribution-agnostic event-driven approach. No idea if it supports UPSs...
  3. Configurable key to call distribution-provided method. eg: Debian, Ubuntu, and derivatives have long provided /usr/bin/on_ac_power. That returns 0 for mains power, 1 when not on mains power, and 255 for indeterminate state. All distributions that don't use /usr/lib/systemd/systemd-ac-power should provide such a method. Definitely doesn't support UPSs.
  4. UPower? I think this is the only one that reliably supports UPS power status, which will eliminate the incidence of indeterminate AC states. Backwards compatible and distribution agnostic. The following project might be useful as a model: https://github.com/dywisor/batwatch
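To illustrate option 3 alongside systemd-ac-power, here is a sketch that maps a helper's exit status to a state. The function names helper_state and probe_ac_state and the kind labels are invented for this example. The exit codes for on_ac_power are the ones quoted above (0, 1, 255); systemd-ac-power is treated as 0 for AC and anything else for battery.

```shell
#!/bin/sh
# Sketch only: ask a distribution-provided helper for the AC state.

# helper_state KIND COMMAND
#   KIND "systemd": exit 0 = AC, anything else = battery
#   KIND "debian":  exit 0 = AC, 1 = battery, anything else = unknown
#   prints: ac | battery | unknown (unknown also if COMMAND is missing)
helper_state() {
    kind=$1
    helper=$2
    command -v "$helper" >/dev/null 2>&1 || { echo unknown; return 0; }
    "$helper" >/dev/null 2>&1
    rc=$?
    case "$kind:$rc" in
        *:0)       echo ac ;;
        debian:1)  echo battery ;;
        systemd:*) echo battery ;;
        *)         echo unknown ;;
    esac
}

# Try systemd-ac-power first, then fall back to on_ac_power.
# The systemd-ac-power path is the one mentioned in this thread;
# it may differ between distributions.
probe_ac_state() {
    state=$(helper_state systemd /usr/lib/systemd/systemd-ac-power)
    if [ "$state" = unknown ]; then
        state=$(helper_state debian on_ac_power)
    fi
    echo "$state"
}
```

Keeping the helper a parameter would also let a configuration key name a different command, as suggested in option 3.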

At any rate, I look forward to being able to responsibly enable scheduled scrubs by default in Debian's package once this issue is resolved :-) Thank you for working on it! @kdave what do you think of the UPower possibility? What method does SLED use to skip jobs when on UPS power?

fcrozat commented 6 years ago

openSUSE Leap 15 / SLE 15 (both SLES and SLED) are / will be shipping with systemd 234, systemd-ac-power is available there. There is no real point to supporting older Leap / SLE for this particular feature.

SLES and SLED 15 are similar regarding power handling. upower is available on both but will be installed by default only for graphical desktop installation. A pure text-only install of SLES will not install upower.

comio commented 6 years ago

Hi All,

I checked the sources of systemd-ac-power and they implement a very simple logic.

For this reason I backported into PR #45 the OpenRC script that does the same job. This way we have no dependency on systemd.

Regarding the poll strategy, I can investigate an implementation, but I'm a little busy with my job. BTW you can modify my branch (comio:wait_ac_power) to try new solutions.

Ciao

luigi


sten0 commented 6 years ago

@fcrozat, thank you for confirming this! Does SLES have a default system-level method for handling UPS events, or are apcupsd or Network UPS Tools (NUT) still needed? Is NUT installed by default? From what I can tell, the advantage of using upower is that it provides an event-driven interface that can be relied on to provide AC||battery state for both laptops and UPS-connected servers. Alternatively, I've figured out a way to handle the laptop case without polling, and NUT provides a nice event-driven interface, but it still seems necessary to fall back to polling for apcupsd. So the choice is between depending on upower in btrfsmaintenance and writing three methods for three different interfaces, of which apcupsd will require polling (which is probably OK). @kdave, what do you think? I'm willing to work on whatever you decide is most correct.
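As a rough idea of what the upower route could look like from a shell script: the sketch below parses the `on-battery:` line that `upower --dump` prints in its Daemon section. The function name upower_state is made up, and whether the daemon's on-battery flag reflects UPS state on a given system is exactly the open question here, so treat it as an assumption.

```shell
#!/bin/sh
# Sketch only: interpret `upower --dump` output read from stdin.
# Assumed: the dump contains a line such as "  on-battery:      no".

# upower_state
#   prints: ac | battery | unknown
upower_state() {
    awk '
        $1 == "on-battery:" {
            found = 1
            state = ($2 == "yes") ? "battery" : "ac"
            print state
            exit
        }
        END { if (!found) print "unknown" }
    '
}
```

An event-driven wrapper might re-check the state whenever `upower --monitor` prints a line, for example `upower --monitor | while read -r _; do upower --dump | upower_state; done`, instead of polling on a timer.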

Regarding minimising the maintenance burden in btrfsmaintenance, it sounds like systemd-ac-power can be depended on for reasonably up-to-date systemd systems, and distributions that support non-systemd inits tend to provide equivalent functionality with /usr/bin/on_ac_power in a package named something like powermgmt-base. @comio, do you know of any default cases where OpenRC is installed, but powermgmt-base is not? If so, does OpenRC provide its own helper script? @kdave, or would you prefer that btrfsmaintenance reimplements this logic as Comio suggests in this PR? @kdave, if you advocate depending on systemd-ac-power, on_ac_power, or the NUT or apcupsd equivalent, would you like to see this as a config key, or as a case structure in btrfsmaintenance-functions?

@comio, thanks again for your work on this issue so far! :-)

fcrozat commented 6 years ago

AFAIK, there is no system-level method to handle UPS events. Both apcupsd and nut are available on SLES but not installed by default.

GuillaumeSeren commented 5 years ago

Hey there, I have been using btrfsmaintenance on desktop/server and now I want to use it on my laptop too. I finally found this issue and the patches from @comio (thank you).

I rebased them against 0.4.1 and applied the patches in my portage. Ideally I would like an operation to wait until the machine is on AC before starting; the patch from @comio does this with a timer.

But I think there is another part not really covered by this: you are on AC and a task starts (maybe even multiple tasks), you don't notice it, and you unplug the AC. I would like to pause the running processes and resume them when back on AC. I know this does not seem easy, but btrfs supports that, so I think it would be great to have.
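A sketch of what pausing and resuming could look like with the existing btrfs subcommands (`balance pause`/`balance resume` and `scrub cancel`/`scrub resume`). The function name react_to_power, the state words, and the BTRFS override variable are assumptions; defrag is left out because this sketch does not know of an equivalent pause command for it.

```shell
#!/bin/sh
# Sketch only: react to a power state change for one mountpoint.
# BTRFS can be overridden (e.g. BTRFS=echo) for a dry run.
BTRFS=${BTRFS:-btrfs}

# react_to_power STATE MOUNTPOINT
#   STATE: ac | battery (anything else is ignored)
react_to_power() {
    state=$1
    mnt=$2
    case $state in
        battery)
            # These fail harmlessly if nothing is running on MOUNTPOINT.
            "$BTRFS" balance pause "$mnt"
            # scrub has no pause; cancel saves progress for a later resume.
            "$BTRFS" scrub cancel "$mnt"
            ;;
        ac)
            # Note: balance resume runs in the foreground until it finishes.
            "$BTRFS" balance resume "$mnt"
            "$BTRFS" scrub resume "$mnt"
            ;;
    esac
    return 0
}
```

Running it with `BTRFS=echo` prints the commands instead of executing them, which is a cheap way to check the wiring before letting it touch a real filesystem.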

sten0 commented 4 years ago

@kdave, so, would you like me to work on adding event-driven upower support, plus a config key (disabled by default)? Also, would you like btrfsmaintenance to store its state or unconditionally react to these upower events? The intent is to DTRT when a laptop is unplugged, replugged, unplugged mid-operation, etc, and also to conserve power when a server goes to backup power (UPS).