openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

monthly cron scrub like mdadm does #1006

Closed mailinglists35 closed 6 years ago

mailinglists35 commented 12 years ago

that would be cute, wouldn't it?

ryao commented 12 years ago

You could configure this yourself. Just put an entry into the root user's crontab.

behlendorf commented 12 years ago

If we end up doing something like this, it will almost certainly be controlled by the daemon described in issue #2.

FransUrbo commented 10 years ago

@behlendorf Add tag 'zed' to this?

mailinglists35 commented 9 years ago

can this be added now that zed is implemented? :)

behlendorf commented 9 years ago

There was some discussion about how one might go about this. Should the ZED itself contain cron-like functionality? That feels like feature creep. Should the ZFS kernel modules periodically create 'scrub' events when they think it's appropriate, maybe based on total I/O since the last scrub, or when checksum or I/O errors are detected? Or something else?

rlaager commented 8 years ago

I personally think a cron job is sufficient.

Here is the mechanism I wrote for Ubuntu 16.04 (which should make it into Debian eventually, if it hasn't already):

/etc/cron.d/zfsutils-linux

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin

# Scrub the second Sunday of every month.
24 0 8-14 * * root [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ] && /usr/lib/zfs-linux/scrub

/usr/lib/zfs-linux/scrub

#!/bin/sh -eu

# Scrub all healthy pools.
zpool list -H -o health,name 2>&1 | \
    awk 'BEGIN {FS="\t"} {if ($1 ~ /^ONLINE/) print $2}' | \
while read pool
do
    zpool scrub "$pool"
done

Greek64 commented 6 years ago

@rlaager Why are you using the "date +%w" command to determine the day of the week? Isn't the fifth field of the crontab entry meant for exactly that purpose? Basically you could write: 24 0 8-14 * 0 root [ -x /usr/lib/zfs-linux/scrub ] && /usr/lib/zfs-linux/scrub

I find it interesting that this was "pushed" so easily into Debian...

rincebrain commented 6 years ago

@Greek64 That also works, but I imagine that it originated from crontab(5), which (on Debian) says, in the examples list:

       # Run on every second Saturday of the month
       0 4 8-14 * *    test $(date +\%u) -eq 6 && echo "2nd Saturday"

Greek64 commented 6 years ago

Oh, that makes sense. Still ugly though...

rlaager commented 6 years ago

This is a similar approach to mdadm. It is necessary because "24 0 8-14 * 0" does not work the same. From crontab(5), "Note: The day of a command's execution can be specified by two fields — day of month, and day of week. If both fields are restricted (i.e., aren't *), the command will be run when either field matches the current time." (emphasis original).
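
To make the difference concrete, here is a minimal sketch that counts matching days under both rules. The helper names and the assumed 31-day month starting on a Sunday are illustrative only; nothing here is part of cron or ZFS.

```shell
#!/bin/sh
# Hypothetical helpers that mimic the two matching rules for a schedule
# with day-of-month 8-14 and day-of-week 0 (Sunday).

# cron's rule when both fields are restricted: either field may match.
cron_either_match() {   # $1 = day of month, $2 = day of week (0 = Sunday)
    { [ "$1" -ge 8 ] && [ "$1" -le 14 ]; } || [ "$2" -eq 0 ]
}

# The rule the date guard enforces: both conditions must hold.
guarded_match() {       # same arguments as above
    [ "$1" -ge 8 ] && [ "$1" -le 14 ] && [ "$2" -eq 0 ]
}

# Count matching days in an assumed 31-day month whose 1st is a Sunday.
count_days() {          # $1 = name of the matching function
    count=0
    day=1
    while [ "$day" -le 31 ]; do
        dow=$(( (day - 1) % 7 ))
        if "$1" "$day" "$dow"; then
            count=$((count + 1))
        fi
        day=$((day + 1))
    done
    echo "$count"
}

echo "either-field rule: $(count_days cron_either_match) days"
echo "guarded rule:      $(count_days guarded_match) days"
```

In that assumed month the either-field rule fires on 11 days (the 8th through the 14th plus every other Sunday), while the guarded rule fires only on the 8th.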

Greek64 commented 6 years ago

Those are details that never cease to amaze me. You live and learn...

behlendorf commented 6 years ago

Closing. The general consensus is that this is a job for cron.

behlendorf commented 6 years ago

@kpande are you thinking of something along the lines of ext4's tune2fs -i interval-between-checks option?

mailinglists35 commented 5 years ago

Can this be revisited, please? When I install mdadm, a cron task is automatically added without user intervention:

$ rpm -ql mdadm|grep cron
/etc/cron.d/raid-check

$ cat /etc/cron.d/raid-check
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check

But when I install zfs, no cron task is automatically added.

rincebrain commented 5 years ago

@mailinglists35 If it bothers you, go make a PR with either rlaager's example script that's installed on Debian systems and is 4 lines, or your alternative?

I don't think anyone's objected to the idea of a cron job for this, AFAICT it got closed because the issue was thought about in terms of having zed or similar trigger it, and the conclusion was that this is probably more a generic cronjob problem.

mailinglists35 commented 5 years ago

> @mailinglists35 If it bothers you, go make a PR with either rlaager's example script that's installed on Debian systems and is 4 lines, or your alternative?
>
> I don't think anyone's objected to the idea of a cron job for this, AFAICT it got closed because the issue was thought about in terms of having zed or similar trigger it, and the conclusion was that this is probably more a generic cronjob problem.

I think @rlaager's /usr/lib/zfs-linux/scrub script and /etc/cron.d/zfsutils-linux file from the Debian package are just fine, but I don't think I am the one able to do the PR properly.

May I kindly ask that they would be included by the authors in the zfs releases? Perhaps there are others more familiar with the code than me that can do it right.

rlaager commented 5 years ago

It’s even easier to delete or disable a cron job you don’t need. But I don’t have a personal interest here, as I use distro packages which already ship this.

mailinglists35 commented 5 years ago

If keeping defaults is desired, would it be ok for the two files to be shipped like in debian, with the modification of the cron file line commented out?

francoism90 commented 5 years ago

I would suggest adding a systemd unit to automatically scrub weekly or monthly; Btrfs provides them as btrfs-scrub@.

I have created and am using the following systemd timer:

# /etc/systemd/system/zfs-scrub@.timer
[Unit]
Description=Weekly zpool scrub on %i

[Timer]
OnCalendar=weekly
AccuracySec=1h
Persistent=true

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/zfs-scrub@.service
[Unit]
Description=zpool scrub on %i

[Service]
Nice=19
IOSchedulingClass=idle
KillSignal=SIGINT
ExecStart=/usr/bin/zpool scrub %i

This allows a user to enable scrubbing for a pool by enabling zfs-scrub@my-pool.timer.

What are your thoughts? :)

mailinglists35 commented 5 years ago

@francoism90

> What are your thoughts?

the discussion is about persuading ZoL to ship an enabled periodic scrub by default, not about how to technically deliver the checks (cron, systemd timers, or something else). so far the ZoL authors are opposed to this, so unfortunately it's quite irrelevant how to do it technically unless they change their opinions :(

rlaager commented 5 years ago

cron vs systemd vs both is another topic entirely.

But, since you posted it, here are a few comments:

  1. IOSchedulingClass=idle is useless and thus misleading. The only thing that is going to do is set the scheduling class for the zpool scrub command that signals the kernel to start the scrub. It's not going to affect the scrub itself. ZFS has separate controls for that.
  2. Likewise for Nice=19.
  3. Is there a particular reason you're setting KillSignal=SIGINT? Does zpool scrub not respond to SIGTERM?
  4. I think a weekly scrub is crazy overkill. Monthly should be more than sufficient.

francoism90 commented 5 years ago

@mailinglists35 Sorry, I misunderstood the discussion. However, I do agree about not enabling scrub by default; if I'm not mistaken, with other filesystems like XFS/Btrfs/etc. this also needs to be explicitly enabled.

@rlaager Thanks for the info. :) I'll change/remove these parameters.

Would it be an idea to ship some sort of systemd timer by default?

mailinglists35 commented 5 years ago

> However, I do agree about not enabling scrub by default; if I'm not mistaken, with other filesystems like XFS/Btrfs/etc. this also needs to be explicitly enabled.

this is one layer below the filesystem (zfs does not have a fsck) - it's not the filesystem consistency check, but the parity check that is done at the logical volume manager level. so a scrub would be the equivalent of an md check, not the equivalent of fsck.xfs

mdadm ships a systemd timer and most distributions enable a monthly check; suse also enables a maintenance script for btrfs which does a scrub. (lvm lacks such a script as it relies on md for raid levels). some hardware raid controllers also have a background parity check. freebsd ships a default disabled periodic config file that requires minimal editing from user (toggle no to yes).

ZFS Linux users are left without this option, and each one has to duplicate the work of setting up their own cron/systemd script. While I understand the intention of this, I still believe shipping even a disabled systemd or cron entry would make it easier for Linux end users to enable the periodic parity checks.

gmelikov commented 5 years ago

+1 to ship disabled cron job. IIRC Proxmox or Debian already ships it enabled, but I'm for disabled variant. +100 to https://github.com/zfsonlinux/zfs/issues/1006#issuecomment-475166385 .

But please don't enable it by default!

beren12 commented 5 years ago

So put some examples in /usr/share/docs/zfsonlinux and let the user/admin decide which to do. Right now I scrub every 2nd Sunday at 12:22am, but I'm thinking of adding a feature where different pools are scrubbed at different times, and anything not in the specific pool files will be scrubbed at a default time.

my crontab:

12 22 * * Sun root [ $(expr $(date +\%V) \% 2) -eq 0 ] && for i in $(zpool list -H -o name); do zpool scrub $i ; done

beren12 commented 5 years ago

I like this better than the 1st & 15th since they could be any day of the week.
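
For anyone copying the every-other-week idea, here is a small sketch of the parity test on its own. The helper name is_even_week is hypothetical, and the week numbers are passed in as arguments instead of being read from `date +%V`, so the result does not depend on the day it is run.

```shell
#!/bin/sh
# is_even_week is a hypothetical helper; $1 is an ISO week number
# in the two-digit form printed by `date +%V` (for example 08).
is_even_week() {
    # expr reads "08" and "09" as decimal numbers, whereas shell
    # arithmetic $(( )) may treat a leading zero as octal and reject them.
    [ "$(expr "$1" % 2)" -eq 0 ]
}

for week in 07 08 09 52 53 01; do
    if is_even_week "$week"; then
        echo "week $week: scrub"
    else
        echo "week $week: skip"
    fi
done
```

One caveat of week parity: in a year with 53 ISO weeks, weeks 53 and 01 are both odd, so the gap between scrubs stretches to three weeks across that new year.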

beren12 commented 5 years ago

I guess the $i should be quoted as "$i", in case someone has a pool with a space in the name.
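
Quoting "$i" protects the argument passed to zpool scrub, but a for loop over a command substitution has already split the names on whitespace by then. A rough sketch of the difference follows; list_pools is a hypothetical stand-in for `zpool list -H -o name` with made-up pool names, so it runs without ZFS installed.

```shell
#!/bin/sh
# list_pools is a hypothetical stand-in for `zpool list -H -o name`.
list_pools() {
    printf '%s\n' 'tank' 'backup pool'
}

# for loop over a command substitution: the shell splits the output on
# whitespace, so "backup pool" arrives as two separate words.
split_with_for() {
    for i in $(list_pools); do
        echo "[$i]"
    done
}

# while-read loop: one iteration per line, embedded spaces preserved.
split_with_read() {
    list_pools | while IFS= read -r pool; do
        echo "[$pool]"
    done
}

echo "for loop:"
split_with_for
echo "read loop:"
split_with_read
```

The while-read form is also what the Debian scrub script earlier in this thread uses.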

mailinglists35 commented 5 years ago

@beren12 for a better signal-to-noise ratio, let's keep the discussion of how to do this in a dedicated pull request. thanks @gmelikov for opening the possibility to submit a PR!

gmelikov commented 5 years ago

@mailinglists35 erm, it's off-topic, but nobody here has prevented anybody from opening PRs about this issue with a disabled cron (or similar) job.

stefarossi commented 5 years ago

Could you guys please consider adding a config option somewhere to disable the cron job? Maybe an entry in /etc/default/zfs?

I discovered by accident that, after upgrading the zfs packages, my pools were being scrubbed twice: once by a cronjob I wrote years ago, and once by this cronjob. I also have a non-mirrored pool which I'd rather not scrub, to avoid overburdening it. My cronjob excluded that pool, but this cronjob scrubs them all. I have now deleted /etc/cron.d/zfsutils-linux, but I suspect it will be added again the next time I upgrade the zfs packages.

rlaager commented 5 years ago

@stefarossi I believe that script is (still) shipped as part of the Debian package, and not upstream, so this is a request you'd have to take to Debian.
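
In the meantime, a locally maintained scrub script could skip selected pools. The following is only a sketch: SCRUB_EXCLUDE is a hypothetical variable, not an existing ZFS or Debian setting, and the pool names in the demo are made up. The filter reads the same "health, tab, name" lines that `zpool list -H -o health,name` prints.

```shell
#!/bin/sh
# SCRUB_EXCLUDE is a hypothetical, space-separated list of pool names to skip.
SCRUB_EXCLUDE="scratch"

# Reads "health<TAB>name" lines on stdin and prints the names of ONLINE
# pools that are not listed in SCRUB_EXCLUDE.
pools_to_scrub() {
    while IFS="$(printf '\t')" read -r health name; do
        if [ "$health" != "ONLINE" ]; then
            continue
        fi
        skip=no
        for excluded in $SCRUB_EXCLUDE; do
            if [ "$name" = "$excluded" ]; then
                skip=yes
            fi
        done
        if [ "$skip" = "no" ]; then
            echo "$name"
        fi
    done
    return 0
}

# Demo with canned input instead of a real `zpool list` call.
printf 'ONLINE\ttank\nDEGRADED\told\nONLINE\tscratch\n' | pools_to_scrub
```

On a real system the filter would sit between `zpool list -H -o health,name` and a while-read loop that calls zpool scrub "$pool". Because the exclusion list is space-separated, this sketch cannot exclude a pool whose name itself contains a space.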