jimsalterjrs / sanoid

These are policy-driven snapshot management and replication tools which use OpenZFS for underlying next-gen storage. (Btrfs support plans are shelved unless and until btrfs becomes reliable.)
http://www.openoid.net/products/
GNU General Public License v3.0

Sanoid creates multiple daily/hourly/monthly snapshots at seemingly random intervals #791

Closed: rokyo249 closed this issue 1 year ago

rokyo249 commented 1 year ago

Hi there,

I have a similar issue to this one (https://github.com/jimsalterjrs/sanoid/issues/526) with sanoid creating multiple hourly snapshots per hour, multiple dailies per day, and multiple monthlies per month. All of them are then labelled "pool/dataset@autosnap_date_time_hourly/daily/monthly" like so:

storage-slow/cloud@autosnap_2022-12-31_17:07:39_monthly                -   575K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_17:16:16_monthly                -   496K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_17:23:33_monthly                -  1.42M         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_18:16:09_monthly                -   400K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_19:13:09_monthly                -   288K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_20:11:09_monthly                -   288K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_20:20:17_monthly                -   288K         -       -              -          -
storage-slow/cloud@autosnap_2022-12-31_21:13:11_monthly                -   879K         -       -              -          -

This leads to several hundred partial monthlies per month: in December 2022, for example, 979 "_monthly" snapshots were created and kept. There are also several dozen partial dailies per day and 1-5 partial hourlies per hour.

I would have expected one hourly per hour, one daily per day, and one monthly per month instead.

My /etc/sanoid/sanoid.conf is the following:

[storage-fast]
        use_template = production
        recursive = yes
[storage-slow]
        use_template = production
        recursive = yes

[template_production]
        frequent_period = 60
        frequently = 0
        hourly = 8
        daily = 7
        monthly = 2
        yearly = 0
        autosnap = yes
        autoprune = yes

With frequent_period = 60, I assumed it would take only one hourly per hour.

My /lib/systemd/system/sanoid.service file looks like this:

[Unit]
Description=Snapshot ZFS Pool
Requires=zfs.target
After=zfs.target
Wants=sanoid-prune.service
Before=sanoid-prune.service
ConditionFileNotEmpty=/etc/sanoid/sanoid.conf

[Service]
Environment=TZ=UTC
Type=oneshot
ExecStart=/usr/sbin/sanoid --take-snapshots --verbose

and is started every minute by the systemd timer in /lib/systemd/system/sanoid.timer:

[Unit]
Description=Run Sanoid Every Minute

[Timer]
OnCalendar=*:0/1
Persistent=true

[Install]
WantedBy=timers.target

Via journalctl, I can see it running every minute and usually it just runs for a few seconds with the output:

Jan 11 15:47:03 storage01 systemd[1]: Starting Snapshot ZFS Pool...
Jan 11 15:47:05 storage01 sanoid[5162]: INFO: taking snapshots...
Jan 11 15:47:05 storage01 systemd[1]: sanoid.service: Succeeded.
Jan 11 15:47:05 storage01 systemd[1]: Started Snapshot ZFS Pool.

but on some occasions it does take snapshots; those runs usually last several minutes (5-20 min) and log all the snapshots taken (including the above-mentioned partial hourlies, dailies, and monthlies, stamped with the current time):

Jan 11 15:19:02 storage01 systemd[1]: Starting Snapshot ZFS Pool...
Jan 11 15:19:02 storage01 sanoid[7056]: INFO: taking snapshots...
Jan 11 15:19:02 storage01 sanoid[7056]: taking snapshot storage-fast/home/user1@autosnap_2023-01-11_14:19:02_monthly
Jan 11 15:19:02 storage01 sanoid[7056]: taking snapshot storage-fast/home/user1@autosnap_2023-01-11_14:19:02_daily
Jan 11 15:19:03 storage01 sanoid[7056]: taking snapshot storage-fast/home/user1@autosnap_2023-01-11_14:19:02_hourly
Jan 11 15:19:03 storage01 sanoid[7056]: taking snapshot storage-fast/home/user2@autosnap_2023-01-11_14:19:03_monthly
Jan 11 15:19:03 storage01 sanoid[7056]: taking snapshot storage-fast/home/user2@autosnap_2023-01-11_14:19:03_daily
Jan 11 15:19:03 storage01 sanoid[7056]: taking snapshot storage-fast/home/user2@autosnap_2023-01-11_14:19:03_hourly
...
several dozen more users
...
Jan 11 15:21:01 storage01 sanoid[7056]: taking snapshot storage-fast/home/userN@autosnap_2023-01-11_14:21:01_monthly
Jan 11 15:21:01 storage01 sanoid[7056]: taking snapshot storage-fast/home/userN@autosnap_2023-01-11_14:21:01_daily
Jan 11 15:21:01 storage01 sanoid[7056]: taking snapshot storage-fast/home/userN@autosnap_2023-01-11_14:21:01_hourly
Jan 11 15:21:01 storage01 sanoid[7056]: INFO: cache expired - updating from zfs list.
Jan 11 15:28:12 storage01 systemd[1]: sanoid.service: Succeeded.
Jan 11 15:28:12 storage01 systemd[1]: Started Snapshot ZFS Pool.

Since it does this at seemingly random times, I assume it is taking snapshots whenever something has actually changed in those directories (???) and skipping them when the data wasn't altered?

Or is this behavior only happening because I did not specify:

# hourly - top of the hour
hourly_min = 0
# daily - at 23:59 (most people expect a daily to contain everything done DURING that day)
daily_hour = 23
daily_min = 59
# weekly -at 23:30 each Monday
weekly_wday = 1
weekly_hour = 23
weekly_min = 30
# monthly - immediately at the beginning of the month (ie 00:00 of day 1)
monthly_mday = 1
monthly_hour = 0
monthly_min = 0
# yearly - immediately at the beginning of the year (ie 00:00 on Jan 1)
yearly_mon = 1
yearly_mday = 1
yearly_hour = 0
yearly_min = 0

like in the default config?

Codelica commented 1 year ago

FWIW, I'm seeing something similar and am not using systemd timers; I have a */15 cron job running /usr/sbin/sanoid --cron. My template is:

[template_production]
  frequent_period = 15
  frequently = 0
  hourly = 48
  daily = 7
  weekly = 4
  monthly = 2
  yearly = 0
  autosnap = yes
  autoprune = yes

Since frequently is set to 0 (like yours), I'd expect only hourly snapshots at the top of the hour -- outside of initial startup. Yet I also get seemingly random snapshots on the */15 runs. I'm assuming settings like hourly_min = 0 in the default conf file apply as defaults, but perhaps not?
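
Sanoid ships its built-in values in a separate defaults file, so you can check the effective timing defaults with something like the following (the path may vary by distribution):

grep -E 'hourly_min|daily_hour|daily_min|monthly_mday' /etc/sanoid/sanoid.defaults.conf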

jimsalterjrs commented 1 year ago

With frequent_period = 60, I assumed it would take only one hourly per hour.

This setting doesn't mean what you think it means. Frequent_period is a user-definable period for those who want automatic snapshots taken more than once per hour. What you've done is essentially tell your system that you want the period for frequent snapshots to be every 60 minutes... which has nothing to do with hourlies at all.
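
For illustration only, a template that actually uses frequent snapshots might look like the sketch below; the values are examples, not a recommendation:

[template_frequent_example]
        # take a frequent snapshot every 15 minutes and keep 4 of them,
        # in addition to the hourlies
        frequently = 4
        frequent_period = 15
        hourly = 8
        autosnap = yes
        autoprune = yes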

storage-slow/cloud@autosnap_2022-12-31_17:07:39_monthly - 575K - - - - storage-slow/cloud@autosnap_2022-12-31_17:16:16_monthly - 496K - - - - storage-slow/cloud@autosnap_2022-12-31_17:23:33_monthly - 1.42M - - - -

This is, essentially, a race condition. Your system is struggling to handle the load it's been given, and you've wound up with multiple Sanoid processes trying to take the same snapshot. Since the first one didn't finish, the next Sanoid process can't see that snapshot and attempts to take it again. Eventually, they all complete.

Basically you need to reduce the load on the system and/or reduce the number of times sanoid is invoked. If you've got many thousands of snapshots on the system, you'll get to the point where a simple zfs list -t snap takes several minutes to complete, rather than completing instantly or near-instantly. If this is happening to you, you'd be advised to reduce the number of snapshots you keep (and/or increase the RAM in your system, allowing you to keep more metadata cached). A CACHE vdev with zfs set secondarycache=metadata on your pool might also help here, but no guarantees on that.
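
As a sketch of those last two suggestions, assuming the pool from this issue and a placeholder device name:

zfs set secondarycache=metadata storage-slow
# attach a CACHE vdev (L2ARC); the device path here is an example only
zpool add storage-slow cache /dev/disk/by-id/nvme-EXAMPLE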

Via journalctl, I can see it running every minute and usually it just runs for a few seconds with the output:

"Running every minute" isn't a good idea if Sanoid can't finish generating a list of snapshots more quickly than that. Try dropping your systemd timer to run every 15 minutes, and see if that helps.

rokyo249 commented 1 year ago

Thanks a lot for your answer!

If you've got many thousands of snapshots on the system, you'll get to the point where a simple zfs list -t snap takes several minutes to complete

Yes, that is exactly what is happening. Completing that command takes several minutes and produces several thousand lines of output.

"Running every minute" isn't a good idea if Sanoid can't finish generating a list of snapshots more quickly than that. Try dropping your systemd timer to run every 15 minutes, and see if that helps.

I thought that using systemd timers instead of cron jobs would prevent this, since a timer should not fire again while the service it last started is still running, whereas cron simply starts a new invocation regardless (or so I thought).
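
(That is indeed how systemd oneshot services behave: the unit won't be started again while a previous invocation is still running. For the cron setup Codelica describes, a common guard is util-linux's flock; a sketch, with an example lock path:)

# /etc/cron.d/sanoid (sketch; skips the run if one is already in flight)
*/15 * * * * root flock -n /run/sanoid.lock /usr/sbin/sanoid --cron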

I will set "frequent_period = 0" in my /etc/sanoid/sanoid.conf and change my /lib/systemd/system/sanoid.timer to "OnCalendar=*:0/15" and see if that works!

EDIT: Will a systemctl reload sanoid suffice to apply the new sanoid.conf or will I need systemctl restart sanoid? I'll probably need a systemctl daemon-reload, too, for the timer, right? Can I do all of these if sanoid is possibly taking/pruning snapshots at that moment?

EDIT2: Yes, both commands worked fine! :-)
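
(Since sanoid.service is Type=oneshot, /etc/sanoid/sanoid.conf is re-read on every run and needs no reload; only the edited unit files need systemd's attention:)

systemctl daemon-reload          # pick up the edited sanoid.timer
systemctl restart sanoid.timer   # apply the new OnCalendar schedule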

rokyo249 commented 1 year ago

Your suggestions worked perfectly!

After setting "frequent_period = 0" and the systemd timer to 15 minutes, only hourly snapshots are taken at exactly the full hour.

Thanks a lot for the help!

putnam commented 1 year ago

I have run into this issue. I have many datasets with multiple very large pools on the same host. It doesn't really make sense to me that users should manually adjust timers and guess how long it might take. Shouldn't sanoid be atomic in nature and take care of multiple running instances rather than generate lots of incorrect/unnecessary snaps? It's also a feedback loop since this makes pruning take even longer.
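
One possible stopgap, assuming util-linux's flock is available: serialize runs at the service level so an overlapping timer firing is skipped rather than stacked. The lock path is an example, and flock -n exits non-zero if another run holds the lock:

[Service]
Environment=TZ=UTC
Type=oneshot
# skip this run entirely if a previous sanoid invocation still holds the lock
ExecStart=/usr/bin/flock -n /run/sanoid.lock /usr/sbin/sanoid --take-snapshots --verbose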