jimsalterjrs / sanoid

These are policy-driven snapshot management and replication tools which use OpenZFS for underlying next-gen storage. (Btrfs support plans are shelved unless and until btrfs becomes reliable.)
http://www.openoid.net/products/
GNU General Public License v3.0

`--recursive` option not behaving as expected (likely PEBKAC) #857

Open mddeff opened 1 year ago

mddeff commented 1 year ago

As the title says, this is likely a PEBKAC/ID10T issue, but I haven't been able to figure it out so I'm sending up a flare. Redirect me as appropriate.

Source:

[mike@fs01]~% uname -a
Linux fs01.svr.zeroent.net 4.18.0-147.5.1.el8_1.x86_64 #1 SMP Wed Feb 5 02:00:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[mike@fs01]~% zfs --version
zfs-0.8.3-1
zfs-kmod-0.8.3-1
[mike@fs01]~% zfs list dozer1 -o name -r
NAME
dozer1
dozer1/fast1
dozer1/fast1/k8s
dozer1/fast1/k8s/dmz
dozer1/fast1/k8s/internal
dozer1/fast1/spartek
dozer1/fast1/spartek/k8s1
dozer1/fast1/spartek/vms1
dozer1/fast1/vms3
dozer1/tank0
dozer1/tank0/data
dozer1/tank0/spartek
dozer1/tank0/spartek/vms1
dozer1/tank2
dozer1/tank2/docker
dozer1/tank2/docker/docker0
dozer1/tank2/iot
dozer1/tank2/nextcloud
dozer1/tank2/vms

(Inb4 CentOS 8 is dead and I'm running it on a storage array; it's on the to-do list. And I'm sure my drastically different versions of ZFS aren't the best either.)

All of those datasets are created, populated, and managed by syncoid running locally on the source, via my autosyncoid script.

Dest:

[root@backup01 ~]# cat /etc/redhat-release 
Rocky Linux release 9.2 (Blue Onyx)
[root@backup01 ~]# uname -a
Linux backup01.svr.zeroent.net 5.14.0-284.30.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Sep 16 09:55:41 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@backup01 ~]# zfs --version
zfs-2.1.13-1
zfs-kmod-2.1.13-1

On dest, I pre-created the dozer0/ze-fs01/dozer1 dataset and then ran:

syncoid --no-sync-snap --force-delete --recursive --no-privilege-elevation \
  zfsbackup@fs01.svr.zeroent.net:dozer1 \
  dozer0/ze-fs01/dozer1
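(The pre-creation itself was nothing more than a plain create on the destination, along the lines of the following; -p also creates the intermediate ze-fs01 dataset if it doesn't exist yet.)

zfs create -p dozer0/ze-fs01/dozer1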

It successfully creates dozer0/ze-fs01/dozer1/fast1 and then syncs dozer1/fast1/* to dozer0/ze-fs01/dozer1/fast1/*, recursively creating all the necessary child datasets. Then, when it gets to dozer1/tank2, it barfs:

INFO: Sending oldest full snapshot dozer1/tank2/vms@autosnap_2023-09-01_00:00:07_monthly (~ 195.6 GB) to new target filesystem:                                                                                                               
cannot open 'dozer0/ze-fs01/dozer1/tank2': dataset does not exist                                                                                                                                                                             
cannot receive new filesystem stream: unable to restore to destination                                                 
CRITICAL ERROR: ssh      -S /tmp/syncoid-zfsbackup@fs01.svr.zeroent.net-1697895831-1968 zfsbackup@fs01.svr.zeroent.net ' zfs send  '"'"'dozer1/tank2/vms'"'"'@'"'"'autosnap_2023-09-01_00:00:07_monthly'"'"' | mbuffer  -q -s 128k -m 16M' |  
zfs receive  -s -F 'dozer0/ze-fs01/dozer1/tank2/vms' failed: 256 at /usr/local/sbin/syncoid line 549.

So I manually created all of the child datasets on dest and then re-ran the same command, and now it seems to be working (ish):

CRITICAL: no snapshots exist on source dozer1/tank0, and you asked for --no-sync-snap.
NEWEST SNAPSHOT: autosnap_2023-10-28_17:00:00_hourly
Removing dozer0/ze-fs01/dozer1/tank0/data because no matching snapshots were found
NEWEST SNAPSHOT: autosnap_2023-10-28_17:00:00_hourly
INFO: Sending oldest full snapshot dozer1/tank0/data@syncoid_fs01.svr.zeroent.net_2022-10-12:07:17:08 (~ 3450.1 GB) to new target filesystem:
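(For the record, "manually created all of the child datasets" just means plain zfs create calls on the destination for everything in the source tree that syncoid had skipped, roughly like this:)

# on backup01: hand-create the missing datasets so the later receives
# have parents to land in
zfs create -p dozer0/ze-fs01/dozer1/tank0/data
zfs create -p dozer0/ze-fs01/dozer1/tank0/spartek/vms1
zfs create -p dozer0/ze-fs01/dozer1/tank2/docker/docker0
# ...and so on for the rest of the tree in the source listing above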

While fast1 has snapshots (everything under it shares the same retention policy, so I have sanoid snap the whole dataset recursively), the tank* datasets do not, since they have mixed usage and snapshots are only taken on their child datasets.

It looks like when a dataset has no snapshots (and --no-sync-snap is in use), syncoid doesn't sync it, but it also never gets created on the target to support the creation of child datasets that do have snapshots.

Is this behavior expected or have I found an edge case?

As always, thank you to Jim and the {san,sync,find}oid contributors that enable enterprise-grade storage/backup for the FOSS community!

jimsalterjrs commented 1 year ago

Correct. If you ask for no sync snap and your source has no snapshots, it's impossible to replicate. Literally impossible, since replication is based on snapshots.

I generally recommend not excluding empty parent datasets from your snapshot policy. The empty snapshots don't really cost you anything, and they keep you from having problems like this.
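As a rough sketch of what that looks like in sanoid.conf (template name and retention numbers here are placeholders, not a recommendation for your data):

[dozer1/tank2]
        use_template = production
        recursive = yes

[template_production]
        hourly = 36
        daily = 30
        monthly = 3
        yearly = 0
        autosnap = yes
        autoprune = yes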

mddeff commented 1 year ago

Yep, totally understood. It's not so much a space thing; rather, I'm creating the snapshots at the child dataset level because the children have different uses (and consequently different retention policies).

For instance, tank2/vms and tank2/iot have different needs for retention, so the policies are different.

Would it be better to:

A) Set a general Sanoid snapshot policy for tank2 that is the union of the policies for vms and iot, and then have delta policies for each of those child datasets? (This feels error-prone, having two different policies creating the set of snapshots for a single dataset, but I could be overthinking it.)

B) Point Syncoid at tank2/vms and tank2/iot separately? (Sketch after this list.)

C) Just do what I did: create the blank datasets on the target by hand and let Syncoid take it from there?
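To make B concrete, something along these lines, run on backup01 (the tank2 parent still has to exist on the target first, since zfs receive won't create it):

# make sure the shared parent exists on the target
zfs create -p dozer0/ze-fs01/dozer1/tank2

syncoid --no-sync-snap --force-delete --no-privilege-elevation \
  zfsbackup@fs01.svr.zeroent.net:dozer1/tank2/vms \
  dozer0/ze-fs01/dozer1/tank2/vms

syncoid --no-sync-snap --force-delete --no-privilege-elevation \
  zfsbackup@fs01.svr.zeroent.net:dozer1/tank2/iot \
  dozer0/ze-fs01/dozer1/tank2/iot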

When I originally read the --recursive option, my brain auto-completed a mkdir -p type behavior. Is there any reason that behavior would be unwanted? Effectively, it would just create blank datasets on the target to support the tree structure of the datasets that do have snapshots to be transferred. Maybe a new flag?

fiesh commented 1 month ago

We have the same issue. We have one system, storage0, that all sorts of systems sync to, and it then replicates to a second system, storage1. But since we want to keep all snapshot creation local to the sending machines and all snapshot removal local to the receiving machines (so that a snapshot bug can't propagate automatically), no automatic snapshots are taken on storage0. Syncing storage0 to storage1 now runs into this issue, since we cannot simply sync the (unsnapshotted) root dataset.

mkdir -p / zfs create -p behavior would be much easier to deal with.
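Until then, the equivalent can be scripted by hand before each run, roughly like this (pool names are placeholders, and this assumes a pull-style sync run from storage1):

# mirror the source dataset tree onto the receiver so every child receive
# has a parent to land in; with -p, zfs create also succeeds when the
# dataset already exists
ssh storage0 zfs list -H -o name -r tank \
  | sed 's|^|backup/storage0/|' \
  | xargs -n1 zfs create -p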