psy0rz / zfs_autobackup

ZFS autobackup is used to periodically back up ZFS filesystems to other locations. Easy to use and very reliable.
https://github.com/psy0rz/zfs_autobackup
GNU General Public License v3.0

encryptionroot is not preserved when sending hierarchy of encrypted datasets #236

Closed digitalsignalperson closed 3 weeks ago

digitalsignalperson commented 8 months ago

Example using master commit 6d4f22b69edc87442b7e199d6b86684eb41c2943

  1. Create encryption root and a dataset within it
zfs create -o canmount=off -o mountpoint=none -o encryption=on -o keylocation=prompt -o keyformat=passphrase xpool/enc2
zfs create xpool/enc2/test
  2. Take snapshots and send the hierarchy with zfs-autobackup
zfs snapshot xpool/enc2@snap0
zfs snapshot xpool/enc2/test@snap1
zfs set autobackup:enc2test=true xpool/enc2

# new destination:
zfs create xpool/enc3

zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    --filter-properties mountpoint \
    enc2test \
    xpool/enc3
  3. Observe that zfs load-key -r prompts for a passphrase for the original encryption root, as well as for each child dataset

Note that the encryptionroot property is received incorrectly.

Originally, zfs get encryptionroot -r xpool/enc2 shows:

NAME                   PROPERTY        VALUE       SOURCE
xpool/enc2             encryptionroot  xpool/enc2  -
xpool/enc2@snap0       encryptionroot  xpool/enc2  -
xpool/enc2/test        encryptionroot  xpool/enc2  -
xpool/enc2/test@snap1  encryptionroot  xpool/enc2  -

Received, zfs get encryptionroot -r xpool/enc3 shows:

NAME                        PROPERTY        VALUE                 SOURCE
xpool/enc3                  encryptionroot  -                     -
xpool/enc3/enc2             encryptionroot  xpool/enc3/enc2       -
xpool/enc3/enc2@snap0       encryptionroot  xpool/enc3/enc2       -
xpool/enc3/enc2/test        encryptionroot  xpool/enc3/enc2/test  -
xpool/enc3/enc2/test@snap1  encryptionroot  xpool/enc3/enc2/test  -
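
A minimal Python sketch of how the mismatch above could be detected automatically. It assumes the input comes from zfs get -H -o name,value -r encryptionroot (tab-separated, no header); the function name find_split_roots is hypothetical. It flags any dataset that is its own encryption root while its parent is also encrypted. That pattern can be legitimate if a child was deliberately given its own key, so treat the result as a list of suspects only.

```python
def find_split_roots(zfs_get_output):
    """Return datasets that are their own encryption root under an encrypted parent.

    Expects the output of:
        zfs get -H -o name,value -r encryptionroot <dataset>
    """
    roots = {}
    for line in zfs_get_output.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t")[:2]
        if "@" in name:
            continue  # snapshots just mirror their dataset
        roots[name] = value

    suspects = []
    for name, root in roots.items():
        parent = name.rpartition("/")[0]
        # Unknown parents and unencrypted parents both count as "-"
        if root == name and roots.get(parent, "-") != "-":
            suspects.append(name)
    return suspects


if __name__ == "__main__":
    sample = "\n".join([
        "xpool/enc3\t-",
        "xpool/enc3/enc2\txpool/enc3/enc2",
        "xpool/enc3/enc2@snap0\txpool/enc3/enc2",
        "xpool/enc3/enc2/test\txpool/enc3/enc2/test",
        "xpool/enc3/enc2/test@snap1\txpool/enc3/enc2/test",
    ])
    print(find_split_roots(sample))
```

Here xpool/enc3/enc2 is not flagged because its parent xpool/enc3 is unencrypted, so it is a genuine encryption root.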

Commands executed by zfs_autobackup, per zpool history xpool:

2023-12-18.18:18:50 zfs create -o canmount=off -o mountpoint=none -o encryption=on -o keylocation=prompt -o keyformat=passphrase xpool/enc2
2023-12-18.18:18:57 zfs create xpool/enc2/test
2023-12-18.18:19:01 zfs snapshot xpool/enc2@snap0
2023-12-18.18:19:21 zfs snapshot xpool/enc2/test@snap1
2023-12-18.18:19:25 zfs set autobackup:enc2test=true xpool/enc2
2023-12-18.18:20:29 zfs create xpool/enc3
2023-12-18.18:20:31 zfs recv -u -x mountpoint -o canmount=noauto -v -s xpool/enc3/enc2
2023-12-18.18:20:31 zfs recv -u -x mountpoint -o canmount=noauto -v -s xpool/enc3/enc2/test

So I discovered this after I'd sent a few TB of datasets, and each received dataset is its own encryption root. Trying to find info in the openzfs issues, I'm not totally clear how this is supposed to work.

Maybe we have to hit this recv_fix_encryption_hierarchy function which is only triggered for a recursive send/receive?

https://github.com/openzfs/zfs/commit/bb61cc31851098ab41a7dcb56333a90b40d65129#diff-ade451cd0b2212cb5979053cf5202e98ff65d5e2283431841b28ca17770a3fd0

But does that mean you can never update just one dataset at a time, you have to batch send the whole thing with -R at once?

I think you can "fix" the received datasets by unlocking them and running zfs change-key -i on each. But there also seem to be bugs related to that, with a high chance of data loss: https://github.com/openzfs/zfs/issues/12123

Indeed, if I run zfs change-key -i xpool/enc3/enc2/test, then the output of zfs get encryptionroot -r xpool/enc3 is fixed:

NAME                        PROPERTY        VALUE            SOURCE
xpool/enc3                  encryptionroot  -                -
xpool/enc3/enc2             encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2@snap0       encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2/test        encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2/test@snap1  encryptionroot  xpool/enc3/enc2  -

Sending a new change seems to keep this fixed encryption root

zfs unload-key -r xpool/enc3
zfs snapshot xpool/enc2/test@snap2
zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    --filter-properties mountpoint \
    enc2test \
    xpool/enc3

zfs get encryptionroot -r xpool/enc3
NAME                        PROPERTY        VALUE            SOURCE
xpool/enc3                  encryptionroot  -                -
xpool/enc3/enc2             encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2@snap0       encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2/test        encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2/test@snap1  encryptionroot  xpool/enc3/enc2  -
xpool/enc3/enc2/test@snap2  encryptionroot  xpool/enc3/enc2  -

But any newly created child datasets are still not received with the correct encryption root:

zfs create xpool/enc2/test2
zfs snapshot xpool/enc2/test2@asdf
zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    --filter-properties mountpoint \
    enc2test \
    xpool/enc3

zfs get encryptionroot -r xpool/enc3
NAME                        PROPERTY        VALUE                  SOURCE
xpool/enc3                  encryptionroot  -                      -
xpool/enc3/enc2             encryptionroot  xpool/enc3/enc2        -
xpool/enc3/enc2@snap0       encryptionroot  xpool/enc3/enc2        -
xpool/enc3/enc2/test        encryptionroot  xpool/enc3/enc2        -
xpool/enc3/enc2/test@snap1  encryptionroot  xpool/enc3/enc2        -
xpool/enc3/enc2/test@snap2  encryptionroot  xpool/enc3/enc2        -
xpool/enc3/enc2/test2       encryptionroot  xpool/enc3/enc2/test2  -
xpool/enc3/enc2/test2@asdf  encryptionroot  xpool/enc3/enc2/test2  -

digitalsignalperson commented 8 months ago

I think the solution is to always send the dataset together with its encryption root using -R, optionally using -X to exclude any datasets not in the current selection.

related issue for -R support https://github.com/psy0rz/zfs_autobackup/issues/36

psy0rz commented 8 months ago

can you run with --debug and copy paste the zfs send/recv line? i want to make sure we're not explicitly filtering out properties.

digitalsignalperson commented 8 months ago

zfs-autobackup -v \
    --no-holds \
    --no-thinning \
    --no-snapshot \
    --other-snapshots \
    --min-change 1 \
    --strip-path=1 \
    --clear-mountpoint \
    --filter-properties mountpoint \
    --debug \
    enc2test \
    xpool/enc3
  zfs-autobackup v3.3-beta.2 - (c)2022 E.H.Eefting (edwin@datux.nl)

  NOTE: Source and target are on the same host, excluding target-path from selection.

  Current time               : 2023-12-20 02:30:39
  Selecting dataset property : autobackup:enc2test
  Snapshot format            : enc2test-%Y%m%d%H%M%S
  Timezone                   : Local

  #### Source settings

  #### Selecting
# [Source] Getting selected datasets
# [Source] CMD    > (zfs get -t volume,filesystem -o name,value,source -H autobackup:enc2test)
# [Source] xpool/enc2: Checking if dataset is changed
  [Source] xpool/enc2: Selected
# [Source] xpool/enc2/test: Checking if dataset is changed
  [Source] xpool/enc2/test: Selected

  #### Target settings
  [Target] Receive datasets under: xpool/enc3

  #### Synchronising
# [Target] xpool/enc3: Checking if dataset exists
# [Target] CMD    > (zfs list xpool/enc3)
# Checking target names:
# [Source] xpool/enc2: -> xpool/enc3/enc2
# [Source] xpool/enc2/test: -> xpool/enc3/enc2/test
# [Source] zpool xpool: Getting zpool properties
# [Source] CMD    > (zpool get -H -p all xpool)
# [Target] zpool xpool: Getting zpool properties
# [Target] CMD    > (zpool get -H -p all xpool)
# [Source] xpool/enc2: Getting zfs properties
# [Source] CMD    > (zfs get -H -o property,value -p all xpool/enc2)
# [Target] xpool/enc3/enc2: Determining start snapshot
# [Target] xpool/enc3/enc2: Checking if dataset exists
# [Target] CMD    > (zfs list xpool/enc3/enc2)
# [Target] STDERR > cannot open 'xpool/enc3/enc2': dataset does not exist
# [Source] xpool/enc2: Dataset should exist
# [Source] xpool/enc2: Getting snapshots
# [Source] CMD    > (zfs list -d 1 -r -t snapshot -H -o name xpool/enc2)
# [Target] xpool/enc3/enc2: Creating virtual target snapshots
# [Source] xpool/enc2@snap0: Transfer snapshot to xpool/enc3/enc2
  [Source] xpool/enc2@snap0: -> xpool/enc3/enc2 (new)
# [Source] CMD    > (zfs send -L zfs_autobackup_option_test)
# [Source] STDERR > cannot open 'zfs_autobackup_option_test': dataset does not exist
# [Source] CMD    > (zfs send -e zfs_autobackup_option_test)
# [Source] STDERR > cannot open 'zfs_autobackup_option_test': dataset does not exist
# [Source] CMD    > (zfs send -c zfs_autobackup_option_test)
# [Source] STDERR > cannot open 'zfs_autobackup_option_test': dataset does not exist
# [Target] CMD    > (zfs recv -s zfs_autobackup_option_test)
# [Target] STDERR > cannot receive: failed to read from stream
# [Target] xpool/enc3/enc2@snap0: Enabled resume support
# [Target] CMD    > (zfs send -L -e --raw -v -P -p xpool/enc2@snap0) | (zfs recv -u -x mountpoint -o canmount=noauto -v -s xpool/enc3/enc2)
# [Source] STDERR > full        xpool/enc2@snap0        71280
# [Source] STDERR > size        71280
# [Target] xpool/enc3/enc2@snap0: Checking if dataset exists
# [Target] CMD    > (zfs list xpool/enc3/enc2@snap0)
# [Target] xpool/enc3/enc2: Auto mounting
# [Target] xpool/enc3/enc2: Getting zfs properties
# [Target] CMD    > (zfs get -H -o property,value -p all xpool/enc3/enc2)
# [Source] xpool/enc2/test: Getting zfs properties
# [Source] CMD    > (zfs get -H -o property,value -p all xpool/enc2/test)
# [Target] xpool/enc3/enc2/test: Determining start snapshot
# [Target] xpool/enc3/enc2/test: Checking if dataset exists
# [Target] CMD    > (zfs list xpool/enc3/enc2/test)
# [Target] STDERR > cannot open 'xpool/enc3/enc2/test': dataset does not exist
# [Source] xpool/enc2/test: Dataset should exist
# [Source] xpool/enc2/test: Getting snapshots
# [Source] CMD    > (zfs list -d 1 -r -t snapshot -H -o name xpool/enc2/test)
# [Target] xpool/enc3/enc2/test: Creating virtual target snapshots
# [Source] xpool/enc2/test@snap1: Transfer snapshot to xpool/enc3/enc2/test
  [Source] xpool/enc2/test@snap1: -> xpool/enc3/enc2/test (new)
# [Target] xpool/enc3/enc2/test@snap1: Enabled resume support
# [Target] CMD    > (zfs send -L -e --raw -v -P -p xpool/enc2/test@snap1) | (zfs recv -u -x mountpoint -o canmount=noauto -v -s xpool/enc3/enc2/test)
# [Source] STDERR > full        xpool/enc2/test@snap1   71280
# [Source] STDERR > size        71280
# [Target] xpool/enc3/enc2/test@snap1: Checking if dataset exists
# [Target] CMD    > (zfs list xpool/enc3/enc2/test@snap1)
# [Target] xpool/enc3/enc2/test: Auto mounting
# [Target] xpool/enc3/enc2/test: Getting zfs properties
# [Target] CMD    > (zfs get -H -o property,value -p all xpool/enc3/enc2/test)

As far as I could figure out, to preserve encryptionroot, the first send has to use -R on the encryption root and include the children; subsequent sends can then be of the individual child datasets. Later, any new dataset within an encryption root always needs to be sent with -R, or else it will be received as its own encryption root. The zfs send -X option might help here: you could manually exclude every dataset from the -R package except for the encryptionroot dataset and whichever ones are in your replication selection.
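
A rough Python sketch of how such an exclusion list could be computed. This is not how zfs_autobackup works today; the function name and arguments are hypothetical, and it only builds the argument list without running anything. It assumes an OpenZFS version whose zfs send supports -X together with -R, and that the named snapshot exists on the datasets being sent. A dataset is excluded only if neither it nor any of its descendants is selected, since -X also drops descendants.

```python
def build_send_command(root, snapshot, descendants, selected):
    """Build a 'zfs send --raw -R -X ...' argument list for an encryption root."""

    def is_under(child, parent):
        return child == parent or child.startswith(parent + "/")

    excluded = []
    for ds in sorted(descendants):
        if ds == root:
            continue  # the root of the replication stream cannot be excluded
        if any(is_under(s, ds) for s in selected):
            continue  # ds is selected, or is an ancestor of a selected dataset
        if any(is_under(ds, e) for e in excluded):
            continue  # already covered by an excluded ancestor
        excluded.append(ds)

    cmd = ["zfs", "send", "--raw", "-R"]
    if excluded:
        cmd += ["-X", ",".join(excluded)]
    cmd.append(f"{root}@{snapshot}")
    return cmd


if __name__ == "__main__":
    print(build_send_command(
        "xpool/enc2",
        "snap",
        ["xpool/enc2/test", "xpool/enc2/test2", "xpool/enc2/other", "xpool/enc2/other/sub"],
        {"xpool/enc2", "xpool/enc2/test2"},
    ))
```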

I opened an issue in the openzfs repo to get clarity in the docs or otherwise change the behaviour

psy0rz commented 8 months ago

ugh that sucks, i hope there is some other way

psy0rz commented 8 months ago

what happens when you just zfs inherit encryptionroot ... on that dataset?

or is it only possible to use zfs change-key -i to do that? (possibly triggering that zfs bug)

digitalsignalperson commented 8 months ago

Trying to inherit gives an "encryptionroot property is read-only" error, and similarly it can't be set with zfs set. Yeah, I think change-key is the only way to "fix" it.

digitalsignalperson commented 8 months ago

I did some testing and didn't get any issues with zfs change-key -i. But it's pretty annoying to need to do that.

psy0rz commented 8 months ago

yeah it is, you could use some "zfs list ..|xargs zfs change-key -i" magic perhaps?

digitalsignalperson commented 8 months ago

Yeah for sure some kind of fixup script could be made. It would also have to load the key for each dataset for zfs change-key -i to work (which seems unnecessary given the encryptionroot has the same IV set). And I may not want to load the key on some servers (that was an advantage of ZFS encryption that the datasets and snapshots can be managed without loading the keys), but I guess in those cases you could leave the incorrect encryptionroot since you won't be mounting them locally.
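
A sketch of what the planning half of such a fixup script might look like in Python. It is a dry run only: it prints the commands and executes nothing. The function name plan_fixup is hypothetical. It assumes the key of the intended encryption root is already loaded, that each listed dataset's key is not yet loaded (zfs load-key will prompt for the passphrase), and that shallower datasets should be fixed before deeper ones.

```python
import shlex


def plan_fixup(datasets):
    """Return the zfs commands that would re-inherit the parent's key, parents first."""
    commands = []
    for ds in sorted(datasets, key=lambda d: d.count("/")):
        commands.append(["zfs", "load-key", ds])
        commands.append(["zfs", "change-key", "-i", ds])
    return commands


if __name__ == "__main__":
    for cmd in plan_fixup(["xpool/enc3/enc2/test", "xpool/enc3/enc2/test2"]):
        print(shlex.join(cmd))
```

Printing first makes it easy to review the list before piping it to a shell, which seems prudent given the change-key bug reports mentioned earlier in the thread.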

psy0rz commented 8 months ago

yeah, you would only need to do that if you want to access the backups on the backup server. so it's fine in that case.

however if you use it as a replication tool it can be quite annoying i think. In that case you want the source and target to match as closely as possible regarding properties, which they won't by default because of this issue.

digitalsignalperson commented 8 months ago

Indeed, it would be great to just have it be the same on both sides. I started looking at how it could be fixed in libzfs_sendrecv.c. I think zfs_receive_one() needs a pathway to call lzc_change_key(fsname, DCP_CMD_FORCE_INHERIT, ...), either directly or through recv_fix_encryption_hierarchy(). I tried a one-line patch but no luck so far (https://github.com/openzfs/zfs/issues/15687#issuecomment-1872665919), so I would have to poke around more. Also, I wonder: shouldn't it be easy to confirm that a child dataset has the same IV set or the same key as the parent?

A workaround I'm considering is for the initial replication to send 1000 dataset "slots" like rpool/enc/0000...rpool/enc/9999 to every server, so that I never have to add a new dataset and would never have to run zfs change-key -i. I could map the actual dataset names in zfs props or in a controlled file/script that handles mounting, and unused datasets would simply not be mounted. "Deleting" a dataset is a little weird unless you can roll back to a snapshot of the initial empty state.

digitalsignalperson commented 7 months ago

I was trying to patch zfs-utils, but that still has issues.

It turns out there is already a solution and it could be easily added to zfs_autobackup if the system has pyzfs included.

Basically

import libzfs_core
libzfs_core.lzc_change_key(b"testpool/enc_copy/data3", "force_inherit")

where testpool/enc_copy/data3 is the received dataset. I'm curious why the zfs cli doesn't expose this option with zfs change-key, and wonder what else might be useful in the pyzfs functions.
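
A small sketch of how that call could be wrapped so it is easier to reuse and to test without a pool. The wrapper name and the change_key parameter are hypothetical; the only real API used is libzfs_core.lzc_change_key with the "force_inherit" command shown above, and the import is deferred so the module still loads on systems without pyzfs.

```python
def force_inherit_encryption_root(dataset, change_key=None):
    """Make `dataset` inherit its parent's encryption root via pyzfs.

    `change_key` exists only so the function can be exercised without pyzfs;
    by default the real libzfs_core.lzc_change_key is used.
    """
    if change_key is None:
        import libzfs_core  # provided by pyzfs; imported lazily on purpose
        change_key = libzfs_core.lzc_change_key
    # pyzfs takes dataset names as bytes
    change_key(dataset.encode("utf-8"), "force_inherit")


if __name__ == "__main__":
    # Dry run with a stand-in callable instead of the real library call
    force_inherit_encryption_root(
        "testpool/enc_copy/data3",
        change_key=lambda name, cmd: print(name, cmd),
    )
```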

Full example:

# Create test pool
dd if=/dev/zero of=/root/zpool bs=1M count=128
zpool create testpool /root/zpool

# Create encryptionroot and some datasets
echo "12345678" | zfs create -o canmount=off -o mountpoint=/mnt -o encryption=on -o keylocation=prompt -o keyformat=passphrase testpool/enc
zfs create testpool/enc/data1
zfs create testpool/enc/data2
touch /mnt/data1/x
touch /mnt/data2/x
zfs snapshot -r testpool/enc@1

# Make a recursive copy of the encryption root
zfs send -Rw testpool/enc@1 | zfs recv testpool/enc_copy

# Make a new dataset on the original encryption root and try to send to the new one
zfs create testpool/enc/data3
touch /mnt/data3/x
zfs snapshot testpool/enc/data3@x
zfs send -Rw testpool/enc/data3@x | zfs recv testpool/enc_copy/data3

# Fix the encryptionroot ourselves:
python -c 'import libzfs_core; libzfs_core.lzc_change_key(b"testpool/enc_copy/data3", "force_inherit")'

psy0rz commented 7 months ago

Interesting.

Of course the whole point of zfs-autobackup is that we don't use libzfs and only use regular zfs commands, to make debugging easier.

So while tempting, i don't think we should add this to the code?

We should add it to the wiki page about encryption i think.

digitalsignalperson commented 7 months ago

Hmm, what if we had the ability to add a custom hook or command that is executed after a dataset is received, where the argument passed is the dataset/filesystem name? Similar to pre/post snapshot command.

digitalsignalperson commented 7 months ago

PR to add this to the zfs cli https://github.com/openzfs/zfs/pull/15821

psy0rz commented 7 months ago

> Hmm, what if we had the ability to add a custom hook or command that is executed after a dataset is received, where the argument passed is the dataset/filesystem name? Similar to pre/post snapshot command.

We can. Could be useful for other situations as well. A post-create that only runs after initial creation of a dataset?

psy0rz commented 7 months ago

> PR to add this to the zfs cli openzfs/zfs#15821

Awesome! thanks for trying to improve zfs as well.

digitalsignalperson commented 7 months ago

> A post-create that only runs after initial creation of a dataset?

That would work. It might also be useful in the general case for any post-receive to do things like custom mounting or logging/notifications. Maybe if one arg is the filesystem name, another could be if the dataset is new. Or something like --post-receive-cmd="my-program {filesystem} {is_new}" where the template args may be used.

Possibly even a hacky --post-receive-cmd="bash -c 'if [ {is_new} -eq 1 ]; then python -c \"import libzfs_core; libzfs_core.lzc_change_key(b\\\"{filesystem}\\\", \\\"force_inherit\\\")\"; fi'"
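
To make the template idea concrete, here is a sketch of how the placeholders might be expanded. The --post-receive-cmd option does not exist in zfs_autobackup; the function name render_hook and the {filesystem}/{is_new} placeholders simply follow the proposal above. The dataset name is shell-quoted before substitution so names containing spaces or shell metacharacters stay a single argument when the placeholder is used unquoted.

```python
import shlex


def render_hook(template, filesystem, is_new):
    """Expand {filesystem} and {is_new} in a hypothetical post-receive command."""
    return template.format(
        filesystem=shlex.quote(filesystem),
        is_new=1 if is_new else 0,
    )


if __name__ == "__main__":
    print(render_hook("my-program {filesystem} {is_new}", "xpool/enc3/enc2/test2", True))
```

One design consequence: because the name is already quoted, a template that wraps {filesystem} in its own quotes (as in the hacky one-liner) would need a different escaping strategy.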

psy0rz commented 3 weeks ago

i've created a ticket for those hooks.

since it is a zfs issue i'll close this for now.