openzfs / zfs

OpenZFS on Linux and FreeBSD

zfs receive -F cannot be used to destroy an encrypted filesystem #6793

Open sjau opened 7 years ago

sjau commented 7 years ago

System information

Type                  Version/Name
Distribution Name     NixOS
Distribution Version  Unstable Small
Linux Kernel          4.9.58
Architecture          x64
ZFS Version           0.7.0-1
SPL Version           0.7.0-1

Describe the problem you're observing

I'm using an encrypted dataset with several child datasets on my notebook and home server. I wanted to set up automatic snapshot backups to my home server using znapzend.

However, it complains:

cannot receive new filesystem stream: zfs receive -F cannot be used to destroy an encrypted filesystem

Describe how to reproduce the problem

I set up the rules for the first dataset for testing:

#!/usr/bin/env bash

znapzendzetup create                                                           \
    --recursive                                                                \
    --tsformat='%Y-%m-%d_%H-%M-%S'                                             \
    SRC '1h=>15min,1d=>1h,7d=>1d' tank/encZFS/VMs                              \
    DST:notebook '1h=>15min,1d=>1h,30d=>1d' root@10.200.0.3:serviTank/encZFS/BU/subi/VMs

and then I run znapzend --runonce=tank/encZFS/VMs -d --autoCreation

Include any warning/errors/backtraces from the system logs

That's the log output I got:

root@subi:~/.nixos# znapzend --runonce=tank/encZFS/VMs -d --autoCreation
[Fri Oct 27 15:33:56 2017] [info] znapzend (PID=16948) starting up ...
[Fri Oct 27 15:33:56 2017] [info] refreshing backup plans...
[Fri Oct 27 15:33:57 2017] [info] found a valid backup plan for tank/encZFS/VMs...
[Fri Oct 27 15:33:57 2017] [info] znapzend (PID=16948) initialized -- resuming normal operations.
[Fri Oct 27 15:33:57 2017] [debug] snapshot worker for tank/encZFS/VMs spawned (17097)
[Fri Oct 27 15:33:57 2017] [info] creating recursive snapshot on tank/encZFS/VMs
# zfs snapshot -r tank/encZFS/VMs@2017-10-27_15-33-57
[Fri Oct 27 15:34:03 2017] [debug] snapshot worker for tank/encZFS/VMs done (17097)
[Fri Oct 27 15:34:03 2017] [debug] send/receive worker for tank/encZFS/VMs spawned (18065)
[Fri Oct 27 15:34:03 2017] [info] starting work on backupSet tank/encZFS/VMs
# zfs list -H -r -o name -t filesystem,volume tank/encZFS/VMs
[Fri Oct 27 15:34:03 2017] [debug] sending snapshots from tank/encZFS/VMs to root@10.200.0.3:serviTank/encZFS/BU/subi/VMs
# zfs list -H -o name -t snapshot -s creation -d 1 tank/encZFS/VMs
# ssh -o batchMode=yes -o ConnectTimeout=30 root@10.200.0.3 zfs list -H -o name -t snapshot -s creation -d 1 serviTank/encZFS/BU/subi/VMs
# zfs send tank/encZFS/VMs@2017-10-27_15-33-57|ssh -o batchMode=yes -o ConnectTimeout=30 'root@10.200.0.3' 'zfs recv -F serviTank/encZFS/BU/subi/VMs'
cannot receive new filesystem stream: zfs receive -F cannot be used to destroy an encrypted filesystem
warning: cannot send 'tank/encZFS/VMs@2017-10-27_15-33-57': signal received
[Fri Oct 27 15:34:03 2017] [warn] ERROR: cannot send snapshots to serviTank/encZFS/BU/subi/VMs on root@10.200.0.3
# ssh -o batchMode=yes -o ConnectTimeout=30 root@10.200.0.3 zfs list -H -o name -t snapshot -s creation -d 1 serviTank/encZFS/BU/subi/VMs
[Fri Oct 27 15:34:03 2017] [debug] cleaning up snapshots on root@10.200.0.3:serviTank/encZFS/BU/subi/VMs
[Fri Oct 27 15:34:03 2017] [warn] ERROR: suspending cleanup source dataset because at least one send task failed
[Fri Oct 27 15:34:03 2017] [info] done with backupset tank/encZFS/VMs in 0 seconds
[Fri Oct 27 15:34:03 2017] [debug] send/receive worker for tank/encZFS/VMs done (18065)
tcaputi commented 7 years ago

Unfortunately this is a design limitation for now and I don't really know of a way to fix it. Management for it will probably have to be added to znapzend.

For developers, the problem comes from the way the zfs recv -F command works in the kernel. ZFS wants to keep both copies of the dataset around until the new one has been completely received; this way, if the receive fails, we can seamlessly fall back to the old one. However, the implementation accomplishes this by treating the new dataset as a "clone" of the old one. Encryption enforces that clones must use the same key as their origin, but this may not be the case with zfs recv -F, since it is completely legal to do the following:

zfs send pool/crypt1 | zfs recv pool/backup
zfs send pool/crypt2 | zfs recv -F pool/backup

Technically it is possible to make this work if both of these datasets use the same key. However, I talked with @ahrens about it a bit and we decided that it would be too confusing from a user interface perspective to make a command that works in only a few cases based on hidden properties.

I don't really have a good way to fix this, so we'll leave it as an open issue for now, but I think for the time being it might be best if znapzend were able to work around this.

sjau commented 7 years ago

Thanks for the feedback. I asked the znapzend project whether there's a way to not enforce the -F option. In my case, since it should act only as a backup, the sent snapshots shouldn't need -F because they're not in use...

P.S.: Encryption so far works well.

tristan-k commented 7 years ago

So how do I migrate a LUKS-encrypted zpool to a natively encrypted zpool if I can't use the -F flag?

$ sudo zpool create -f -o ashift=12 -O casesensitivity=insensitive -O normalization=formD -O compression=lz4 -O xattr=sa -O acltype=posixacl -O atime=off -O relatime=off -O encryption=on -O keyformat=passphrase NATIVE_ENCRYPTED_ZPOOL /dev/sdXY
$ zfs send -R LUKS_ZPOOL@Snapshot | mbuffer -s 128k -m 2G -o - | zfs receive -s -u -v -F -d NATIVE_ENCRYPTED_ZPOOL
cannot receive new filesystem stream: zfs receive -F cannot be used to destroy an encrypted filesystem

What would be the substitute for the -F flag?

If I run it without the -F flag, it complains:

cannot receive new filesystem stream: destination 'NATIVE_ENCRYPTED_ZPOOL' exists
must specify -F to overwrite it
tcaputi commented 7 years ago

This issue isn't about the -R flag. It's about -F. You should be able to accomplish this by creating an encrypted root dataset and then receiving your -R stream below it. This will cause all datasets to inherit their wrapping keys from the original encrypted root. Afterwards, you can use zfs rename and zfs change-key to restructure the datasets however you wish.
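
For illustration, a rough sketch of that restructuring step (the dataset names here are hypothetical, not from this thread):

# give a received child its own wrapping key, making it an encryption root
zfs change-key -o keyformat=passphrase ZPOOL/encroot/data
# then rename it to wherever it should live
zfs rename ZPOOL/encroot/data ZPOOL/data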

Soon I will be making a pull request to support doing things like zfs recv -o encryption=on ..., but this does not exist at the moment.

tristan-k commented 7 years ago

You are right. I typed -R by mistake and meant -F. I edited my post.

You should be able to accomplish this by creating an encrypted root dataset and then receiving your -R stream below it.

I'm sorry. How do I do this? Do you mean creating an unencrypted vdev like $ zpool create ZPOOL /dev/sdXY, then an encrypted dataset with $ zfs create -o encryption=on -o keyformat=passphrase ZPOOL/NATIVE_ENCRYPTED, followed by $ zfs send -R LUKS_ZPOOL@Snapshot | zfs receive -s -u -v -F -d ZPOOL/NATIVE_ENCRYPTED?

tcaputi commented 7 years ago

@tristan-k no. You must create an encrypted dataset and then receive the stream below it. The encrypted dataset can be at the pool level or any other level you like. For instance:

zpool create -O encryption=on -O keyformat=passphrase ZPOOL sdb
zfs send -R LUKS_ZPOOL@Snapshot | zfs receive -s -u -v -d ZPOOL/NATIVE_ENCRYPTED

The -F flag should not be necessary here. That is only required if you have existing data to overwrite.

behlendorf commented 7 years ago

@tcaputi @sjau thanks for clearly explaining the design decision which was made here. I've marked this bug as documentation for future reference, and am closing the issue.

tristan-k commented 7 years ago

@tcaputi Thanks for your help. I followed your suggestion, but I'm still unable to send the snapshot. I get:

$ zfs send -R LUKS_ZPOOL@Snapshot | zfs receive -s -u -v -d ZPOOL/NATIVE_ENCRYPTED
cannot receive: specified fs (ZPOOL/NATIVE_ENCRYPTED) does not exist

If I create the dataset first, it complains again:

cannot receive new filesystem stream: destination 'ZPOOL/NATIVE_ENCRYPTED' exists
must specify -F to overwrite it

I'm puzzled.

tcaputi commented 7 years ago

I apologize. I don't usually use the -d flag and I misunderstood its use. You only need ZPOOL as the last argument of the receive. Additionally, I forgot that doing zfs send -R with unencrypted datasets will send the encryption=off property along with them. Right now you cannot override this property, meaning your received dataset can't be encrypted. I plan on submitting a patch either this week or next to address this issue, allowing you to add -o encryption=on -o keyformat=passphrase to the receive command. This will make this whole process a lot cleaner, and you won't need the intermediate encrypted dataset to receive beneath.

Right now the only way to do this is to manually iterate through all of the datasets in your pool and send them without using -R or -p. You might just want to wait until I make the PR so that you can simply do zfs send -R natively.
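
As a rough sketch of that manual approach (assuming ZPOOL/NATIVE_ENCRYPTED is an existing encrypted dataset and a recursive snapshot named @Snapshot exists on the source; the loop itself is illustrative, not from this thread):

# send each child dataset individually, without -R or -p, so the received
# copies inherit encryption from the encrypted parent they land under
zfs list -H -o name -r -t filesystem LUKS_ZPOOL | tail -n +2 | while read -r ds; do
    zfs send "${ds}@Snapshot" | zfs receive -u "ZPOOL/NATIVE_ENCRYPTED/${ds#LUKS_ZPOOL/}"
done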

Sorry for the misunderstanding. I'll try to remember to make a post here when the PR is made.

tristan-k commented 6 years ago

Any news on the issue?

tcaputi commented 6 years ago

Sorry for the delay. I have been busy trying to finish #6864 but I should be done with that soon. Afterwards I will be working on this.

sjau commented 6 years ago

@tcaputi just take your time :)

tristan-k commented 6 years ago

@tcaputi No reason for apologizing. Just wanted to check in. Take your time. Thanks!

prometheanfire commented 6 years ago

I also cannot use -e to send with embedded_data.

tcaputi commented 6 years ago

@prometheanfire Encrypted datasets cannot use the embedded_data feature since the space where embedded data would go is instead used to store encryption parameters. Does this need to be clarified somewhere?

prometheanfire commented 6 years ago

It might be nice to note that, is all (I imagined something like what you said was the case). The man page doesn't mention it in the send/recv sections, though it does in the encryption section.

aarononeal commented 6 years ago

I don't know if there is an easier way, but the only solution I could figure out for sending all snapshots when migrating an unencrypted dataset to an encrypted one was:

# full send of the first snapshot into the encrypted destination
zfs send -v bak1/data@first | zfs recv -v pool1/data
# roll back so the destination matches @first exactly before the incremental
zfs rollback pool1/data@first
# incremental send of everything between the first and last snapshots
zfs send -v -I bak1/data@first bak1/data@last | zfs recv -v pool1/data

gregwalters commented 6 years ago

Tom,

How's the patch for -o encryption=on -o keyformat=passphrase on the receive command coming along? I've been chomping at the bit for this feature.

tcaputi commented 6 years ago

My apologies. I have been very busy with a few other projects. This is next on my list and I am hoping to have a PR out by the end of next week.

tristan-k commented 6 years ago

I'm sorry for pestering but has this been fixed in the latest release?

tcaputi commented 6 years ago

Encryption is not currently in any release at all. The patch is just about ready. There is one small problem that I haven't been able to solve for a few weeks but at this point I should probably just disable it and get the rest pushed up. Unfortunately I am on vacation now and so it will have to wait until a week from now.

tristan-k commented 6 years ago

Btw, is there an ETA for encryption in the release channel? Again, sorry for bothering. I wish you a relaxing vacation.

runderwo commented 4 years ago

@tcaputi Any luck getting that last part polished up?

tcaputi commented 4 years ago

The bug I was working on at the time was resolved a long time ago. This issue, however, is still not resolved (and unfortunately I don't really see a good way to make it work).

To describe the problem in a little more detail, the issue has to do with key management. When you do zfs recv -F, zfs internally does all of the new receive in a hidden clone of the old dataset. If anything happens during the receive, the clone is simply dropped and you still have the existing dataset. For unencrypted datasets, this works great. For encrypted datasets the problem is that there is only one key per "clone family" (a dataset plus all of its snapshots and clones). So if I do a zfs recv -F, which encryption key do I store while the receive is taking place? The original dataset and the clone we are receiving into may have different keys, or you may be attempting to replace an unencrypted dataset with an encrypted one (or vice versa). We need to make sure that anything we are left with after a crash is still consistent. It is for this reason that we currently just disallow this. You can still use zfs recv -F with encrypted datasets if you are not trying to replace the dataset outright.

Perhaps there are some things we could do about this, but it just hasn't been a huge priority (as far as I know). We could try to change the code so that datasets that are being received don't have to follow the same rules as other datasets; I'd need to spend more time looking into it. We could also potentially change the receive code so that recv -F doesn't use a clone of the existing dataset, but that has some added complications in terms of backwards compatibility.

In the meantime, you can work around this issue simply by doing:

zfs destroy <ds>
zfs recv <ds>

If you want to guarantee that you can keep your datasets if the receive fails you can instead do:

zfs recv <other_ds>          # receive into a temporary dataset alongside the original
zfs rename <ds> <ds_old>     # set the original aside
zfs rename <other_ds> <ds>   # move the newly received copy into place
zfs destroy <ds_old>         # drop the original once you're satisfied

Hope that helps.

alexsmartens commented 1 year ago

Any plans to fix this issue anytime soon?

dm17 commented 1 year ago

Still hitting this and deleting the ds before receiving did not change the situation. I'll just recreate the receiving pool as unencrypted before sending an encrypted ds, but this does seem like an unnecessary hiccup to hit.

This seems super basic... How are people with zfs root backups supposed to restore them without doing some acrobatics? That functionality seems fundamental.