prometheanfire opened this issue 5 years ago
Destroying the remote snapshot enables incremental sends again, but that's kinda useless in a real-life usage scenario. Is the issue that 'basic' snapshots on the remote end don't copy the IV set from the previous snapshot (even if nothing has changed)?
@tcaputi ^ may interest you
While the error message may be slightly odd, this would fail without encryption anyway so I'm inclined to say Not A Bug.
This has been working this way for the last few years, I'd say it is a bug to break incremental sends like this.
In fact, it does work with unencrypted datasets
so, broken
Interesting. If the filesystem doesn't change at all (atime=off) this does work.
Apologies
atime affects snapshots? I don't think I have it enabled anywhere, though. Also, I think I set the canmount=off flag on every regular dataset (can't set that on volumes).
No, but atime can result in dataset modification meaning the requirement that the snapshot not be modified to receive an incremental is violated. Raw sends tend to assume the receiver can't decrypt the dataset, and hence can't mount it or modify it.
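A hedged sketch of the invariant being described here (pool, dataset, and snapshot names are made up, and the exact failure mode depends on the receive flags used):

```shell
# Raw incremental receives require that the destination's copy of the
# incremental source snapshot be unmodified. If the destination can
# mount the dataset (e.g. with atime=on), mere reads dirty it:
zfs send -w tank/enc@s1 | zfs recv backup/enc            # initial raw send
# ... backup/enc gets mounted; atime updates modify it past @s1 ...
zfs send -w -i @s1 tank/enc@s2 | zfs recv backup/enc     # may now fail
# Keeping the destination unmountable sidesteps this:
zfs set canmount=off backup/enc
```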
that's not the error I'm getting (modification). At least, I don't think the error has to do with that: cannot receive incremental stream: IV set guid mismatch.
It has to do with the encryption check, not which zfs 'commit' or bookmark it is at.
/me is sad that this was not in 0.8.0 (as it seems fairly core to zfs)
This is something we're going to need to decide whether to explicitly allow. I don't think it's unreasonable; it's just a use case we didn't consider, and it happened to accidentally work in previous versions. Given that, it wasn't something I felt should hold up the release. Could you explain your specific use case for this? Then we can look into exactly what's going to be required to properly support it.
explained in irc, but I'll copy it here
I encode info about the date/host that takes the snapshot into the snapshot name (not the cleanest thing to do). I also do recursive snapshots on each system, pruning the snaps that I don't want before sending. Having the host in the snapshot name helps in determining the first or the last snap to use when doing send/recv. That's more or less it. I can share the script I use; perhaps the script is just too simple: http://dpaste.com/2MXKW1P I think if I changed it to check the last snap on the remote and send from that to the latest snap on the local, that'd work.
@prometheanfire after looking into this, we decided that while it's probably possible to support this for raw receives, it's more complicated than it first appears and not something we have the time to work on right now. @tcaputi has opened #8863 to fix the error message, and I'd suggest updating your scripts if you haven't already done so.
Thanks for the heads up. Yeah, I was going to update the scripts; I need to figure out a way to get snapshot ordering. I think I may have to switch to pyzfs (not sure if the command-line client orders by snapshot time or snapshot name).
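On the ordering question, the safe approach is to ask zfs to sort by creation explicitly rather than relying on name order. A minimal sketch, with the `zfs list` output mocked since no pool is available here (the dataset and snapshot names are made up):

```shell
# Hedged sketch: pick the newest snapshot by creation order, not name.
# With a real pool, the input would come from
#   zfs list -H -d 1 -t snap -o name -s creation <dataset>
# which lists snapshots oldest-first; here that output is mocked.
latest_snap() {
  tail -n 1 | cut -d@ -f2   # last line of an oldest-first list
}

# Mock `zfs list ... -s creation` output (creation order, oldest first).
# Note the newest snapshot does NOT sort last lexically by name.
MOCK='tank/data@hostb-2019-06-01
tank/data@hosta-2019-06-02'

echo "$MOCK" | latest_snap          # creation order: hosta-2019-06-02 (correct)
echo "$MOCK" | sort | latest_snap   # name order:     hostb-2019-06-01 (wrong)
```

The point of the mock is that lexically sorting names which encode a hostname before the date gives the wrong "latest" snapshot, so the explicit `-s creation` (or `-s createtxg`) sort matters.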
Is there a quick summary of the problem (just curious, no need to tell me if it'd take too long)?
Basically, the issue is just that doing the obvious fix caused a cascade of new errors where the code wasn't expecting this. As far as I can tell, this was never really intended to work this way, but the code just didn't properly check for it and happened to ignore that snapshot when doing the actual work.
Is there a reason it can't just ignore 'no-diff' snapshots when searching backwards, stopping when it either finds an acceptable snap or the snap starts to have changes? It seems like that was the previous behavior (from the outside).
That was the obvious fix and it didn't work unfortunately. It caused a bunch of issues in the code that swaps out a receive clone when the receive is finished. I didn't see an obvious way to fix this and had other encryption issues to get to. Also, this code is going to be completely refactored anyway with redacted send / receive, so it didn't seem worth it to look into this now.
ah :(
for completeness I worked around taking a snapshot locally via...
# get list of datasets to back up from destination (requires initialization for new datasets)
BACKUP_LIST=$(zfs list -o name -s name -H | grep "${BACKUP_POOL_NAME}" | sed -E -e "s:^${BACKUP_POOL_NAME}(/|$)::g" -e 's:^backups(/|$)::g' -e '/^$/d')
for DATASET in ${BACKUP_LIST}; do
FIRST_SNAP=$(zfs list -t snap -o name -H "${BACKUP_POOL_NAME}/backups/${DATASET}" | tail -n 1 | cut -d@ -f 2)
if zfs list -t snap -o name -H | grep "${SOURCE_POOL}/${DATASET}@" | grep -q "${FIRST_SNAP}"; then  # note: the original "if $(...)" form always succeeded, since grep -q prints nothing
LAST_SNAP=$(zfs list -t snap -o name -H "${SOURCE_POOL}/${DATASET}" | tail -n 1 | cut -d@ -f 2)
else
echo "Cannot find snapshot ${FIRST_SNAP} in ${SOURCE_POOL}/${DATASET}, quitting."
break
fi
if [[ "${FIRST_SNAP}" == "${LAST_SNAP}" ]]; then
echo "# Looks like ${DATASET} is already backed up, skipping."
continue
fi
SIZE=$(zfs send -LwecpnvP -I "${SOURCE_POOL}/${DATASET}@${FIRST_SNAP}" "${SOURCE_POOL}/${DATASET}@${LAST_SNAP}" 2> /dev/null | tail -n 1 | awk '{ print $2 }')
if [[ "$(zfs get -H -o value encryption "${SOURCE_POOL}/${DATASET}")" != 'off' ]]; then
RECV_SOPT='-s'
else
RECV_SOPT=''
fi
if [[ "$(zfs get -H -o value type "${SOURCE_POOL}/${DATASET}")" == 'volume' ]]; then
RECV_CANMOUNT=''
else
RECV_CANMOUNT='-o canmount=off'
fi
echo "zfs send -Lwecp -I ${SOURCE_POOL}/${DATASET}@${FIRST_SNAP} ${SOURCE_POOL}/${DATASET}@${LAST_SNAP} | pv -s "${SIZE}" | zfs recv -duv ${RECV_SOPT} ${RECV_CANMOUNT} ${BACKUP_POOL_NAME}/backups"
done
it's not perfect (it doesn't check for intermediary snapshots) and assumes that the last snapshot in the list is the newest (that always seemed to be the case; is it something I can rely upon?), but it works.
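As an aside, the -s passed to zfs recv in the script above enables resumable receives. A hedged sketch of resuming an interrupted one, reusing the script's variable names:

```shell
# If a prior receive was interrupted, the destination exposes a resume
# token via the receive_resume_token property; "-" means there is
# nothing to resume. Sketch only, not tested against a real pool.
TOKEN=$(zfs get -H -o value receive_resume_token "${BACKUP_POOL_NAME}/backups/${DATASET}")
if [[ "${TOKEN}" != '-' ]]; then
  zfs send -t "${TOKEN}" | zfs recv -s "${BACKUP_POOL_NAME}/backups/${DATASET}"
fi
```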
Can I rely upon zfs list -t snap -o name showing the last snapshot for each dataset being the latest one?
While I can't answer your question, I know a way to get the latest snap: just sort by creation time: zfs list -H -d 1 -t snap -o name -S creation <ds> | head -1. Replacing -S with -s will get the name of the oldest one. You could also sort by createtxg instead.
Also running into this issue wrapping my send/recvs with syncoid, in an unusual-to-me snapshot use case where I've been moving (send/recv'ing) a dataset between my desktop and laptop depending on which one I intend to work from over different days this week. After a few snaps back and forth I now face cannot receive incremental stream: IV set guid mismatch. See the 'zfs receive' man page section discussing the limitations of raw encrypted send streams.
It is only 13 GB and they're on a 1 Gbps LAN, so I just zfs destroy -r'd it on my desktop and sent it fresh from the laptop in a few minutes. But I expect this could happen again when I'm not on as fast a remote link and have to re-transmit it all.
At the moment both hosts are running zfs-2.2.2-1, though the desktop is on kernel 6.6.9 while the laptop is on 6.6.10, which I would not expect to be relevant right now.
System information
Server code for the receive end is master as of April 13th, commit b92f5d9f8254f726298a6ab962719fc2b68350b1 I think.
Describe the problem you're observing
Encrypted incremental sends fail if the receiving end has a snapshot taken after the original send was done (IV set guid mismatch).
Describe how to reproduce the problem
This takes two systems
System 1:
System 2:
System 1:
This will result in the following error: cannot receive incremental stream: IV set guid mismatch
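The reproduction code blocks above did not survive extraction. A plausible sketch of the steps, assuming raw sends between two encrypted pools (pool, dataset, snapshot, and host names are all illustrative, not from the original report):

```shell
# System 1: create an encrypted dataset, snapshot it, raw-send it over.
zfs create -o encryption=on -o keyformat=passphrase pool1/enc
zfs snapshot pool1/enc@snap1
zfs send -w pool1/enc@snap1 | ssh system2 zfs recv pool2/enc

# System 2: take an additional snapshot of the received dataset.
zfs snapshot pool2/enc@extra

# System 1: take a second snapshot and attempt a raw incremental send.
zfs snapshot pool1/enc@snap2
zfs send -w -i @snap1 pool1/enc@snap2 | ssh system2 zfs recv pool2/enc
# cannot receive incremental stream: IV set guid mismatch
```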