openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.44k stars 1.73k forks source link

Verify send/recv compatibility of non-legacy checksum algorithms #13795

Open problame opened 2 years ago

problame commented 2 years ago

The blake3 PR (https://github.com/openzfs/zfs/pull/12918) added DMU_BACKUP_FEATURE_BLAKE3 but the symbol is effectively unused.

So, with the blake3 code in master, if a block is checksummed using blake3, then dmu_send.c will simply set

drrw->drr_checksumtype = BP_GET_CHECKSUM(bp);

Unless I’m missing a safeguard somewhere, this will allow sneaking in a blake3-checksummed block into a pool that runs software doesn’t even know blake3.

So, we should be setting that flag if the dataset uses blake3.

An additional twist to the whole thing: Nutanix uses the 1<<28 bit for an internal feature which we (might) want to upstream. Originally, I wanted to remove the seemingly unused DMU_BACKUP_FEATURE_BLAKE3 and put a comment there, indicating that the bit is used by Nutanix. How do we resolve this?

(Conversation on Slack: https://openzfs.slack.com/archives/C052RGXL5/p1661348591743419 )

cc @mcmilk @rincebrain @ahrens

mcmilk commented 2 years ago

Yes, the DMU_BACKUP_FEATURE_BLAKE3 flag can be safely removed.

problame commented 2 years ago

Well, it will compile, but please comment on

Unless I’m missing a safeguard somewhere, this will allow sneaking in a blake3-checksummed block into a pool that runs software doesn’t even know blake3.

mcmilk commented 2 years ago

Well, it will compile, but please comment on

Unless I’m missing a safeguard somewhere, this will allow sneaking in a blake3-checksummed block into a pool that runs software doesn’t even know blake3.

I am new to OpenZFS - so I am not familiar with the internals of send/recv. I think others within the OpenZFS community will have more plan of it. You also ;-)

problame commented 1 year ago

Summary of conversation with @behlendorf about this:

The scenario of smuggling in a blake3-checksummed block cannot happen because there's a check in receive_object that will fail receive with EINVAL if the checksum enum value is unknown. My point was that there's still risk if the receiving software uses the enum value for a different checksum (e.g., one that hasn't been upstreamed yet). The response to this is:

That's true, the whole scheme has always depended on there being a single authoritative zio_checksum enum. It'd probably be a good idea to add a comment to include/sys/zio.h to make it clear when adding a new checksum that value should be reserved upstream. We did add DMUBACKUP* flags for the new compression algorithms, it's a shame we didn't do so for skein, edonr, and blake3.

So, the way forward will be:

  1. Remove the DMU_BACKUP_FEATURE_BLAKE3 flag as it's unused
  2. Add the comment about single authoritative zio_checksum enum

PR https://github.com/openzfs/zfs/pull/13796 will be rebased afterwards.

rincebrain commented 1 year ago

Forgive me if I'm missing something, but how do you get a different checksum on the wire in the send stream? AFAIK regardless if the property value, we always use fletcher4 for the send stream bits.

behlendorf commented 1 year ago

AFAIK regardless if the property value, we always use fletcher4 for the send stream bits.

That's right, the potential concern here is a little different. The send stream also includes the checksum type used on the source. This checksum type gets verified in receive_object() so it's just important that all implementations agree on how the zio_checksum enum is defined. If an implementation were to add a new checksum type and not reserve it upstream the wrong algorithm could be used for the checksums. That wouldn't result in a blake3 checksum being smuggled in but it would be unexpected.

drrw->drr_checksumtype = BP_GET_CHECKSUM(bp);
rincebrain commented 1 year ago

Ah, sure, I had just thought that the primary concern was a block ending up on disk with an invalid checksum stored, not "just" a different checksum used, and didn't see how you could get there.

behlendorf commented 1 year ago

That's a good question, looking at the code I don't see a check for that in receive_object(). However, there in an ASSERT in dmu_object_set_checksum() function which is only called by the zfs recv code which is intended to enforce only legacy checksum algorithms be set in the send stream.

problame commented 1 year ago

Sorry @behlendorf , I don't fully get your last comment.

The scenario that we've identified is that both sender and receiver use a given enum value, but for different checksums. Like so:

Sender:

enum zio_checksum {
    ZIO_CHECKSUM_A,
    ZIO_CHECKSUM_B,
    ZIO_CHECKSUM_FUNCTIONS
};

Receiver:

enum zio_checksum {
    ZIO_CHECKSUM_A,
    ZIO_CHECKSUM_C,
    ZIO_CHECKSUM_FUNCTIONS
};

If now the sender emits an object record with drr_checksumtype = ZIO_CHECKSUM_B, then the receiver will interpret this as ZIO_CHECKSUM_C.

The receiver will call dmu_object_set_checksum in receive_object. And subsequent receive_write will indirectly lead to dsl_dataset_block_born being called. That function will activate the feature if it's enabled, and panic the system if the feature is disabled. Either way (activation or panic) is unexpected to the user, and undesirable behavior.

Now, if the receiver doesn't have a ZIO_CHECKSUM_C, then we'll hit the assert in dmu_object_set_checksum, but only if it's a debug build. If it's a release build, the asserts are ignored and dn_checksum will be set to the value of ZIO_CHECKSUM_B, which is invalid on the receiver. The question is: is there some later VERIFY that will ensure a panic, or will be sync out the invalid dn_checksum to the dnode_phys_t .

problame commented 1 year ago

That function will activate the feature if it's enabled, and panic the system if the feature is disabled.

Correction: maybe we don't panic if the feature is disabled , but we probably should.