openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

ZFS send should use spill block prefetched from send_reader_thread #16701

Closed tuxoko closed 3 weeks ago

tuxoko commented 1 month ago

Motivation and Context

Currently, even though send_reader_thread prefetches the spill block, do_dump() does not use the prefetched buffer and issues its own blocking arc_read(). This can significantly degrade performance when sending datasets with many spill blocks.
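The difference between the two code paths can be illustrated with a minimal, single-threaded sketch. It is not the actual dmu_send.c code: `range_sketch`, `read_block_sketch`, `reader_prefetch`, `dump_range`, and the `blocking_reads` counter are all hypothetical names made up for this illustration, and a plain `malloc` stands in for the ARC read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a send range; not the real struct send_range. */
struct range_sketch {
	int   block_id;    /* which block this range describes */
	char *prefetched;  /* buffer filled ahead of time, or NULL */
};

/* Counts synchronous reads so the two code paths can be compared. */
static int blocking_reads = 0;

/* Stand-in for a read that would block on disk I/O in the real code. */
static char *
read_block_sketch(int block_id)
{
	char *buf = malloc(32);
	if (buf != NULL)
		snprintf(buf, 32, "block-%d", block_id);
	return (buf);
}

/* Reader side: issue the read early and remember the result. */
static void
reader_prefetch(struct range_sketch *r)
{
	assert(r != NULL);
	r->prefetched = read_block_sketch(r->block_id);
}

/*
 * Consumer side: prefer the prefetched buffer and fall back to a
 * blocking read only when nothing was prefetched.
 */
static char *
dump_range(struct range_sketch *r)
{
	assert(r != NULL);
	char *buf = r->prefetched;
	if (buf == NULL) {
		blocking_reads++;
		buf = read_block_sketch(r->block_id);
	}
	r->prefetched = NULL;  /* ownership passes to the caller */
	return (buf);
}
```

Calling `reader_prefetch()` before `dump_range()` leaves `blocking_reads` at zero, while calling `dump_range()` on a range that was never prefetched increments it. In this model, the behavior described above corresponds to always taking the fallback branch even though the buffer was already available.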

For unmodified spill blocks, we also create a send_range struct in send_reader_thread and issue prefetches for them. We piggyback each one on its dnode's send_range instead of enqueueing it separately, so we don't break the send_range_after check.
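A rough sketch of the piggyback idea follows. It assumes a much simpler ordering rule than the real send_range_after (here, each queued range must have a strictly larger object number than the previous one), and `qrange_sketch`, `range_after_sketch`, and `consume_sketch` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical queued range; not the real struct send_range. */
struct qrange_sketch {
	unsigned long         object;  /* object number the range belongs to */
	struct qrange_sketch *spill;   /* piggybacked spill range, or NULL */
};

/*
 * Simplified ordering rule (an assumption for this sketch): a queued
 * range must have a strictly larger object number than the previous one.
 */
static int
range_after_sketch(const struct qrange_sketch *prev,
    const struct qrange_sketch *next)
{
	return (next->object > prev->object);
}

/*
 * Walk the queue, checking the ordering of queued ranges only.
 * Returns the number of ranges handled (queued plus piggybacked),
 * or -1 if the ordering check fails.
 */
static int
consume_sketch(struct qrange_sketch *const *queue, size_t n)
{
	int handled = 0;

	assert(queue != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (i > 0 && !range_after_sketch(queue[i - 1], queue[i]))
			return (-1);
		handled++;
		/* The spill range rides along and never enters the queue. */
		if (queue[i]->spill != NULL)
			handled++;
	}
	return (handled);
}
```

Under this simplified rule, a spill range that shares its dnode's object number would fail the check if it were enqueued as its own entry, whereas attaching it to the dnode's range keeps the queue strictly ordered while still carrying the prefetched spill data to the consumer.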

On one of our setups, we see at least a 5x difference in the affected areas.

How Has This Been Tested?

Manual send/recv on a dataset with spill blocks. Manual raw send/recv on an encrypted dataset with spill blocks.

tuxoko commented 4 weeks ago

Not sure why the tests are showing failures even though the test results seem mostly fine. Rebased to rerun the tests.

amotin commented 4 weeks ago

Crashed on assertion:

  [ 5544.971043] VERIFY(dscp->dsc_dso->dso_dryrun || srdp->abuf != NULL || srdp->abd != NULL) failed
  [ 5544.974720] PANIC at dmu_send.c:980:do_dump()

tuxoko commented 4 weeks ago

> Crashed on assertion:
>
>   [ 5544.971043] VERIFY(dscp->dsc_dso->dso_dryrun || srdp->abuf != NULL || srdp->abd != NULL) failed
>   [ 5544.974720] PANIC at dmu_send.c:980:do_dump()

Hmm... This doesn't show up in our internal CI for some reason.

Edit: OK, I think it's because we turn off zfs_send_unmodified_spill_blocks in our internal CI.

tuxoko commented 3 weeks ago

Updated to also prefetch unmodified spill blocks in send_reader_thread.

behlendorf commented 3 weeks ago

The CI results are looking much better. @tuxoko if you rebase on the latest master it should resolve the Fedora 41 failure.