openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs

ZAP: Reduce leaf array and free chunks fragmentation #16766

Closed amotin closed 5 days ago

amotin commented 1 week ago

The previous implementation of zap_leaf_array_free() put chunks on the free list in reverse order. In addition, zap_leaf_transfer_entry() and zap_entry_remove() freed the name and value arrays in reverse order. Together these scrambled the free list, making subsequent allocations much more fragmented than necessary.

This patch re-implements zap_leaf_array_free() to preserve the existing chunk order, and adds a non-destructive zap_leaf_array_copy() for use in zap_leaf_transfer_entry(), so that the name and value arrays can be freed there in the proper order, as in zap_entry_remove().
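To illustrate why the freeing order matters, here is a minimal sketch on a toy model, not the actual ZAP leaf code. It assumes a leaf where every chunk stores only the index of the next chunk in its chain; all `toy_*` names, `NCHUNKS` and `CHAIN_END` are hypothetical and exist only for this example. It compares pushing chunks onto the free list one at a time (which reverses them) with splicing the whole chain onto the free list in its existing order.

```c
#include <assert.h>
#include <stdint.h>

#define NCHUNKS   8       /* hypothetical toy leaf size */
#define CHAIN_END 0xffff  /* hypothetical end-of-chain marker */

/* Toy leaf: next[i] is the chunk following chunk i in its chain. */
typedef struct {
	uint16_t next[NCHUNKS];
	uint16_t freelist;	/* index of the first free chunk */
} toy_leaf_t;

/* Start with every chunk free, linked in ascending order 0 -> 1 -> ... */
static void
toy_leaf_init(toy_leaf_t *l)
{
	for (uint16_t i = 0; i < NCHUNKS; i++)
		l->next[i] = (i + 1 < NCHUNKS) ? (uint16_t)(i + 1) : CHAIN_END;
	l->freelist = 0;
}

/* Allocate an n-chunk array by popping chunks off the free list head. */
static uint16_t
toy_array_alloc(toy_leaf_t *l, int n)
{
	uint16_t head = CHAIN_END, tail = CHAIN_END;
	for (int i = 0; i < n; i++) {
		assert(l->freelist != CHAIN_END);
		uint16_t c = l->freelist;
		l->freelist = l->next[c];
		l->next[c] = CHAIN_END;
		if (head == CHAIN_END)
			head = c;
		else
			l->next[tail] = c;
		tail = c;
	}
	return (head);
}

/* Push chunks one by one: the array ends up reversed on the free list. */
static void
toy_array_free_reversed(toy_leaf_t *l, uint16_t head)
{
	while (head != CHAIN_END) {
		uint16_t nxt = l->next[head];
		l->next[head] = l->freelist;
		l->freelist = head;
		head = nxt;
	}
}

/* Splice the whole chain in front of the free list, keeping its order. */
static void
toy_array_free_ordered(toy_leaf_t *l, uint16_t head)
{
	if (head == CHAIN_END)
		return;
	uint16_t tail = head;
	while (l->next[tail] != CHAIN_END)
		tail = l->next[tail];
	l->next[tail] = l->freelist;
	l->freelist = head;
}

/* Count links in a chain whose successor is not the adjacent chunk. */
static int
toy_noncontig(const toy_leaf_t *l, uint16_t head)
{
	int n = 0;
	for (uint16_t c = head; c != CHAIN_END && l->next[c] != CHAIN_END;
	    c = l->next[c])
		if (l->next[c] != c + 1)
			n++;
	return (n);
}

/* Allocate a 4-chunk array, free it, and measure free list fragmentation. */
static int
toy_demo(int ordered)
{
	toy_leaf_t l;
	toy_leaf_init(&l);
	uint16_t a = toy_array_alloc(&l, 4);
	if (ordered)
		toy_array_free_ordered(&l, a);
	else
		toy_array_free_reversed(&l, a);
	return (toy_noncontig(&l, l.freelist));
}
```

In this toy model `toy_demo(0)` leaves the free list as 3, 2, 1, 0, 4, 5, 6, 7 with four non-contiguous links, while `toy_demo(1)` restores 0 through 7 with none. The ordered variant costs one extra walk to find the tail of the chain, but needs no sorting.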

With this change, a test of some writes and deletes shows the percentage of non-contiguous chunks in the DDT dropping from 61% to 0% for arrays and from 47% to 17% for free chunks. Explicit sorting could certainly do even better, especially for ZAPs with variable-size arrays, but it would also cost much more, while this should be very cheap.

Another improvement: previously zap_entry_update() for multi-chunk values always reversed the chunk order, changing the leaf block even if nothing had actually changed. I don't know whether we can benefit from the block not changing, via nop-write or something similar, but being more predictable should do no harm.
