openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.56k stars 1.74k forks source link

ARC adapation logic is broken and fix was not imported from FreeBSD to OpenZFS. #10548

Closed blacklion closed 4 years ago

blacklion commented 4 years ago

ARC is one of main parts of ZFS success, as it is state-of-art cache algorithm. Its novelty is using of additional «ghost» LRU and MRU lists which remember evicted items and help to tune LRU/MRU balance. Center part of ARC algorithm is arc_adap() function which tune LRU/MLU balance according to 4 types of cache hits (which is passed as state agrument): ghost LRU, LRU, MRU, ghost MRU. If this function will be called with wrong cache hit (state) adaptation will be sub-optimal and performance will suffer.

Some (long) time ago upstream had been received this commit:

6950 ARC should cache compressed data) in arc_read() do next sequence (access to ghost buffer)

Before this commit, hit to any ghost list was passed arc_adapt() before call to arc_access() which revive element in cache and change state from ghost to real hit (it is very important!)

After this commit order of calls was reverted and arc_adapt() is now called only with «real» hits even if hit was in one of two ghost lists, which render ghost lists useless and break ARC algorithm.

FreeBSD fixes this problem locally in Change D19094 / Commit r348772.

This fix have not been ported to upstream, ZoL or OpenZFS, unfortunately.

Current OpenZFS contains same bug, though patch is not applicable, as low-level ARC routines were extended, which is not present in FreeBSD sources.

Crucial change is this one, in arc_get_data_impl. All other changes are support this one, to weave proper behavior (adapt or not adapt before changing status of cache element) from call sites of arc_hdr_alloc_pabd() (which is named arc_hdr_alloc_abd() in OpenZFS).

Without this change ARC in OpenZFS is not ARC at all, and there could be serious performance degradation in some scenarios.

I think, it should be fixed before FreeBSD transition to OpenZFS and this fix will be beneficial for whole OpenZFS community.

ghost commented 4 years ago

To be clear, this isn't actually in progress, so if anyone is willing to take it up please do :)

mattmacy commented 4 years ago

And to be clear, I'd asked @blacklion to file a PR, not an issue.

blacklion commented 4 years ago

And to be clear, I'd asked @blacklion to file a PR, not an issue.

And to be honest, PR could be understood as «problem report», especially by person from FreeBSD, especially by one who have been used to send-pr script ;-)

Jokes aside, codebases have been diverged and fix needs complete rewrite. I don't have time RIGHT NOW to make it and, what is worse, I don't know how to properly test it, as I'm not a author of original fix and know nothing about OpenZFS testing.

On the other hand, original fix is ugly kludge (and author acknowledges this, it is not that I try to shade Slawa), maybe OpenZFS community could offer something better? For example, author of all changes in arc.c which make original patch inapplicable. They must understand this code much better than me.

I could try to port original fix to OpenZFS's arc.c as-is, but it will take some more time, as I need to meet some deadlines at my $work.

blacklion commented 4 years ago

BTW, original fix will be even worse on new codebase: it will add second boolean parameter to some functions (it adds first one in FreeBSD codebase, but OpenZFS added its own boolean parameter to same functions!). It is nightmare from API's point of view: it is very easy to mix two boolean parameters at call sites, as booleans are indistinguishable.

mattmacy commented 4 years ago

@blacklion I'll update the interfaces to take a flag

blacklion commented 4 years ago

@mattmacy Thank you very much!

shodanshok commented 3 years ago

Any chances to see this backported in the 0.8.x releases?

behlendorf commented 3 years ago

Possibly if it's not to troublesome to port, the PR was added to the 0.8 tracker.