RSS comparison for reduced critical section

Our shorter critical section will perform better than the longer upstream version when the hashtable is small, since it does not iterate through hashtable collisions with the mmContainer lock held. The performance gains from a shorter critical section tend to dissipate as hashtable size increases, since the longer upstream version will be iterating through fewer collision entries (so its critical section is not as long at a higher hashtable power).

The purpose of this experiment is to examine the tradeoff between using a shorter critical section with a smaller hashtable and the upstream longer critical section with a larger hashtable. In our experiment we limit the total RSS of cachebench to 16GB (including cache size and hashtable overhead).

Configs

Config 1 (upstream) - set cache size to 8GB and htBucketPower to 30
Config 2 (upstream) - set cache size to 15.75GB and htBucketPower to 25
Config 3 (critical section patch) - set cache size to 8GB and htBucketPower to 30
Config 4 (critical section patch) - set cache size to 15.75GB and htBucketPower to 25
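
As a sanity check on the 16GB budget, here is a rough sketch of the arithmetic behind the two cache size / htBucketPower pairs. It assumes 8 bytes of hashtable overhead per bucket and treats GB as 2^30 bytes. Both are assumptions inferred from the numbers above (they are what make each pair sum to 16GB), not values confirmed from the CacheLib source. The helper names are hypothetical.

```python
# Rough RSS budget check for the configs above.
# ASSUMPTION: each hashtable bucket costs 8 bytes; inferred from the
# config numbers, not verified against CacheLib.
GIB = 1 << 30
BYTES_PER_BUCKET = 8


def hashtable_bytes(ht_bucket_power, bytes_per_bucket=BYTES_PER_BUCKET):
    """Bucket array size for 2**ht_bucket_power buckets."""
    return (1 << ht_bucket_power) * bytes_per_bucket


def total_rss_gib(cache_size_gib, ht_bucket_power):
    """Cache size plus hashtable overhead, in GiB."""
    return cache_size_gib + hashtable_bytes(ht_bucket_power) / GIB


if __name__ == "__main__":
    for cache_gib, power in [(8, 30), (15.75, 25)]:
        ht_gib = hashtable_bytes(power) / GIB
        total = total_rss_gib(cache_gib, power)
        print(f"cache={cache_gib}GB htBucketPower={power} "
              f"hashtable={ht_gib}GB total={total}GB")
```

Under that assumption, htBucketPower 30 costs 8GB of hashtable and htBucketPower 25 costs 0.25GB, so both pairs land on the same 16GB total.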

Use the graph_cache_leader_fbobj and graph_cache_follower_fbobj workloads with default parameters other than the above. There will therefore be 8 experiments in total: 2 workloads * 4 configs.
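
The experiment matrix can be enumerated with a short script like the one below. This is only a sketch. The `build` labels, the `name` fields and the dictionary layout are made up for illustration. The `cacheSizeMB` and `htBucketPower` keys are assumed to match the cachebench cache config, and 15.75GB is written as 16128MB.

```python
from itertools import product

WORKLOADS = ["graph_cache_leader_fbobj", "graph_cache_follower_fbobj"]

# The four configs from the list above. "build" is a hypothetical label
# for which binary to run (upstream vs. critical section patch).
CONFIGS = [
    {"name": "config1", "build": "upstream", "cacheSizeMB": 8192, "htBucketPower": 30},
    {"name": "config2", "build": "upstream", "cacheSizeMB": 16128, "htBucketPower": 25},
    {"name": "config3", "build": "patch", "cacheSizeMB": 8192, "htBucketPower": 30},
    {"name": "config4", "build": "patch", "cacheSizeMB": 16128, "htBucketPower": 25},
]


def experiment_matrix():
    """Return one entry per (workload, config) pair."""
    return [
        {"workload": workload, **config}
        for workload, config in product(WORKLOADS, CONFIGS)
    ]


if __name__ == "__main__":
    for exp in experiment_matrix():
        print(exp["workload"], exp["name"], exp["build"],
              exp["cacheSizeMB"], exp["htBucketPower"])
```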

Upstream: https://github.com/facebook/CacheLib/tree/main, commit https://github.com/pmem/CacheLib/commit/ba170d0e10af6f912c0992d8b66ecb7e2c830b8b

edit: use CacheLib branch here: https://github.com/igchor/CacheLib-1/tree/optimize_mmcontainer_locking?
