CacheLib Performance Parity Benchmarking (w/ background threads)

byrnedj commented 2 years ago

The goal of this experiment is to show the progress towards single-tier DRAM performance parity. We also investigate some initial background data movement parameters.

Background Movement

The branch to use is here: https://github.com/byrnedj/CacheLib/tree/mt-background-movers

The cache configuration parameters to test are the following:

- backgroundEvictorIntervalMilSec: 0, 10
- evictorThreads: 4, 12
- evictionHotnessThreshold: 40, 200
- lowEvictionAcWatermark: 91, 98
- highEvictionAcWatermark: 88, 95

Since highEvictionAcWatermark must be less than lowEvictionAcWatermark, the valid combinations are the following (all with backgroundEvictorIntervalMilSec = 10 and a DRAM:PMEM ratio of 1:4):

- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95

- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
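The twelve combinations above can be reproduced mechanically. The following is a minimal Python sketch, not part of the benchmark tooling: the function name `background_evictor_grid` and the dict layout are assumptions made for illustration, and only the parameter names and values come from the lists above.

```python
from itertools import product

# Candidate values, copied from the parameter list in this issue.
EVICTOR_THREADS = [4, 12]
HOTNESS_THRESHOLDS = [40, 200]
LOW_WATERMARKS = [91, 98]
HIGH_WATERMARKS = [88, 95]


def background_evictor_grid():
    """Return every valid parameter combination as a dict (hypothetical helper)."""
    grid = []
    for threads, hotness, low, high in product(
        EVICTOR_THREADS, HOTNESS_THRESHOLDS, LOW_WATERMARKS, HIGH_WATERMARKS
    ):
        # highEvictionAcWatermark must be strictly less than lowEvictionAcWatermark,
        # which rules out the (low=91, high=95) pair.
        if high >= low:
            continue
        grid.append(
            {
                "backgroundEvictorIntervalMilSec": 10,
                "evictorThreads": threads,
                "evictionHotnessThreshold": hotness,
                "lowEvictionAcWatermark": low,
                "highEvictionAcWatermark": high,
            }
        )
    return grid


if __name__ == "__main__":
    for combo in background_evictor_grid():
        print(combo)
```

The iteration order matches the order of the lists above, so the output can be compared line by line against the written plan.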

Then we have two baseline runs:

backgroundEvictorIntervalMilSec: 0, with DRAM:PMEM set to 1:4
backgroundEvictorIntervalMilSec: 0, evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95 with DRAM:PMEM set to 1:1

This gives a total of 14 experiments per workload configuration (12 background evictor combinations plus 2 baselines).

The workload configuration parameters are:

- opRatePerSec: 500000, none
- htBucketPower: 28 (for both workloads)
- 48 threads and 5M ops/thread
- workload: leader_obj, follower_obj

There are 4 different workload configurations (2 opRatePerSec settings x 2 workloads), so the total number of background evictor experiments is 14 x 4 = 56.

Additional baselines

For comparison we need to test the additional baseline systems for each workload configuration:

- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using the DRAM:PMEM 1:1
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using the DRAM:PMEM 1:4
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using PMEM only
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using DRAM only

There are 4 different workload configurations, so the total number of additional baseline system experiments is 16.

In total there are 72 experiments. Each should last around 5 minutes, so the total execution time is estimated at 6 hours.
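As a sanity check on the counts in this plan, the arithmetic can be written out as a short Python sketch. The function name `experiment_budget` and the returned keys are made up for illustration. The counts and the 5-minute estimate come from the text above.

```python
def experiment_budget(minutes_per_run=5):
    """Recompute the experiment counts from this plan (hypothetical helper)."""
    # 2 evictorThreads values x 2 hotness thresholds x 3 valid watermark pairs
    background_combos = 2 * 2 * 3
    baseline_runs = 2
    per_workload_config = background_combos + baseline_runs

    # 2 opRatePerSec settings (500000, none) x 2 workloads (leader_obj, follower_obj)
    workload_configs = 2 * 2

    background_total = per_workload_config * workload_configs
    # 4 additional baseline systems: 1:1, 1:4, PMEM only, DRAM only
    additional_total = 4 * workload_configs

    total = background_total + additional_total
    return {
        "per_workload_config": per_workload_config,
        "background_total": background_total,
        "additional_total": additional_total,
        "total": total,
        "hours": total * minutes_per_run / 60,
    }


if __name__ == "__main__":
    print(experiment_budget())
```

Changing `minutes_per_run` gives a quick estimate of how the total runtime scales if individual runs take longer than expected.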

igchor commented 2 years ago

How about running some benchmarks with a smaller number of evictorThreads? When delayOps/ratePerSec is used, we should not need too many threads.

Also, it would be great if we could print the percentage of allocated slabs and classes/pools during the benchmark (I did not implement this yet). This would allow us to see whether we can keep up with eviction and maintain the specified watermarks. Based on this info, we can later adjust the number of threads (we should probably use the lowest number that can keep up).

igchor commented 2 years ago

Also, I would like to hold off on running those benchmarks until we actually merge eviction/promotion into the develop branch. That PR could use some additional review: https://github.com/pmem/CacheLib/pull/90

@byrnedj there are some TODO items in that PR which we should address first. Also, we should do a thorough review of those changes: could you get someone extra involved in the review?
