Closed byrnedj closed 1 year ago
How about running some benchmarks with a smaller evictorThreads number? When delayOps/ratePerSec is used, we should not need that many threads.
Also, it would be great if we could print the percent of allocated slabs and classes/pools during the benchmark (I did not implement this yet). This would allow us to see whether we can keep up with eviction and maintain the specified watermarks. Based on this info, we can later adjust the number of threads (we should probably use the lowest number that can keep up).
Also, I would like to hold off on running those benchmarks until we actually merge eviction/promotion into the develop branch. We could use some additional review for that: https://github.com/pmem/CacheLib/pull/90
@byrnedj there are some TODO items in that PR which we should address first. Also, we should do a thorough review of those changes: could you get someone extra involved in the review?
The goal of this experiment is to show the progress towards single-tier DRAM performance parity. We also investigate some initial background data movement parameters.
Background Movement
The branch to use is here: https://github.com/byrnedj/CacheLib/tree/mt-background-movers
The cache configuration parameters to test are the following:
- backgroundEvictorIntervalMilSec: 0, 10
- evictorThreads: 4, 12
- evictionHotnessThreshold: 40, 200
- lowEvictionAcWatermark: 91, 98
- highEvictionAcWatermark: 88, 95
Since highEvictionAcWatermark must be less than lowEvictionAcWatermark, the combinations are the following (all with backgroundEvictorIntervalMilSec = 10 and a DRAM:PMEM ratio of 1:4):
- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 4, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 91, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 88
- evictorThreads: 12, evictionHotnessThreshold: 200, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95
Then we have two baseline runs:
- backgroundEvictorIntervalMilSec: 0, with DRAM:PMEM set to 1:4
- backgroundEvictorIntervalMilSec: 0, evictorThreads: 4, evictionHotnessThreshold: 40, lowEvictionAcWatermark: 98, highEvictionAcWatermark: 95, with DRAM:PMEM set to 1:1
This gives a total of 14 experiments per workload configuration.
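
The 14 runs per workload configuration can be enumerated mechanically. The following is a minimal Python sketch, not part of the benchmark tooling: the parameter names and values come from the lists above, while the `dramToPmemRatio` key and the function names are hypothetical and only serve the illustration.

```python
from itertools import product

# Parameter values copied from the lists above.
EVICTOR_THREADS = [4, 12]
HOTNESS_THRESHOLDS = [40, 200]
LOW_WATERMARKS = [91, 98]
HIGH_WATERMARKS = [88, 95]


def background_configs():
    """Return the background-evictor runs (interval 10, DRAM:PMEM 1:4)."""
    configs = []
    for threads, hotness, low, high in product(
        EVICTOR_THREADS, HOTNESS_THRESHOLDS, LOW_WATERMARKS, HIGH_WATERMARKS
    ):
        # highEvictionAcWatermark must be less than lowEvictionAcWatermark,
        # so the (low=91, high=95) pair is skipped.
        if high >= low:
            continue
        configs.append({
            "backgroundEvictorIntervalMilSec": 10,
            "evictorThreads": threads,
            "evictionHotnessThreshold": hotness,
            "lowEvictionAcWatermark": low,
            "highEvictionAcWatermark": high,
            "dramToPmemRatio": "1:4",  # hypothetical key, for illustration only
        })
    return configs


def baseline_configs():
    """Return the two baseline runs with the background evictor disabled."""
    return [
        {"backgroundEvictorIntervalMilSec": 0, "dramToPmemRatio": "1:4"},
        {
            "backgroundEvictorIntervalMilSec": 0,
            "evictorThreads": 4,
            "evictionHotnessThreshold": 40,
            "lowEvictionAcWatermark": 98,
            "highEvictionAcWatermark": 95,
            "dramToPmemRatio": "1:1",
        },
    ]


if __name__ == "__main__":
    runs = background_configs() + baseline_configs()
    for index, config in enumerate(runs, start=1):
        print(index, config)
```

Filtering the full cross product keeps the watermark constraint in one place, so adding another watermark value later would not require rewriting the list by hand.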
The workload configuration parameters are:
- opRatePerSec: 500000, none
- htBucketPower: 28 (for both workloads)
- 48 threads and 5M ops/thread
- workload: leader_obj, follower_obj
There are 4 different workload configurations, so the total number of background evictor experiments is 56.
Additional baselines
For comparison we need to test the additional baseline systems for each workload configuration:
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using DRAM:PMEM 1:1
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using DRAM:PMEM 1:4
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using PMEM only
- https://github.com/igchor/CacheLib-1/tree/issue75_rebased using DRAM only
There are 4 different workload configurations, so the total number of additional baseline system experiments is 16.
In total there are 72 experiments. Each should last around 5 minutes, so the total execution time is estimated to be 6 hours.
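
The experiment count and time estimate can be double-checked with a few lines of Python. This is only a sketch of the arithmetic: the numbers are taken from the text above, and the variable names are made up for the illustration.

```python
from itertools import product

# Workload dimensions from the text: two op rates (None = no rate limit)
# and two workloads give 4 workload configurations.
op_rates = [500000, None]
workloads = ["leader_obj", "follower_obj"]
workload_configs = list(product(op_rates, workloads))

EVICTOR_RUNS_PER_WORKLOAD = 14  # 12 combinations + 2 baseline runs
BASELINE_SYSTEMS = 4            # DRAM:PMEM 1:1, DRAM:PMEM 1:4, PMEM only, DRAM only
MINUTES_PER_RUN = 5             # rough per-experiment estimate from the text

evictor_runs = len(workload_configs) * EVICTOR_RUNS_PER_WORKLOAD
baseline_runs = len(workload_configs) * BASELINE_SYSTEMS
total_runs = evictor_runs + baseline_runs
total_hours = total_runs * MINUTES_PER_RUN / 60

print(f"background evictor runs: {evictor_runs}")
print(f"additional baseline runs: {baseline_runs}")
print(f"total runs: {total_runs}, estimated hours: {total_hours}")
```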