cockroachdb / pebble

RocksDB/LevelDB inspired key-value database in Go

perf: investigate reenabling flush/compaction pacing #687

Open · petermattis opened this issue 4 years ago

petermattis commented 4 years ago

Flush/compaction pacing was disabled in https://github.com/cockroachdb/pebble/pull/431 because it appeared to hurt throughput. Some recent experiments with pebble bench ycsb show a throughput benefit to pacing (old is with pacing disabled, new is with pacing enabled).

name        old ops/sec  new ops/sec  delta
update_100    439k ± 2%    524k ± 2%  +19.46%  (p=0.008 n=5+5)

name        old p50      new p50      delta
update_100    0.30 ± 0%    0.30 ± 0%     ~     (all equal)

name        old p90      new p90      delta
update_100    1.66 ± 4%    1.40 ± 0%  -15.66%  (p=0.000 n=5+4)

name        old p99      new p99      delta
update_100    4.88 ± 4%    2.28 ± 5%  -53.28%  (p=0.008 n=5+5)

name        old pMax     new pMax     delta
update_100     160 ±62%      30 ± 7%  -81.01%  (p=0.016 n=5+4)

Interestingly, compaction pacing was very quickly disabled in these tests because compactions were not keeping up. The above tests ran for only 1 minute; longer runs need to be performed. We also need to look at scenarios involving mixed reads and writes.

A lot has changed since mid-December when #431 was merged, so perhaps the true culprit was not pacing. Or perhaps we've fixed other bottlenecks so that pacing is no longer a problem.
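
For context, pacing here means rate-limiting the bytes written by flushes and compactions so that background work competes less aggressively with foreground writes, with the limiter bypassed once compactions fall behind (which is what happened in the runs above). The following is a minimal, self-contained sketch of that idea only; it is not Pebble's pacer, and the type names, thresholds, and use of golang.org/x/time/rate are assumptions for illustration.

```go
// pacing_sketch.go: an illustrative sketch of flush/compaction pacing.
// Not Pebble code; names and thresholds are made up for this example.
package main

import (
	"context"
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// pacer throttles background writes to a target byte rate, but falls back
// to unthrottled writes when compaction debt says we are not keeping up.
type pacer struct {
	limiter      *rate.Limiter
	maxDebtBytes int64 // assumed threshold; above it, pacing is bypassed
}

func newPacer(bytesPerSec, maxDebtBytes int64) *pacer {
	return &pacer{
		limiter:      rate.NewLimiter(rate.Limit(bytesPerSec), int(bytesPerSec)),
		maxDebtBytes: maxDebtBytes,
	}
}

// maybePace waits long enough to keep n bytes of flush/compaction output
// under the target rate, unless compaction debt is already too high.
func (p *pacer) maybePace(ctx context.Context, n int, compactionDebt int64) error {
	if compactionDebt > p.maxDebtBytes {
		// Compactions are not keeping up; slowing them down further only
		// grows L0 and read amplification, so write at full speed instead.
		return nil
	}
	return p.limiter.WaitN(ctx, n)
}

func main() {
	p := newPacer(64<<20, 512<<20) // target 64 MB/s, bypass above 512 MB of debt
	start := time.Now()
	for i := 0; i < 8; i++ {
		// Pretend each iteration produces a 16 MB chunk of compaction output.
		if err := p.maybePace(context.Background(), 16<<20, 128<<20); err != nil {
			fmt.Println("pacing interrupted:", err)
			return
		}
	}
	fmt.Printf("wrote 128 MB of background output in %s\n", time.Since(start))
}
```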

Jira issue: PEBBLE-175

petermattis commented 4 years ago

Here is a 10m run with pacing enabled:

____optype__elapsed_____ops(total)___ops/sec(cum)__keys/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
update_100   600.0s      295399650       492331.3       492331.4      0.5      0.2      1.6      2.6     33.6

And a 10m run with pacing disabled:

____optype__elapsed_____ops(total)___ops/sec(cum)__keys/sec(cum)__avg(ms)__p50(ms)__p95(ms)__p99(ms)_pMax(ms)
update_100   600.0s      243737974       406229.5       406229.9      0.6      0.3      1.8      6.3    104.9

petermattis commented 4 years ago

Ah, pacing causes problems with a workload composed of reads and writes. Here is YCSB A, which is 50% reads and 50% updates. Old is with pacing disabled, new is with pacing enabled:

name     old ops/sec  new ops/sec  delta
ycsb/A     847k ± 1%    633k ± 1%  -25.27%  (p=0.000 n=10+9)

I have a strong suspicion the problem is that pacing increases read amplification. I added a bit of instrumentation to Iterator so that it can report read amplification. With pacing disabled, the average read amplification across iterators was 5.8; with pacing enabled, it was 8.7. This would also explain why pacing speeds up write-only workloads: we're allowing read amplification to increase, which is generally good for write performance.
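
To make the read-amplification intuition concrete: a read has to consult every memtable, every overlapping L0 unit, and each non-empty level below L0. Here is an illustrative back-of-the-envelope sketch; the lsmShape type and the sample numbers are invented for this example, not taken from the instrumentation above.

```go
// readamp_sketch.go: an illustrative (not Pebble-internal) way to count the
// read amplification a point lookup or iterator sees.
package main

import "fmt"

// lsmShape is a hypothetical summary of the LSM at iterator creation time.
type lsmShape struct {
	memtables   int   // mutable + immutable memtables
	l0Sublevels int   // overlapping L0 units that must each be consulted
	levelFiles  []int // file counts for L1..L6
}

// readAmp counts the structures a read must merge across.
func readAmp(s lsmShape) int {
	amp := s.memtables + s.l0Sublevels
	for _, files := range s.levelFiles {
		if files > 0 {
			amp++ // levels below L0 are non-overlapping: at most one file per key
		}
	}
	return amp
}

func main() {
	// With pacing enabled, flushes and compactions finish later, so more
	// memtables and L0 sublevels accumulate and read amplification grows.
	unpaced := lsmShape{memtables: 1, l0Sublevels: 1, levelFiles: []int{2, 8, 30, 0, 0, 120}}
	paced := lsmShape{memtables: 2, l0Sublevels: 3, levelFiles: []int{3, 9, 32, 0, 0, 120}}
	fmt.Println("read amp without pacing:", readAmp(unpaced)) // 6
	fmt.Println("read amp with pacing:   ", readAmp(paced))   // 9
}
```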

sumeerbhola commented 3 years ago

> With pacing disabled, the average read amplification across iterators was 5.8; with pacing enabled, it was 8.7. This would also explain why pacing speeds up write-only workloads: we're allowing read amplification to increase, which is generally good for write performance.

This makes me more convinced that any pacing that does not take into account whether we have an actual CPU or IO resource bottleneck is probably making unnecessary tradeoffs (related to #1329). In this example:

jbowens commented 2 years ago

Is this still worthwhile as an issue separate from #1329?

itsbilal commented 2 years ago

@jbowens I would say yes - it makes sense as a sub-issue of that one. This issue is more generally about testing the implications of enabling pacing on its own, while #1329 is about potentially using pacing in IO-constrained situations (assuming it is beneficial for prioritizing user writes or other compactions). I can see #1329 wrapping up without making much use of pacing, in which case this issue could stand as a remaining bit of exploration/work.
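
A rough sketch of the direction #1329 points at, purely as an illustration: gate any pacing decision on a measured resource signal, so background work is only throttled when it is actually competing with foreground traffic for disk or CPU. None of the names or thresholds below come from Pebble; they are assumptions.

```go
// resource_aware_pacing_sketch.go: a hypothetical decision function for
// pacing only under a real CPU or IO bottleneck.
package main

import "fmt"

// resourceSignal is a hypothetical snapshot of recent resource usage.
type resourceSignal struct {
	diskUtil float64 // fraction of provisioned disk bandwidth in use
	cpuUtil  float64 // fraction of CPU in use
}

// shouldPace returns true only when background writes are competing with
// foreground work for a genuinely scarce resource. With headroom available,
// compactions should run at full speed so read amplification stays low.
func shouldPace(sig resourceSignal) bool {
	const diskThreshold = 0.9
	const cpuThreshold = 0.9
	return sig.diskUtil > diskThreshold || sig.cpuUtil > cpuThreshold
}

func main() {
	fmt.Println(shouldPace(resourceSignal{diskUtil: 0.5, cpuUtil: 0.6}))  // false: run compactions unthrottled
	fmt.Println(shouldPace(resourceSignal{diskUtil: 0.95, cpuUtil: 0.6})) // true: throttle to protect foreground traffic
}
```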

github-actions[bot] commented 1 year ago

We have marked this issue as stale because it has been inactive for 18 months. If this issue is still relevant, removing the stale label or adding a comment will keep it active. Otherwise, we'll close it in 10 days to keep the issue queue tidy. Thank you for your contribution to Pebble!