Closed: isaac-io closed this 1 year ago
Pull request: https://github.com/speedb-io/speedb/pull/164
@Guyme The ticket was actually completed but then abandoned. So, I suggest creating a new ticket once we actually know what we want to do here, and how it fits with the other delayed-write activities we are working on.
@erez-speedb, please make sure there's no degradation with the branch dirty-mem-connect-wbm-to-global-delay. I'll run the performance scenario which shows the benefit.
Perf test passed: same performance and memory consumption as 2.4.1. All tests were done with WF disabled.
comparing main branch (e7e2de7d75cba503c301397a9681f861467c67d3) vs this branch (ba6a3de5336936e3d0b080be58bd75ec8c3749a9)
cmd:
./db_bench --compression_type=None -db=/data/ -num=200000000 -value_size=1000 -key_size=16 --delayed_write_rate=536870912 -report_interval_seconds=1 -max_write_buffer_number=4 -num_column_families=6 -histogram -max_background_compactions=8 -cache_size=8388608 -max_background_flushes=1 -bloom_bits=10 -benchmark_read_rate_limit=0 -benchmark_write_rate_limit=0 -report_file=fillrandom.csv --disable_wal=true --benchmarks=fillrandom,levelstats --db_write_buffer_size=1073741824 --allow_wbm_stalls=true --use_spdb_writes=false --initiate_wbm_flushes=false -write_buffer_size=134217728
results:
quantitative: almost 50% improvement in stability. ops/sec (std): main 89981, this branch 49333
about the same ops/sec (mean): main 255700, this branch 245854
platform: azure standard_L16s_V3 instance, "cpu": "Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz", "num_cpu": 16, "memory": "128G", "disk": single 1.8T NVMe
Currently the WriteBufferManager (WBM) does not allow slowing down writes in response to memory usage getting close to the prescribed limit. The only mechanism for letting flushes catch up is stopping writes completely by passing true for the allow_stall parameter of the WBM constructor. This can lead to oscillation between full write rate and complete stop, which is undesirable as it affects the latency of user writes.

Allow slowing down writes based on memory consumption by having the WBM signal the WriteController (WC) (there can be more than one) of the delay requirement. The delay requirement is stored in the WC as part of the Global Delay feature (#346), the same way a CF signals the WC that it has a delay requirement.

To enable this feature:
- allow_delays_and_stalls = true in the ctor of WriteBufferManager (previously this flag was named allow_stalls)
- use_dynamic_delay = true

The way the delay requirements are calculated is as follows:
The WBM reports a delay once its memory consumption passes a certain threshold of the quota. That threshold can be controlled by passing start_delay_percent to the ctor of the WBM. The default value is 70, which means that the WBM will start issuing delay requests once its memory consumption reaches 70% of its quota. The delay is linear throughout the range from the threshold to the max quota. The range from the start of delay to the quota is divided into 99 steps of delay (kMaxDelayedWriteFactor - 1). E.g. in the 1st step, the delay requirement will be 99/100 of max_delayed_write_rate(), and the last step (when the memory has almost reached the quota) will result in a delay requirement of 1/100 of max_delayed_write_rate(). max_delayed_write_rate() is the rate the user passed to delayed_write_rate (DBOptions), which can also be dynamically changed.

Note:
The stall logic in the WBM is redundant, since the WC already includes logic for stopping writes which can be reused. For the first phase (#423), keep using the stall logic in the WBM (ShouldStall() and WBMStallWrites()) and only add the mechanism for slowing down writes.
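The delay calculation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Speedb code: the function name SketchDelayedWriteRate is hypothetical, kMaxDelayedWriteFactor is assumed to be 100 (inferred from the 99 steps), and usage at or above the quota is simply clamped to the last step, since stopping writes is handled separately by the stall logic.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Assumed value: the description gives 99 steps as kMaxDelayedWriteFactor - 1.
constexpr uint64_t kMaxDelayedWriteFactor = 100;

// Hypothetical helper, not the Speedb API. Returns the delayed write rate in
// bytes/sec for the given memory usage, or std::nullopt when usage is below
// the start-of-delay threshold and no delay is required.
std::optional<uint64_t> SketchDelayedWriteRate(
    uint64_t mem_used, uint64_t quota, uint64_t start_delay_percent,
    uint64_t max_delayed_write_rate) {
  // Memory usage at which delays begin (70% of the quota by default).
  const uint64_t threshold = quota * start_delay_percent / 100;
  if (quota == 0 || threshold >= quota || mem_used < threshold) {
    return std::nullopt;
  }

  const uint64_t num_steps = kMaxDelayedWriteFactor - 1;  // 99 steps
  const uint64_t range = quota - threshold;
  // Clamp to the quota; past that point the stall logic applies instead.
  const uint64_t over = std::min(mem_used, quota) - threshold;

  // Map the usage linearly onto steps 1..99.
  const uint64_t step = std::min(num_steps, 1 + over * num_steps / range);

  // Step 1 yields 99/100 of the max rate, step 99 yields 1/100.
  return max_delayed_write_rate * (kMaxDelayedWriteFactor - step) /
         kMaxDelayedWriteFactor;
}
```

For example, with a quota of 1000 bytes, start_delay_percent of 70 and a max rate of 1000 bytes/sec, a usage of 699 yields no delay, 700 yields a rate of 990, 850 yields 500, and 999 yields 10.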