Little-Wallace opened this issue 2 years ago
cc @Li0k
We found that the compaction jobs in level 6 cost most of the throughput, and the following flamegraph shows that LZ4 compression costs most of the CPU. Since we use dynamic level compression, when the number of levels is no more than 4 we only compress data in level 6, while level 4 and level 5 do not compress any data.
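As a sketch of what dynamic level compression implies (the helper name below is illustrative, not an actual RisingWave API; the behavior mirrors RocksDB-style bottommost-only compression):

```rust
// Hypothetical sketch: under dynamic level compression only the
// bottommost level pays the compression cost; upper levels store
// their data uncompressed.
fn compression_for_level(level: usize, bottommost_level: usize) -> &'static str {
    if level == bottommost_level {
        "lz4" // only the last level (level 6 here) is compressed
    } else {
        "none" // e.g. level 4 and level 5 stay uncompressed
    }
}
```

This is why the bottommost-level compaction dominates CPU: every byte that reaches level 6 goes through LZ4, while rewrites within levels 4 and 5 do not.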
For compaction without compression, I found that most of the CPU is spent in the S3 SDK.
In this flamegraph, we found that SHA-256 costs a lot of CPU. ~~Maybe we should migrate from SHA-256 to a cheaper checksum algorithm, like CRC32C.~~
I found that the cost of SHA-256 comes from request authorization (payload signing), ref this doc.
Maybe we should try UNSIGNED-PAYLOAD, ref this doc.
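For context, AWS SigV4 carries the payload hash in the `x-amz-content-sha256` header, and the literal string `UNSIGNED-PAYLOAD` tells S3 to skip payload verification. A minimal sketch of the choice (the helper function is hypothetical, not an SDK API):

```rust
// Hypothetical helper: pick the value of the x-amz-content-sha256
// header. With payload signing the client must hex-encode a SHA-256
// of the entire request body (the expensive part seen in the
// flamegraph); with UNSIGNED-PAYLOAD that per-request hash is skipped.
fn content_sha256_value(sign_payload: bool, payload_hash_hex: &str) -> String {
    if sign_payload {
        payload_hash_hex.to_string()
    } else {
        "UNSIGNED-PAYLOAD".to_string()
    }
}
```

The trade-off is losing the end-to-end payload integrity check that the signed hash provides (over HTTPS the transport is still protected by TLS).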
After we increase the `level_multiplier` of the lower layers, the write bytes of those levels are reduced as expected.

[Grafana panel: compaction_throughput] We found that the data written in the bench process increased, and the compaction throughput in the bench process does not decrease.

- branch (18:30 ~ 19:10)
- main (19:40 ~ 20:20)

We can observe that the data accumulates in the upper layers (in this test case, L4 -> L3), and the compaction throughput in the bench process does not decrease.
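To see why raising the multiplier shrinks the upper levels (and so shifts where data accumulates), here is a sketch of dynamic level sizing. The formula is an assumption for illustration, mirroring RocksDB's `level_compaction_dynamic_level_bytes` behavior, not RisingWave's exact implementation:

```rust
// Assumed dynamic level sizing: the target size of level L is the
// bottommost level's size divided by multiplier^(bottommost - L).
// A larger multiplier therefore shrinks every level above the
// bottommost one, so less data is rewritten into those levels.
fn level_target_size(bottom_bytes: u64, multiplier: u64, bottommost: u32, level: u32) -> u64 {
    bottom_bytes / multiplier.pow(bottommost - level)
}
```

For example, with a 1 GB bottommost level, raising the multiplier from 10 to 20 halves the level-5 target, which matches the observation that write bytes of the lower levels drop while data piles up higher in the tree.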
In the test scenario with a single compactor, read performance will be affected by compaction. Maybe we can consider restricting some behaviors of the compactor:
```rust
// Skip this pick if the target level already has too many pending
// compaction tasks.
let target_pending_task_count =
    level_handlers[target_level].pending_tasks_ids().len();
if target_pending_task_count >= 4 {
    tracing::info!(
        "pick_compaction pending_task deny select_level {} target_level {} target_pending_task_count {}",
        select_level, target_level, target_pending_task_count
    );
    continue;
}

// Estimate the write amplification of this task in percent:
// (selected input + overlapping target data) / selected input.
let select_size: u64 = ret.input_levels[0]
    .table_infos
    .iter()
    .map(|table_info| table_info.get_file_size())
    .sum();
let target_size: u64 = ret.input_levels[1]
    .table_infos
    .iter()
    .map(|table_info| table_info.get_file_size())
    .sum();
let write_amplification = (select_size + target_size) * 100 / select_size;
tracing::info!(
    "pick_compaction select_level {} target_level {} select_size {} target_size {} write_amplification {}",
    select_level, target_level, select_size, target_size, write_amplification
);
// Deny tasks whose estimated write amplification exceeds 300%.
if write_amplification > 300 {
    tracing::info!(
        "pick_compaction write_amplification deny select_level {} target_level {} select_size {} target_size {} write_amplification {}",
        select_level, target_level, select_size, target_size, write_amplification
    );
    continue;
}
```
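For intuition, the 300% threshold admits a task only when the overlapping data in the target level is at most twice the selected input. A minimal standalone sketch of the same formula:

```rust
// Same write-amplification estimate as the picker above, in percent.
// Note: assumes select_size > 0, as in the picker (an empty input
// level would not produce a compaction task).
fn write_amplification_pct(select_size: u64, target_size: u64) -> u64 {
    (select_size + target_size) * 100 / select_size
}
```

E.g. selecting 100 MB that overlaps 250 MB in the target gives 350%, so the task is denied; overlapping 150 MB gives 250% and the task is allowed.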
[Grafana panels: Compaction all, Compaction bottommost-level write, Compaction write throughput, Node CPU — main (18:00 ~ 18:30) vs branch (17:25 ~ 17:50)]
**Is your feature request related to a problem? Please describe.**
As title. We can see that most compaction flow is in the bottommost level.

**Describe the solution you'd like**
Adjust `max_bytes_for_level_multiplier` dynamically.