GreptimeTeam / greptimedb

An open-source, cloud-native, unified time series database for metrics, logs and events with SQL/PromQL supported. Available on GreptimeCloud.
https://greptime.com/
Apache License 2.0

`reduce_runs` may execute for too long #5030

Closed MichaelScofield closed 17 hours ago

MichaelScofield commented 5 days ago

What type of bug is this?

Performance issue

What subsystems are affected?

Datanode

Minimal reproduce step

The method reduce_runs (in https://github.com/GreptimeTeam/greptimedb/blob/027284ed1b46a02375b9e08341f54e4a27ce0ed5/src/mito2/src/compaction/run.rs#L250) iterates over the combinations of `runs: Vec<SortedRun<T>>` with `let k = runs.len() + 1 - target;`. The number of combinations is C(n, k), which can be very large. For example, C(10, 4) = 330791175. Iterating over that many combinations can make the code run for an extremely long time.

I've observed this situation with one of our customers. They have a very large number of SST files whose time ranges overlap, so the runs passed to reduce_runs are large. See this log line (not in our code; I added it temporarily for debugging) emitted right before reduce_runs executes:

mito2::compaction::twcs region: 15964393439232(3717, 0), found_runs: 2883, max_runs: 4, max_files: 4

So the number of combinations over his sorted runs is C(2883, 5) = 1653997252262256, which is far too many to iterate. Execution effectively hangs forever.
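For reference, here is a minimal standalone sketch (not from the GreptimeDB codebase; the binomial helper is hypothetical) that reproduces this figure and shows why enumerating the combinations is hopeless:

```rust
/// Binomial coefficient C(n, k) via the multiplicative formula; each
/// intermediate product is exactly divisible, so u128 arithmetic stays exact.
fn binomial(n: u128, k: u128) -> u128 {
    let k = k.min(n - k);
    (0..k).fold(1u128, |c, i| c * (n - i) / (i + 1))
}

fn main() {
    // Matches the figure above: ~1.65e15 candidate combinations.
    assert_eq!(binomial(2883, 5), 1_653_997_252_262_256);
    // Even scanning 100 million combinations per second, a full
    // enumeration would take roughly 191 days.
    println!("C(2883, 5) = {}", binomial(2883, 5));
}
```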

What did you expect to see?

Code execution proceeds in a reasonable time.

What did you see instead?

Code execution hangs.

What operating system did you use?

Ubuntu

What version of GreptimeDB did you use?

main

Relevant log output and stack trace

No response

killme2008 commented 5 days ago

Where is reduce_runs? It would be better to add a link to the code in the issue.

MichaelScofield commented 5 days ago

Right now I have a quick-and-dirty workaround: take only the first 1024 combinations, like this:

```rust
...
runs.into_iter()
    .combinations(k)
    .take(1024) // here
    .map(|runs_to_merge| merge_all_runs(runs_to_merge))
...
```

It may not produce the minimal-penalty result, but code execution can proceed. I can polish it a bit, but WDYT about the idea? @evenyag @v0y4g3r
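For illustration, a self-contained sketch of the capping idea (assuming the itertools crate; plain u32 values stand in for SortedRun<T>):

```rust
use itertools::Itertools;

fn main() {
    // Stand-ins for runs: Vec<SortedRun<T>>.
    let runs: Vec<u32> = (0..30).collect();
    let k = 5;

    // Without the cap this would visit C(30, 5) = 142_506 combinations;
    // .take(1024) bounds the work regardless of runs.len().
    let evaluated = runs.into_iter().combinations(k).take(1024).count();
    assert_eq!(evaluated, 1024);
}
```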

v0y4g3r commented 4 days ago

Normally the number of combinations won't be that large; C(10, 4) is 210. The real problem is why there are 2883 runs. Can you provide the time ranges of all the files?

MichaelScofield commented 4 days ago

@v0y4g3r The SSTs were ingested directly, in the early days before this reduce_runs feature was added to our code.

v0y4g3r commented 4 days ago

> @v0y4g3r The SSTs were ingested directly, in the early days before this reduce_runs feature was added to our code.

Looks like ingesting a large number of files with overlapping time ranges is not the original use case of any existing compaction strategy; they all expect overlapping files to gradually get merged into one or several files as they age. We should come up with a simpler strategy that always merges them into one file if their total size does not exceed some threshold.
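A rough sketch of what such a fallback could look like (hypothetical types and API; not the actual compaction picker code):

```rust
/// Hypothetical file metadata; the real compaction code tracks more state.
struct FileMeta {
    size_bytes: u64,
}

/// If the overlapping files are collectively small, merge them all in one
/// pass instead of searching run combinations; otherwise defer to the
/// run-based strategy.
fn pick_merge_all(files: &[FileMeta], size_threshold: u64) -> Option<Vec<usize>> {
    let total: u64 = files.iter().map(|f| f.size_bytes).sum();
    (total <= size_threshold).then(|| (0..files.len()).collect())
}
```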

MichaelScofield commented 17 hours ago

Only early GreptimeDB deployments that didn't run compaction after ingesting SSTs are affected, so this is not a big issue.