Closed simerplaha closed 3 years ago
Compaction currently performs the merge in parallel on each Segment, then persists the merged Segments, and then commits.
Parallel merge on large compaction jobs leads to heap overflow, which results in slower compaction. Persisting Segments while the merge is still in progress (partially complete) is required so that we free memory as soon as possible.
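To illustrate the idea of persisting while the merge is only partially complete, here is a minimal Python sketch. It is not the project's implementation: `merge_and_persist`, the `persist` callback and `max_segment_entries` are hypothetical names, and the sketch ignores how duplicate keys or updates are resolved during a real merge.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

KeyValue = Tuple[str, str]


def merge_and_persist(
    sorted_inputs: List[Iterable[KeyValue]],
    persist: Callable[[List[KeyValue]], None],
    max_segment_entries: int,
) -> int:
    """Merge sorted inputs and persist each output Segment as soon as it fills.

    Returns the number of Segments handed to `persist`.
    """
    buffer: List[KeyValue] = []
    persisted = 0
    # heapq.merge is lazy: it pulls entries from each input only as needed,
    # so the full merged result is never held in memory at once.
    for entry in heapq.merge(*sorted_inputs, key=lambda kv: kv[0]):
        buffer.append(entry)
        if len(buffer) >= max_segment_entries:
            persist(buffer)
            persisted += 1
            # Start a fresh buffer so the persisted one can be reclaimed.
            buffer = []
    if buffer:
        # Persist the final, partially filled Segment.
        persist(buffer)
        persisted += 1
    return persisted
```

With this shape, at most one output buffer of `max_segment_entries` entries is held in memory per merge, instead of every merged Segment waiting for the whole job to finish.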
A CompactionIO Actor is required to control the concurrency of read and write IO performed during Compaction. Concurrent IO on a machine with a single disk will have a significant performance impact, so using an Actor we will be able to control concurrency by allowing only sequential IO on a single disk and concurrent IO on multiple disks (using the file path).
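A rough Python sketch of the concurrency rule described above follows. It approximates an Actor's mailbox with one single-threaded executor per disk rather than a real Actor, and every name in it (`CompactionIO.submit`, `disk_of`, `device_of`) is an assumption made for illustration. Deriving the disk from the device id of the file's parent directory is also only one possible way to map a file path to a disk.

```python
import os
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict, Hashable, TypeVar

T = TypeVar("T")


def device_of(path: str) -> Hashable:
    """Identify the disk by the device id of the file's parent directory."""
    return os.stat(os.path.dirname(os.path.abspath(path))).st_dev


class CompactionIO:
    """Runs IO tasks sequentially per disk and concurrently across disks."""

    def __init__(self, disk_of: Callable[[str], Hashable] = device_of) -> None:
        self._disk_of = disk_of
        self._lock = threading.Lock()
        # One single-threaded worker per disk acts as that disk's mailbox.
        self._workers: Dict[Hashable, ThreadPoolExecutor] = {}

    def submit(self, path: str, io_task: Callable[[], T]) -> "Future[T]":
        """Queue `io_task` behind any earlier task targeting the same disk."""
        disk = self._disk_of(path)
        with self._lock:
            worker = self._workers.get(disk)
            if worker is None:
                worker = ThreadPoolExecutor(max_workers=1)
                self._workers[disk] = worker
            # submit() only enqueues, so holding the lock here is cheap.
            return worker.submit(io_task)

    def shutdown(self) -> None:
        """Wait for queued IO to finish and stop all per-disk workers."""
        with self._lock:
            workers = list(self._workers.values())
            self._workers.clear()
        for worker in workers:
            worker.shutdown(wait=True)
```

Because each disk has exactly one worker thread, two tasks for files on the same disk can never overlap, while tasks for different disks proceed in parallel.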