tonbo-io / tonbo

A portable embedded database using Arrow.
https://tonbo.io
Apache License 2.0

Major compaction write amplification #151

Open crwen opened 6 days ago


Bug Report

[image]

What did you do?

As illustrated in the picture, when the mem-table is flushed to level 0, the flush triggers a major compaction.

The range [7-100] is involved in compaction twice, which causes write and space amplification.

My idea is to use a compact_status set to record the sstables that have already been involved in a compaction, and to remove them from the inputs of the next compaction:

        let mut compact_status = HashSet::new();

        while level < MAX_LEVEL - 2 {
            if !option.is_threshold_exceeded_major(version, level) {
                break;
            }
            let (mut meet_scopes_l, start_l, end_l) =
                Self::this_level_scopes(version, min, max, level);
            let (mut meet_scopes_ll, start_ll, end_ll) =
                Self::next_level_scopes(version, &mut min, &mut max, level, &meet_scopes_l)?;

            // keep only the sstables that have not yet been involved in a compaction
            meet_scopes_l.retain(|scope| compact_status.insert(scope.gen));
            meet_scopes_ll.retain(|scope| compact_status.insert(scope.gen));

            // do compaction ......

            level += 1;
        }
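
To make the `retain` + `insert` trick above easier to check in isolation, here is a minimal standalone sketch. It is not tonbo's actual code: `Scope` is a hypothetical stand-in with a single `id` field playing the role of `scope.gen`, and `retain_uncompacted` is a helper name made up for this example. The only library behaviour it relies on is that `HashSet::insert` returns `true` when the value was not already in the set.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an sstable scope; `id` plays the role of `scope.gen`.
#[derive(Debug, Clone, PartialEq)]
struct Scope {
    id: u64,
}

/// Hypothetical helper: removes every scope whose id is already recorded in
/// `compact_status`, and records the ids of the scopes that are kept.
fn retain_uncompacted(scopes: &mut Vec<Scope>, compact_status: &mut HashSet<u64>) {
    // `insert` returns `true` only for ids seen for the first time, so a single
    // pass both filters out repeats and marks the survivors as involved.
    scopes.retain(|scope| compact_status.insert(scope.id));
}
```

With one shared `compact_status`, filtering `[1, 2]` and then `[2, 3]` should leave the second list as `[3]`, because id 2 was recorded by the first call. Note that the set has to live outside the `while` loop, as in the snippet above, or it would forget earlier levels.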