sbtourist / Journal.IO

Journal.IO is a zero-dependency, fast and easy-to-use journal storage implementation.
Apache License 2.0

Compaction should create multiple batches #56

Open sbtourist opened 10 years ago

sbtourist commented 10 years ago

Currently, compaction creates one large batch per compacted data file. This is suboptimal: it makes hints less efficient, and it can amplify corruption cases because it creates a long CRC chain.

Instead, compaction should batch writes following the same batching scheme as the pre-compacted file: the non-compacted (surviving) locations of each original batch should be written together as one batch, preserving the same "grouping" as before.

Example:

Given a file with the following batches/locations:

B1(L2,L3,L4) B5(L6,L7) B8(L9)

If L4, L6 and L9 are deleted and the file is compacted, the result should be:

B1(L2,L3) B4(L7)

The B8 batch disappears because its only location, L9, was deleted.
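The grouping rule from the example can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions, not Journal.IO code: a batch is modeled as a plain list of location ids, the class `BatchPreservingCompaction` and its `compact` method are hypothetical names, and batch renumbering, CRCs and hints are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not part of Journal.IO: a batch is just the ordered
// list of location ids it contains.
class BatchPreservingCompaction {

    // Returns one output batch per input batch that still has live locations,
    // so the original grouping is preserved instead of merging everything
    // into a single large batch.
    static List<List<Integer>> compact(List<List<Integer>> batches, Set<Integer> deleted) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> batch : batches) {
            List<Integer> survivors = new ArrayList<>();
            for (Integer location : batch) {
                if (!deleted.contains(location)) {
                    survivors.add(location);
                }
            }
            // A batch whose locations were all deleted is dropped entirely.
            if (!survivors.isEmpty()) {
                result.add(survivors);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // B1(L2,L3,L4) B5(L6,L7) B8(L9), with L4, L6 and L9 deleted.
        List<List<Integer>> batches = Arrays.asList(
                Arrays.asList(2, 3, 4),
                Arrays.asList(6, 7),
                Arrays.asList(9));
        Set<Integer> deleted = new HashSet<>(Arrays.asList(4, 6, 9));

        System.out.println(compact(batches, deleted)); // prints [[2, 3], [7]]
    }
}
```

The two surviving groups correspond to B1(L2,L3) and B4(L7). A real implementation would also have to write a batch control record and checksum per group, which this sketch does not attempt.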