Modifies the disk format layer to facilitate easier compaction. The specific changes to the disk format are detailed below.
Revision for Serialization/Deserialization:
Uses revisions to serialize and deserialize options on disk as manifest changes.
Addition of Compact Method:
Adds a compact method to handle compaction.
Vart Upgrade:
Upgrades Vart to resolve critical bugs related to the deletion of entries.
This is a breaking change: it alters the disk format to make compaction easier. The current disk format is as follows.
Tx struct encoded format:
TxHeader Encoded Format:
| Field | Length (bytes) | Description |
| --- | --- | --- |
| crc | 4 | |
| id | 8 | |
| ts | 8 | |
| version | 2 | |
| num_entries | 2 | |
| metadata_len | 2 | |
| metadata | Variable | |
| TxEntry[] | Variable | Array of TxEntry objects |
TxEntry Encoded Format:
| Field | Length (bytes) | Description |
| --- | --- | --- |
| metadata_len | 4 | |
| metadata | Variable | |
| key_len | 4 | |
| key | Variable | |
| value_len | 4 | |
| value | Variable | |
| crc32 | 4 | |
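To make the current TxEntry layout concrete, here is a minimal sketch of an encoder that writes the fields in the order listed above. The `TxEntry` struct, the `encode` method, and the local `crc32` helper are hypothetical names for illustration. Big-endian integers and a checksum that covers every preceding byte of the entry are assumptions, not details confirmed by this PR.

```rust
// Sketch only: byte order and CRC coverage are assumptions, not taken from the PR.

/// Bitwise CRC-32 (IEEE, reflected polynomial 0xEDB88320), kept local so the
/// example needs no external crate.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical in-memory form of one TxEntry.
struct TxEntry {
    metadata: Vec<u8>,
    key: Vec<u8>,
    value: Vec<u8>,
}

impl TxEntry {
    /// Writes: metadata_len | metadata | key_len | key | value_len | value | crc32
    fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        for field in [&self.metadata, &self.key, &self.value] {
            // 4-byte length prefix (assumed big-endian), then the raw bytes
            buf.extend_from_slice(&(field.len() as u32).to_be_bytes());
            buf.extend_from_slice(field);
        }
        // Trailing 4-byte checksum over everything written so far (assumed coverage)
        let checksum = crc32(&buf);
        buf.extend_from_slice(&checksum.to_be_bytes());
        buf
    }
}
```

Note that an entry encoded this way carries no id or timestamp of its own. Those live in the enclosing TxHeader, so an entry cannot be interpreted without first decoding the transaction that contains it.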
Compacting this format is hard because entries are nested inside a transaction and cannot be read individually to be compacted and merged. The proposed format is simpler: each entry is encoded and stored separately, so the log is just a series of records.
Record encoded format:
This makes it easy to read each entry on its own and to compact the log.
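As a rough illustration of why self-contained records help, the sketch below compacts a stream of records by keeping only the newest version of each key. The `Record` struct, its `ts` and `deleted` fields, and the "keep latest, drop tombstones" policy are all assumptions made for this example. They are not the record layout or the compaction strategy implemented in this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical self-contained record; the actual Record fields are not shown here.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    key: Vec<u8>,
    value: Vec<u8>,
    ts: u64,       // commit timestamp (assumed field)
    deleted: bool, // tombstone flag (assumed field)
}

/// Keeps the newest record per key and drops keys whose newest record is a tombstone.
/// Because each record carries everything needed to interpret it, the input can be
/// consumed one record at a time without decoding an enclosing transaction.
fn compact(records: impl IntoIterator<Item = Record>) -> Vec<Record> {
    let mut latest: BTreeMap<Vec<u8>, Record> = BTreeMap::new();
    for rec in records {
        // Replace the stored record only if this one is at least as new
        let is_newer = latest.get(&rec.key).map_or(true, |old| rec.ts >= old.ts);
        if is_newer {
            latest.insert(rec.key.clone(), rec);
        }
    }
    // Output is ordered by key; tombstoned keys are omitted
    latest.into_values().filter(|r| !r.deleted).collect()
}
```

With the transaction-based format, the same loop would first have to parse each TxHeader and walk its `num_entries` entries before any single key could be inspected.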