surrealdb / surrealkv

A low-level, versioned, embedded, ACID-compliant, key-value database for Rust
https://surrealdb.com
Apache License 2.0

Change disk format for compaction #45

Closed arriqaaq closed 4 months ago

arriqaaq commented 4 months ago

Description

This PR includes the following changes:

  1. Disk Format Layer Changes:
    • Modifies the disk format layer to facilitate easier compaction. The specific changes to the disk format are detailed below.

  2. Revision for Serialization/Deserialization:
    • Implements the use of revisions to serialize and deserialize options on disk as manifest changes.

  3. Addition of Compact Method:
    • Adds a compact method to handle compaction.

  4. Vart Upgrade:
    • Upgrades Vart to resolve critical bugs related to the deletion of entries.

This is a breaking change: the disk format is changed to make compaction easier. The current disk format is as follows.

Tx struct encoded format:

TxHeader Encoded Format:

Field         Length (bytes)  Description
crc           4
id            8
ts            8
version       2
num_entries   2
metadata_len  2
metadata      Variable
TxEntry[]     Variable        Array of TxEntry objects

TxEntry Encoded Format:

Field         Length (bytes)  Description
metadata_len  4
metadata      Variable
key_len       4
key           Variable
value_len     4
value         Variable
crc32         4

Compacting this format is hard because entries are nested inside a transaction and cannot be read individually to compact and merge. The new proposed format is simple: each entry is formatted and stored separately, as a series of records.

Record fields:

Record
crc32: u32
version: u16
tx_id: u64
ts: u64
md: Option
key_len: u32
key: Bytes
value_len: u32
value: Bytes

Record encoded format:

+----------+------------+------------+----------+-----------------+------------+------------+------+--------------+-------+
| crc32(4) | version(2) |  tx_id(8)  |  ts(8)   | metadata_len(2) |  metadata  | key_len(4) | key  | value_len(4) | value |
+----------+------------+------------+----------+-----------------+------------+------------+------+--------------+-------+
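
As a rough illustration of the layout above, here is a minimal sketch of encoding and decoding one record. Several details are assumptions that the PR does not state: integers are written big-endian, the CRC covers every byte after the crc32 field, and metadata is treated as opaque bytes where a zero length means "no metadata". The names `encode`, `decode`, `take`, and the `read_*` helpers are hypothetical and are not surrealkv's API.

```rust
// Sketch only: byte order and CRC coverage are assumptions, not taken from the PR.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    version: u16,
    tx_id: u64,
    ts: u64,
    metadata: Vec<u8>, // opaque bytes; empty means "no metadata" (assumption)
    key: Vec<u8>,
    value: Vec<u8>,
}

/// Bitwise CRC-32 (reflected IEEE polynomial), kept dependency-free for the sketch.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Serializes a record in the field order shown in the diagram.
fn encode(r: &Record) -> Vec<u8> {
    let mut body = Vec::new();
    body.extend_from_slice(&r.version.to_be_bytes());
    body.extend_from_slice(&r.tx_id.to_be_bytes());
    body.extend_from_slice(&r.ts.to_be_bytes());
    body.extend_from_slice(&(r.metadata.len() as u16).to_be_bytes());
    body.extend_from_slice(&r.metadata);
    body.extend_from_slice(&(r.key.len() as u32).to_be_bytes());
    body.extend_from_slice(&r.key);
    body.extend_from_slice(&(r.value.len() as u32).to_be_bytes());
    body.extend_from_slice(&r.value);

    let mut out = crc32(&body).to_be_bytes().to_vec();
    out.extend_from_slice(&body);
    out
}

/// Returns the next `n` bytes and advances `pos`; `None` if the buffer is too short.
fn take<'a>(buf: &'a [u8], pos: &mut usize, n: usize) -> Option<&'a [u8]> {
    let end = pos.checked_add(n)?;
    let slice = buf.get(*pos..end)?;
    *pos = end;
    Some(slice)
}

fn read_u16(buf: &[u8], pos: &mut usize) -> Option<u16> {
    let mut b = [0u8; 2];
    b.copy_from_slice(take(buf, pos, 2)?);
    Some(u16::from_be_bytes(b))
}

fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let mut b = [0u8; 4];
    b.copy_from_slice(take(buf, pos, 4)?);
    Some(u32::from_be_bytes(b))
}

fn read_u64(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut b = [0u8; 8];
    b.copy_from_slice(take(buf, pos, 8)?);
    Some(u64::from_be_bytes(b))
}

/// Decodes one record from the front of `buf`, returning it with the number of
/// bytes consumed. Returns `None` on truncated input or a CRC mismatch.
fn decode(buf: &[u8]) -> Option<(Record, usize)> {
    let mut pos = 0;
    let stored_crc = read_u32(buf, &mut pos)?;
    let version = read_u16(buf, &mut pos)?;
    let tx_id = read_u64(buf, &mut pos)?;
    let ts = read_u64(buf, &mut pos)?;
    let md_len = read_u16(buf, &mut pos)? as usize;
    let metadata = take(buf, &mut pos, md_len)?.to_vec();
    let key_len = read_u32(buf, &mut pos)? as usize;
    let key = take(buf, &mut pos, key_len)?.to_vec();
    let value_len = read_u32(buf, &mut pos)? as usize;
    let value = take(buf, &mut pos, value_len)?.to_vec();
    if crc32(&buf[4..pos]) != stored_crc {
        return None;
    }
    Some((Record { version, tx_id, ts, metadata, key, value }, pos))
}
```

Because `decode` reports how many bytes it consumed, a reader can call it repeatedly on the remainder of a buffer and walk the log one record at a time, without any enclosing transaction header.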

This will make it easy to read each entry individually and perform compaction.
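
To show why self-contained records help, here is a minimal sketch of a compaction pass over already-decoded records. It assumes a policy of keeping only the newest record per key (highest `ts`, with later records winning ties). The PR does not describe the actual policy, and a versioned store may need to retain more than one version. `ScannedRecord` and `compact` are hypothetical names, and the struct is trimmed to the fields the sketch needs.

```rust
use std::collections::BTreeMap;

// Hypothetical, trimmed view of a decoded record.
#[derive(Debug, Clone, PartialEq)]
struct ScannedRecord {
    tx_id: u64,
    ts: u64,
    key: Vec<u8>,
    value: Vec<u8>,
}

/// Keeps one record per key: the one with the highest `ts`.
/// On equal timestamps the record seen later in the scan wins (assumption).
fn compact(records: impl IntoIterator<Item = ScannedRecord>) -> Vec<ScannedRecord> {
    let mut latest: BTreeMap<Vec<u8>, ScannedRecord> = BTreeMap::new();
    for rec in records {
        // A record is stale if a strictly newer one for the same key is already kept.
        let is_stale = latest.get(&rec.key).map_or(false, |old| old.ts > rec.ts);
        if !is_stale {
            latest.insert(rec.key.clone(), rec);
        }
    }
    // BTreeMap yields the survivors in key order.
    latest.into_values().collect()
}
```

Each record carries its own `tx_id` and `ts`, so the pass needs nothing beyond the record itself to decide whether to keep it. That is the property the old nested layout lacked.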