tursodatabase / libsql

libSQL is a fork of SQLite that is both Open Source, and Open Contributions.
https://turso.tech/libsql
MIT License
9.54k stars 252 forks source link

Segment compactor #1723

Closed MarinPostma closed 2 weeks ago

MarinPostma commented 2 weeks ago

This PR adds an MVP for the compactor process in the form of a shell. It's not feature-complete but can perform compaction according to basic strategies. We'll expand strategies and functionality later on

It is exposed as a new cli bin:

cargo r --bin compactor -- --help
   Compiling libsql-wal v0.1.0 (/Users/mpostma/Documents/code/turso/libsql/libsql-wal/libsql-wal)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 4.01s
     Running `target/debug/compactor --help`
Usage: compactor [OPTIONS] <COMMAND>

Commands:
  monitor  Register namespaces to monitor
  analyze  Analyze segments for a namespaces
  compact  Compact segments into bigger segments
  sync     Sync namespace metadata from remote storage
  restore  Restore namespace
  help     Print this message or the help of the given subcommand(s)

Options:
  -p, --path <PATH>                          [default: compactor]
      --enable-s3
      --cluster-id <CLUSTER_ID>              [env: LIBSQL_BOTTOMLESS_DATABASE_ID=]
      --s3-url <S3_URL>                      [env: LIBSQL_BOTTOMLESS_ENDPOINT=]
      --s3-access-key <S3_ACCESS_KEY>        [env: LIBSQL_BOTTOMLESS_AWS_SECRET_ACCESS_KEY=]
      --s3-access-key-id <S3_ACCESS_KEY_ID>  [env: LIBSQL_BOTTOMLESS_AWS_ACCESS_KEY_ID=]
      --s3-bucket <S3_BUCKET>                [env: LIBSQL_BOTTOMLESS_BUCKET=]
      --s3-region-id <S3_REGION_ID>          [env: LIBSQL_BOTTOMLESS_AWS_DEFAULT_REGION=]
  -h, --help                                 Print help

Usage

The compactor is designed to run as a separate process. Namespaces are registered with the compactor with the monitor command. Once a namespace is monitored, you can call sync to retrieve metadata about segments for this namespace stored on the remote storage. Once the compactor is synced, it can start to operate. All operations are performed on cached metadata.

Analyze

Analyze segments for a namespaces

Usage: compactor analyze [OPTIONS] <NAMESPACE>

Arguments:
  <NAMESPACE>

Options:
      --list-all  list all segments
  -h, --help      Print help

The analyze command returns general information about the segments stored for that namespace. The --list-all also list all the segments. e.g:

stats for db-0:
- segment count: 42
- last frame_no: 41449
- shortest restore path len: 1
- oldest segment: 1-1001 (2024-09-04 11:21:36 UTC)
- most recent segment: 1-41449 (2024-09-09 12:34:46 UTC)

The shortest restore path len tells us how many segments (at most) we would need to fetch to restore a db to the most recent state.

Compact

Compact segments into bigger segments

Usage: compactor compact [OPTIONS] --strategy <STRATEGY> <NAMESPACE>

Arguments:
  <NAMESPACE>

Options:
  -s, --strategy <STRATEGY>  [possible values: logarithmic, compact-all]
      --dry-run              prints the compaction plan, but doesn't perform it
  -h, --help                 Print help

The compact subcommand will attempt to compact a set of segments according to a compaction strategy. Compaction's first step is to find the smallest subset of segments that covers the whole frame space. This is the minimal restore path (MRP). The goal of compaction is to reduce the length of the MPR. Different strategies can be used to achieve this. Currently, there are two supported strategies for compaction:

Restore

Restore namespace

Usage: compactor restore [OPTIONS] <NAMESPACE> <OUT>

Arguments:
  <NAMESPACE>
  <OUT>

Options:
      --verify
  -h, --help    Print help

The restore command will restore the database to the given path.

Design

The compactor uses an SQLite database to store metadata about monitored namespaces. The sync call calls the underlying backend to get a list of all segments and store them in the SQLite database.

The analyze call builds a graph of the segments. A shortest path algorithm can be run on this graph to find the MRP.

We introduce a SegmentSet structure that represents a continuous and non-overlapping set of segments. The PartitionStrategy trait expresses how a SegmentSet can be partitioned into multiple SegmentSet.

The compact method will take a SegmentSet and compact them into a single segment and upload that segment using the configured storage Backend