fjall-rs / fjall

LSM-based embeddable key-value storage engine written in safe Rust
https://fjall-rs.github.io/
Apache License 2.0
323 stars 13 forks source link

Allow more characters in partition name. #29

Closed RuofengX closed 8 months ago

RuofengX commented 8 months ago

motivation

I am seeking a lsm-tree based database as my in-memory database backend. Firstly, I found crate lsm_tree, but it doesn't contain auto flushing and partition, the only thing i can do is to store different table in one folder, that's fine. Then I find this crate, it exactly is the wheel that I am looking for. After a few minutes migration, a big problem occurs.

the problem

I was using a reflection of a type as the name of the lsm_tree file to store different types in different trees. Let's say, the dictionary-like generic api for my storage is Dense<K, V, D>, D is for delta. When lib user want to store a type like <u64,u64,()>, the name after reflection of this type is something like Dense<u64, u64, ()> As you can see, the name contains character <>, and blank space, which cannot used as the name of partition.

solution

remove the limit for partition name or, use a AsRef<[u8]> as the partition name api.

marvin-j97 commented 8 months ago

There are two issues with this:

(1) The name cannot contain :. That's just down to how a partition is referenced in sealed journals, where it looks like:

partition1:max_seqno
partition2:max_seqno

...

Having a : would break it. That's also why a PartitionKey is a Arc<str> and not a Arc<[u8]>. I don't really feel like supporting space feels right either.

(2) The partition name needs to be filename-safe, because it is stored in the file system as a directory, so on Linux that would disqualify / and on Windows it would disqualify a whole bunch of characters like ?, <, >, ...

There are two solutions though you could consider:

Manifest table

Keep track of partitions in a manifest table and map your typenames to internal IDs. Keys and values can be any arbitrary byte sequence. Then you can lookup the internal partition ID using your typename, like:

let manifest = keyspace.open_partition("manifest", Default::default())?;

// Register partition for "Dense<K, V, ()>"
manifest.insert("Dense<K, V, ()>", nanoid!())?;

// Get partition
let partition_name = manifest.get("Dense<K, V, ()>")?.expect("should exist");
let partition = keyspace.open_partition(partition_name, Default::default())?;

// Delete partition
manifest.remove("Dense<K, V, ()>")?;
keyspace.delete_partition(partition)?;

// ...

Considering you probably only have a handful of partitions, the manifest is very small, so it easily fits into memory for fast lookups.

For nicer usage, you could wrap it in a newtype, like

use fjall::{Keyspace, PartitionHandle};

#[derive(Clone)]
struct Manifest(PartitionHandle);

impl Manifest {
  pub fn new(keyspace: &Keyspace) -> fjall::Result<Self> {
    let partition = keyspace.open_partition("manifest", Default::default())?;
    Ok(Self(partition))
  }

   // ... implement get and set for typenames
}

This is something I'm doing in Smoltable: https://github.com/marvin-j97/smoltable/blob/main/server/src/manifest.rs

Encoding

Encode your typename into a partition name that is valid. One example I could think of would be:

// Pseudo code

let typename = "Dense<K, V, D>";
let partition_name = base64url(typename.as_bytes()); // or maybe MD5
let partition = keyspace.open_partition(partition_name, Default::default())?;

// ...
RuofengX commented 8 months ago

The base64url(url safe and no padding) is the stateless solution for these, with some trade off for readability(when debug maybe).

I'll try this amd close the issue, thanks for immediately reply.