Closed RuofengX closed 8 months ago
There are two issues with this:
(1) The name cannot contain :
. That's just down to how a partition is referenced in sealed journals, where it looks like:
partition1:max_seqno
partition2:max_seqno
...
Having a :
would break it. That's also why a PartitionKey
is a Arc<str>
and not a Arc<[u8]>
. I don't really feel like supporting space feels right either.
(2) The partition name needs to be filename-safe, because it is stored in the file system as a directory, so on Linux that would disqualify /
and on Windows it would disqualify a whole bunch of characters like ?
, <
, >
, ...
There are two solutions though you could consider:
Keep track of partitions in a manifest table and map your typenames to internal IDs. Keys and values can be any arbitrary byte sequence. Then you can lookup the internal partition ID using your typename, like:
let manifest = keyspace.open_partition("manifest", Default::default())?;
// Register partition for "Dense<K, V, ()>"
manifest.insert("Dense<K, V, ()>", nanoid!())?;
// Get partition
let partition_name = manifest.get("Dense<K, V, ()>")?.expect("should exist");
let partition = keyspace.open_partition(partition_name, Default::default())?;
// Delete partition
manifest.remove("Dense<K, V, ()>")?;
keyspace.delete_partition(partition)?;
// ...
Considering you probably only have a handful of partitions, the manifest is very small, so it easily fits into memory for fast lookups.
For nicer usage, you could wrap it in a newtype, like
use fjall::{Keyspace, PartitionHandle};
#[derive(Clone)]
struct Manifest(PartitionHandle);
impl Manifest {
pub fn new(keyspace: &Keyspace) -> fjall::Result<Self> {
let partition = keyspace.open_partition("manifest", Default::default())?;
Ok(Self(partition))
}
// ... implement get and set for typenames
}
This is something I'm doing in Smoltable
: https://github.com/marvin-j97/smoltable/blob/main/server/src/manifest.rs
Encode your typename into a partition name that is valid. One example I could think of would be:
// Pseudo code
let typename = "Dense<K, V, D>";
let partition_name = base64url(typename.as_bytes()); // or maybe MD5
let partition = keyspace.open_partition(partition_name, Default::default())?;
// ...
The base64url(url safe and no padding) is the stateless solution for these, with some trade off for readability(when debug maybe).
I'll try this amd close the issue, thanks for immediately reply.
motivation
I am seeking a lsm-tree based database as my in-memory database backend. Firstly, I found crate lsm_tree, but it doesn't contain auto flushing and partition, the only thing i can do is to store different table in one folder, that's fine. Then I find this crate, it exactly is the wheel that I am looking for. After a few minutes migration, a big problem occurs.
the problem
I was using a reflection of a type as the name of the lsm_tree file to store different types in different trees. Let's say, the dictionary-like generic api for my storage is
Dense<K, V, D>
, D is for delta. When lib user want to store a type like <u64,u64,()>, the name after reflection of this type is something like Dense<u64, u64, ()> As you can see, the name contains character <>, and blank space, which cannot used as the name of partition.solution
remove the limit for partition name or, use a AsRef<[u8]> as the partition name api.