cberner / redb

An embedded key-value database in pure Rust
https://www.redb.org
Apache License 2.0

Sketch of a declarative query library for redb #793

Closed casey closed 1 week ago

casey commented 3 months ago

I was inspired by axum to try to write a library for easily writing redb queries, called redbql. The "ql" is for query library, since it's not really a language.

It introduces a few new traits. Query is a function that runs against a read transaction. Statement is a function that runs against a write transaction.

StatementArg is something that can produce a value from a write transaction, and QueryArg is something that can produce a value from a read transaction.
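For orientation, here is a minimal sketch of what the read-side traits might look like, together with a manual Query implementation. The trait shapes are inferred from the blanket impls in this post rather than copied from redbql. `DbError` and `ReadTransaction` are stand-ins for the redb types so the sketch compiles on its own, and `Names`, `GetName` and `demo` are hypothetical names.

```rust
use std::collections::HashMap;

// Stand-in for redb::Error, so the sketch compiles without the redb crate.
#[derive(Debug, PartialEq)]
pub struct DbError(pub String);

// Stand-in for redb::ReadTransaction: a read-only snapshot of a single table.
pub struct ReadTransaction {
    pub names: HashMap<String, String>,
}

// Something that can be produced from a read transaction, e.g. an opened table.
pub trait QueryArg<'a>: Sized {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError>;
}

// A computation that runs against a read transaction.
// `Args` is a marker for the argument tuple; it keeps blanket impls for
// functions of different arity from overlapping.
pub trait Query<'a, Args> {
    type Output;
    type Error;
    fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error>;
}

// Hypothetical read-only view of the `names` table.
pub struct Names<'a>(pub &'a HashMap<String, String>);

impl<'a> QueryArg<'a> for Names<'a> {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError> {
        Ok(Names(&tx.names))
    }
}

// A manually implemented query: the struct carries the lookup key.
pub struct GetName {
    pub key: String,
}

impl<'a> Query<'a, ()> for GetName {
    type Output = Option<String>;
    type Error = DbError;

    fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error> {
        // The manual impl opens the table itself.
        let names = Names::from_tx(tx)?;
        Ok(names.0.get(&self.key).cloned())
    }
}

// Builds a one-row snapshot and runs the manual query against it.
pub fn demo() -> Result<Option<String>, DbError> {
    let mut names = HashMap::new();
    names.insert("james".to_string(), "smith".to_string());
    let tx = ReadTransaction { names };
    let query = GetName { key: "james".to_string() };
    query.run(&tx)
}
```

The write side (Statement / StatementArg) would presumably mirror this with a write transaction in place of the read transaction.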

You can write a Query or Statement implementation manually, but the nice thing is that it can be auto-implemented for any function whose arguments all implement QueryArg / StatementArg:

impl<'a, F, O, E, T0> Query<'a, (T0,)> for F
where
  F: FnOnce(T0) -> Result<O, E>,
  T0: QueryArg<'a>,
  E: From<redb::Error>,
{
  type Output = O;
  type Error = E;

  fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error> {
    let t0 = T0::from_tx(tx)?;
    self(t0)
  }
}

impl<'a, F, O, E, T0, T1> Query<'a, (T0, T1)> for F
where
  F: FnOnce(T0, T1) -> Result<O, E>,
  T0: QueryArg<'a>,
  T1: QueryArg<'a>,
  E: From<redb::Error>,
{
  type Output = O;
  type Error = E;

  fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error> {
    let t0 = T0::from_tx(tx)?;
    let t1 = T1::from_tx(tx)?;
    self(t0, t1)
  }
}
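The two impls above differ only in arity, so they could plausibly be generated by a macro, similar in spirit to how axum handles handler functions. The following is a rough sketch of that idea, not part of redbql: `impl_query!` is a hypothetical macro, `DbError` and `ReadTransaction` are stand-ins for the redb types, and `Names`, `Count` and `get_with_count` exist only for the demonstration.

```rust
use std::collections::HashMap;

// Stand-ins for redb::Error and redb::ReadTransaction (assumptions).
#[derive(Debug, PartialEq)]
pub struct DbError(pub String);

pub struct ReadTransaction {
    pub names: HashMap<String, String>,
}

pub trait QueryArg<'a>: Sized {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError>;
}

pub trait Query<'a, Args> {
    type Output;
    type Error;
    fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error>;
}

// Generates one blanket impl per arity. Each invocation lists the
// type parameters used for the function's arguments.
macro_rules! impl_query {
    ($($ty:ident),+) => {
        impl<'a, F, O, E, $($ty),+> Query<'a, ($($ty,)+)> for F
        where
            F: FnOnce($($ty),+) -> Result<O, E>,
            $($ty: QueryArg<'a>,)+
            E: From<DbError>,
        {
            type Output = O;
            type Error = E;

            // The type parameter names double as variable names here.
            #[allow(non_snake_case)]
            fn run(self, tx: &'a ReadTransaction) -> Result<O, E> {
                $(let $ty = $ty::from_tx(tx)?;)+
                self($($ty),+)
            }
        }
    };
}

impl_query!(T0);
impl_query!(T0, T1);
impl_query!(T0, T1, T2);

// Hypothetical argument type: a read-only view of the `names` table.
pub struct Names<'a>(pub &'a HashMap<String, String>);

impl<'a> QueryArg<'a> for Names<'a> {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError> {
        Ok(Names(&tx.names))
    }
}

// Hypothetical argument type: the number of rows in the table.
pub struct Count(pub usize);

impl<'a> QueryArg<'a> for Count {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError> {
        Ok(Count(tx.names.len()))
    }
}

// A plain function becomes a two-argument query through the generated impl.
fn get_with_count(names: Names, count: Count) -> Result<(Option<String>, usize), DbError> {
    Ok((names.0.get("james").cloned(), count.0))
}

pub fn demo() -> Result<(Option<String>, usize), DbError> {
    let mut names = HashMap::new();
    names.insert("james".to_string(), "smith".to_string());
    let tx = ReadTransaction { names };
    get_with_count.run(&tx)
}
```

Because each arity implements Query with a different argument tuple, the generated impls do not overlap, and the compiler picks the one matching the function's signature.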

And a boilerplate macro for declaring tables with corresponding QueryArg and StatementArg types:

#[macro_export]
macro_rules! table {
  ($ro:ident, $rw:ident, $name:ident, $key:ty, $value:ty) => {
    struct $rw<'a>(::redb::Table<'a, $key, $value>);

    const $name: ::redb::TableDefinition<'static, $key, $value> =
      ::redb::TableDefinition::new(stringify!($name));

    impl<'a> StatementArg<'a> for $rw<'a> {
      fn from_tx(tx: &'a ::redb::WriteTransaction) -> Result<Self, ::redb::Error> {
        Ok(Self(tx.open_table($name)?))
      }
    }

    struct $ro(::redb::ReadOnlyTable<$key, $value>);

    impl<'a> QueryArg<'a> for $ro {
      fn from_tx(tx: &'a ::redb::ReadTransaction) -> Result<Self, ::redb::Error> {
        Ok(Self(tx.open_table($name)?))
      }
    }
  };
}

And finally, what this buys us is that we can write functions that implement our queries, and the tables are opened for us:

fn initialize(mut names: NamesMut) -> Result<(), redb::Error> {
  names.0.insert("james", "smith")?;
  Ok(())
}

fn get(names: Names) -> Result<Option<String>, redb::Error> {
  Ok(names.0.get("james")?.map(|guard| guard.value().into()))
}

let dir = tempfile::TempDir::new().unwrap();

let database = Database::create(dir.path().join("database.redb")).unwrap();

{
  let tx = database.begin_write().unwrap();

  initialize.execute(&tx).unwrap();

  tx.commit().unwrap();
}

{
  let tx = database.begin_read().unwrap();

  let result = get.run(&tx).unwrap();

  assert_eq!(result, Some("smith".into()));
}

What do you think? I just finished the implementation, and want to play with it a bit more to see how it works inside of ord. I think this gets very close to the convenience of SQL statements, where you just use the tables you want, and the database takes care of opening them for you.

I also wanted to write a function on Database which would take a query and execute it without needing to start a transaction, but I ran into lifetime errors and gave up T_T I'm sure someone clever can come up with a working version tho.

casey commented 3 months ago

Oh yeah, in particular I'm curious about efficiency. A downside of this approach is that the database has to open and close tables for each query, instead of reusing them.

casey commented 3 months ago

One cool thing would be if you could express table definitions fully in the type system, i.e., by using the table name as a const generic, in which case you wouldn't need wrappers, but const generics don't support strings. I think you could do it though if table names were integers.
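To make the integer idea concrete, here is a small sketch of a table handle whose identity lives in a const generic parameter. It is only an illustration under assumptions: redb identifies tables by string name, so the integer-keyed `ReadTransaction` below is a stand-in, and `Table`, `Names`, `Missing` and `demo` are hypothetical.

```rust
use std::collections::HashMap;

// Stand-in for redb::Error (assumption).
#[derive(Debug, PartialEq)]
pub struct DbError(pub String);

// Stand-in read transaction in which tables are looked up by integer id
// instead of by name (assumption; redb itself uses string names).
pub struct ReadTransaction {
    pub tables: HashMap<u32, HashMap<String, String>>,
}

pub trait QueryArg<'a>: Sized {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError>;
}

// A table handle whose identity is carried by the const parameter `ID`,
// so no per-table wrapper struct is needed.
pub struct Table<'a, const ID: u32>(pub &'a HashMap<String, String>);

// One impl covers every table id.
impl<'a, const ID: u32> QueryArg<'a> for Table<'a, ID> {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError> {
        tx.tables
            .get(&ID)
            .map(|rows| Table(rows))
            .ok_or_else(|| DbError(format!("table {} does not exist", ID)))
    }
}

// Declaring a table is now just a type alias.
pub type Names<'a> = Table<'a, 0>;
pub type Missing<'a> = Table<'a, 7>;

// Opens table 0, which exists, and table 7, which does not.
pub fn demo() -> Result<(Option<String>, bool), DbError> {
    let mut rows = HashMap::new();
    rows.insert("james".to_string(), "smith".to_string());
    let mut tables = HashMap::new();
    tables.insert(0, rows);
    let tx = ReadTransaction { tables };

    let names = Names::from_tx(&tx)?;
    let missing_is_err = Missing::from_tx(&tx).is_err();
    Ok((names.0.get("james").cloned(), missing_is_err))
}
```

In a real version the key and value types would presumably also be type parameters of `Table`; they are fixed to `String` here to keep the sketch short.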

cberner commented 3 months ago

> Oh yeah, in particular I'm curious about efficiency. A downside of this approach is that the database has to open and close tables for each query, instead of reusing them.

I haven't benchmarked it, but my guess is this is fine. Opening and closing tables should be cheap, so as long as the queries are non-trivial, the overhead should be small in comparison.

cberner commented 3 months ago

I haven't looked at axum, but this definitely seems like an interesting idea! When I first started redb, I was actually thinking it would be a little more like sqlite than lmdb, but then I quickly gave up on that idea.

One thing that seems a bit off about the proposed Query trait is that run() only takes the transaction as an argument. How would I pass in a value other than "james"? I was expecting it to look a little more like a prepared statement in SQL, where you write the function that executes operations on one or more tables, and then you pass arguments into that function to execute it.

casey commented 3 months ago

I added an example in the tests using a closure, which allows you to pass in outside arguments. I also added an example using a struct and implementing Query manually, which is pretty gross but probably becomes okay once you have enough things that the boilerplate of opening tables is greater than the boilerplate of the trait implementation.
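The closure approach might look roughly like the sketch below. It is not the test from redbql: `DbError` and `ReadTransaction` are stand-ins for the redb types, and `Names`, `lookup` and `demo` are hypothetical. The point is that the blanket impl is bounded on `FnOnce`, so a closure that captures a key from its environment qualifies as a query just like a plain function does.

```rust
use std::collections::HashMap;

// Stand-ins for redb::Error and redb::ReadTransaction (assumptions).
#[derive(Debug, PartialEq)]
pub struct DbError(pub String);

pub struct ReadTransaction {
    pub names: HashMap<String, String>,
}

pub trait QueryArg<'a>: Sized {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError>;
}

pub trait Query<'a, Args> {
    type Output;
    type Error;
    fn run(self, tx: &'a ReadTransaction) -> Result<Self::Output, Self::Error>;
}

// Single-argument blanket impl, mirroring the one in the original post.
impl<'a, F, O, E, T0> Query<'a, (T0,)> for F
where
    F: FnOnce(T0) -> Result<O, E>,
    T0: QueryArg<'a>,
    E: From<DbError>,
{
    type Output = O;
    type Error = E;

    fn run(self, tx: &'a ReadTransaction) -> Result<O, E> {
        let t0 = T0::from_tx(tx)?;
        self(t0)
    }
}

// Hypothetical read-only view of the `names` table.
pub struct Names<'a>(pub &'a HashMap<String, String>);

impl<'a> QueryArg<'a> for Names<'a> {
    fn from_tx(tx: &'a ReadTransaction) -> Result<Self, DbError> {
        Ok(Names(&tx.names))
    }
}

// The closure captures `key`, so the caller chooses the lookup value while
// the table is still opened automatically.
pub fn lookup(tx: &ReadTransaction, key: &str) -> Result<Option<String>, DbError> {
    let query = |names: Names| -> Result<Option<String>, DbError> {
        Ok(names.0.get(key).cloned())
    };
    query.run(tx)
}

pub fn demo() -> Result<(Option<String>, Option<String>), DbError> {
    let mut names = HashMap::new();
    names.insert("james".to_string(), "smith".to_string());
    let tx = ReadTransaction { names };
    Ok((lookup(&tx, "james")?, lookup(&tx, "nobody")?))
}
```

Wrapping the closure in a function such as `lookup` gets fairly close to the prepared-statement shape asked about above: the caller passes in values, and the tables are still opened automatically.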

cberner commented 3 months ago

Ah yes, I see how it works. Are you using it in ord? I'd be curious to see whether the code ends up being cleaner with this approach in practice.

casey commented 3 months ago

I'm not, but I'll try it out and see how it works in practice.

cberner commented 1 week ago

Feel free to re-open if you do further work on this.