scylladb / scylla-rust-driver

Async CQL driver for Rust, optimized for ScyllaDB!
Apache License 2.0

Optimize CachingSession #477

Open wyfo opened 2 years ago

wyfo commented 2 years ago

CachingSession is currently suboptimal, especially when using it with &'static str (which I suppose is the main use case), mostly for two reasons:

I've implemented an optimization to solve this issue, in which I've made the following modifications:

I've done some benchmarks (using my old i7 4650U MacBook Air), in which I've measured only the retrieval of a single prepared statement (no serialization or database access) from a &'static str. Here are the results:

| query string size | current | optimized |
|---|---|---|
| 20 | 390 ns | 205 ns |
| 100 | 450 ns | 250 ns |

I'll let you judge whether the optimization is worth it. The implementation is just a draft and may be adapted (I can open a PR).
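To make the idea concrete, here is a minimal, self-contained sketch of the kind of lookup path this targets: a cache keyed by the query text that stores `Arc<PreparedStatement>`, so a hit from a `&str` (including a `&'static str`) needs no allocation and returns a cheap refcount clone. The type and method names are hypothetical stand-ins, not the driver's actual API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-in for the driver's PreparedStatement; the real type is
// much larger and itself holds several Arc<dyn _> fields.
struct PreparedStatement {
    id: Vec<u8>,
}

// Minimal cache sketch keyed by the query text. Because HashMap<String, _>::get
// accepts a &str, a lookup from a &'static str allocates nothing; the owned
// String key is only built on insert.
struct StatementCache {
    cache: HashMap<String, Arc<PreparedStatement>>,
}

impl StatementCache {
    fn get_or_prepare(&mut self, query: &str) -> Arc<PreparedStatement> {
        if let Some(stmt) = self.cache.get(query) {
            // Cache hit: cloning an Arc is a refcount bump, not a deep copy.
            return Arc::clone(stmt);
        }
        // Cache miss: only now do we allocate the owned key and "prepare".
        let stmt = Arc::new(PreparedStatement { id: query.as_bytes().to_vec() });
        self.cache.insert(query.to_owned(), Arc::clone(&stmt));
        stmt
    }
}

fn main() {
    let mut cache = StatementCache { cache: HashMap::new() };
    let q: &'static str = "SELECT * FROM ks.t WHERE pk = ?";
    let first = cache.get_or_prepare(q);  // allocates once, on insert
    let second = cache.get_or_prepare(q); // no allocation on the hit path
    assert!(Arc::ptr_eq(&first, &second));
}
```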

wyfo commented 2 years ago

Kind of a duplicate of #475, but the implementations are quite different.

colin-grapl commented 2 years ago

Hey, I was about to open an issue around the fact that &'static str is the default case in most instances, or at least it should be (if you're building queries with format strings you're potentially opening yourself up to injections), and therefore Query should accept a &str. This is particularly nice if you're calling the same query many times but with different parameters.

I also agree that the cache should hold Arc<PreparedStatement> to make it cheaper to clone.

I have a couple of notes on your code: https://github.com/wyfo/scylla-rust-driver/blob/optimize_caching_session/scylla/src/transport/session.rs#L595

prepare is a bit unfortunate in that it takes impl Into<Query> but never actually needs ownership of the Query. That's because Connection::send_request borrows the request, so at no point is an owned buffer actually necessary.

The main benefit of taking impl Into<Query> is that it lets you provide static strings, but it means that every call through impl Into<Query> forces an allocation. That's still happening with the approach you've taken.

This is why I went down the route of making impl Into<Query> cheap. Unfortunately, I've lost that code. Whoops.
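As a rough illustration of what "making impl Into<Query> cheap" could look like, here is a sketch of a query type backed by Cow<'static, str>, so that converting from a &'static str stores the borrow instead of allocating, while owned Strings are still moved in. The names (CheapQuery, send) are hypothetical and do not reflect the driver's real Query type.

```rust
use std::borrow::Cow;

// Hypothetical query type; the driver's real Query owns a String today.
struct CheapQuery {
    contents: Cow<'static, str>,
}

impl From<&'static str> for CheapQuery {
    fn from(s: &'static str) -> Self {
        // No allocation: the static string is borrowed for the program's lifetime.
        CheapQuery { contents: Cow::Borrowed(s) }
    }
}

impl From<String> for CheapQuery {
    fn from(s: String) -> Self {
        // Owned strings are still accepted, moved rather than copied.
        CheapQuery { contents: Cow::Owned(s) }
    }
}

fn send<Q: Into<CheapQuery>>(query: Q) {
    let q = query.into();
    // The connection layer only ever needs a borrow of the text, mirroring how
    // Connection::send_request borrows the request.
    let _text: &str = &q.contents;
}

fn main() {
    send("SELECT * FROM ks.t");                 // &'static str, no allocation
    send(format!("SELECT * FROM ks.{}", "t"));  // owned String, moved in
}
```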

I'd like to hear from the Scylla devs on this, if there's any interest or not. I think there are a few things to do here:

  1. Moving PreparedStatement into Arc should likely be its own PR; it should be a very straightforward win. There is a clone of PreparedStatement for every add_prepared_statement call, whether it was previously prepared or not. PreparedStatement is a very large structure that itself has many pointers, including Arc<dyn _> pointers, so this should be a strict win.

  2. Determining what amount of API breakage, if any, is acceptable. In the approach I took I had a Cow<'a, String> in order to minimize API breakage. I think optimizing for the &'static or &'a cases makes the most sense, for two reasons (see the sketch after this list):

    • Most queries should be static strings, or otherwise references.
    • At no point does Scylla's driver require owning the bytes. This makes sense when you consider that the query is going to be compressed into a separate buffer, meaning the driver never needs to manage the lifetime internally.
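Here is a rough sketch of the lifetime-parameterized, Cow-based shape described in point 2, written with Cow<'a, str> for simplicity: the driver borrows the query text only long enough to encode it into the outgoing frame, so callers can pass any &'a str without the driver ever taking ownership. The names (BorrowedQuery, encode_into) are hypothetical, not the driver's actual types.

```rust
use std::borrow::Cow;

// Hypothetical lifetime-parameterized query in the spirit of the Cow-based
// approach above; the names do not match the driver's real types.
struct BorrowedQuery<'a> {
    contents: Cow<'a, str>,
}

impl<'a> From<&'a str> for BorrowedQuery<'a> {
    fn from(s: &'a str) -> Self {
        // Any &'a str (not only &'static str) can be borrowed for the call.
        BorrowedQuery { contents: Cow::Borrowed(s) }
    }
}

impl From<String> for BorrowedQuery<'static> {
    fn from(s: String) -> Self {
        BorrowedQuery { contents: Cow::Owned(s) }
    }
}

// The driver never needs to own the bytes: the query text is borrowed just
// long enough to be written (and possibly compressed) into the frame buffer.
fn encode_into(frame: &mut Vec<u8>, query: BorrowedQuery<'_>) {
    frame.extend_from_slice(query.contents.as_bytes());
}

fn main() {
    let dynamic = String::from("SELECT * FROM ks.t WHERE pk = ?");
    let mut frame = Vec::new();
    encode_into(&mut frame, BorrowedQuery::from(dynamic.as_str()));          // borrowed
    encode_into(&mut frame, BorrowedQuery::from(String::from("SELECT 1")));  // owned
}
```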

The changes to avoid those clones and copies would be:

I took a quick pass at redoing the work. https://github.com/scylladb/scylla-rust-driver/compare/main...colin-grapl:scylla-rust-driver:remove-allocations

This is a breaking API change. In particular, if you were calling .into() on queries before passing them into execute, you now need to stop doing that. I have also added lifetimes to a number of structs.