Open nhenezi opened 2 years ago
Hey @nhenezi, thanks for digging so deep into it!
> I'm suspecting that `.as_params()` is the bottleneck, as results get worse the more columns I add to the performance test, and `.as_params()` has to iterate over all values and convert them to something that SQLite understands.
You are probably right. The conversion comes with overhead. Also, I suspect the wrapper structs `RusqliteValues` and `RusqliteValue` could add overhead as well.
Btw... are you building something that requires high performance?
For future visitors to this issue: the approach I took is to use `sea_query` to define all meta information (tables, columns, constraints, etc.). We use `sea_query::Query` to construct the actual query, but supply only one row to `.values_panic` so that we have a properly formed statement. We then build that with `SqliteQueryBuilder` to get a prepared statement that can be used normally with `rusqlite`, and plug existing `rusqlite` values into it. That removes the overhead of using `sea_query::Value`, with the drawback that you cannot easily swap in a different database, but it achieves performance comparable to raw `rusqlite`, i.e. plain string SQL. For our use case that's an acceptable tradeoff, given how much nicer the table definitions look and the ability to use query builders. There shouldn't be any problems on the retrieval side, as conversions like this shouldn't be a bottleneck.
Yeah, high performance is a requirement. We have built a system that achieves that using raw SQLite, because at the time of writing there was nothing that could support the dynamic nature of the data we have to deal with. `sea_query` is exactly what we have been looking for, and we are now rewriting in `sea_query` to get rid of the raw SQL. Quite an enjoyable experience so far, and hopefully soon I'll have some time to tackle features we are missing from `sea_query` (views, FTS). Thank you for tackling dynamic query building in Rust!
Thanks for the adoption! Looking forward to your PR hahaa :)
Just a sanity check: are you using Rusqlite and testing with `--release`?
There are some basic benchmarks in https://github.com/SeaQL/sea-query/tree/master/benches; maybe you can add more benchmarks and use them to dissect the overhead. Although if you wanted zero runtime overhead, you should consider using sea-query at compile time (inside your own macro crate), splitting out a static string and having it prepared only once.
I strongly believe that `as_params` is a pure type-system exercise that LLVM should be able to optimize away. If that's not the case, then we could probably use `ParamsFromIter` so that the `Vec` allocation does not actually occur.
I'm trying to get `sea_query` to be as performant as a native SQLite prepared statement as possible. I've only tested `TEXT` columns, and here's the fastest I've managed to do it: I use the fact that I know all values have the same structure, so I use only the first row to construct a prepared statement that is reused during the insertion. That seemed to produce the equivalent of the fastest SQLite insertion, and consequently the fastest `sea_query` insertion. It's still 25% slower, which I would like to bring down to 5-10% if possible. A few different approaches and a more extensive set of experiments can be found at https://gitlab.com/nhenezi/rust-sea-query-perf-test/-/blob/master/src/main.rs.
I'm suspecting that `.as_params()` is the bottleneck, as results get worse the more columns I add to the performance test, and `.as_params()` has to iterate over all values and convert them to something that SQLite understands.

I see two approaches that I can take here, but neither one is doable in my case:

- don't use `.as_params()`, but rather convert every `Value` individually to its SQLite representation in a parallelized manner

Do you see any other way to increase performance here? Am I missing something that could shave some time off?