
quarylabs / sqruff

Fast SQL formatter/linter

https://quary.dev/

Apache License 2.0

284 stars 8 forks

Potential optimizations #327

Open benfdking opened 3 months ago

benfdking commented 3 months ago

This is just a list of potential optimizations that we think could work.

Maintained list of optimizations

[x] We currently use random numbers (UUIDs) for our caching system. My understanding is that generating random numbers can be expensive (at the very least, a UUID is 128 bits). Could we:
- 1. move to 64-bit integers
- 2. use a counter instead of random values, so that IDs are deterministic and we skip the expensive random generation
[ ] Can we use const fn to do more of the dialect-building work upfront? (We could potentially use perfect hashes.)
- Could improve what is packaged
- Could improve startup time
[x] Are there any bits where we still use String and .clone() for convenience?
[x] Caching DepthMap
[x] Caching descendant type set
- [ ] We know we do these lazily/optionally on call; is it worth introducing that branching?
[ ] A surprising amount of time is spent in DepthMap from_config. Could we "parse" the config into a clean, strongly typed structure once, so that downstream code no longer has to "re-parse" a more flexible structure on every call?
[x] Will building with --features=jemalloc improve performance?
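
To make the first item concrete, here is a minimal sketch of counter-based 64-bit IDs. `CacheId` and `NEXT_ID` are hypothetical names for illustration, not types from the sqruff codebase, and the sketch assumes the cache only needs keys that are unique within a single process.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical 64-bit cache key; a stand-in for the UUID mentioned above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CacheId(u64);

/// Process-wide counter. It starts at 1 so that 0 stays free as a sentinel.
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

impl CacheId {
    /// Returns a fresh ID. `Relaxed` is enough here because only uniqueness
    /// matters, not ordering relative to other memory operations.
    pub fn next() -> Self {
        CacheId(NEXT_ID.fetch_add(1, Ordering::Relaxed))
    }

    /// Exposes the raw value, e.g. for logging.
    pub fn as_u64(self) -> u64 {
        self.0
    }
}
```

On a single thread, the sequence of IDs is fully deterministic from run to run. With several threads the IDs stay unique, but which thread gets which value depends on scheduling.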

@gvozdvmozgu feel free to add any comments that you have thought of

gvozdvmozgu commented 3 months ago

It seems that creating a DepthMap is currently quite expensive and is not cached, meaning different rules generate their own DepthMap for the same segments.
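
As a rough illustration of sharing one DepthMap between rules, the sketch below caches maps by segment ID. The `DepthMap` struct here is only a placeholder, and `DepthMapCache`, `get_or_build`, and the `u64` segment key are assumptions made for the example, not the actual sqruff API.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Placeholder for the real `DepthMap`; only the caching pattern matters here.
#[derive(Debug)]
pub struct DepthMap {
    pub depth: usize,
}

/// Hypothetical cache keyed by a segment ID and shared by all rules in one pass.
#[derive(Default)]
pub struct DepthMapCache {
    maps: HashMap<u64, Rc<DepthMap>>,
    /// Counts how often the expensive build actually ran.
    pub builds: usize,
}

impl DepthMapCache {
    /// Returns the cached map for `segment_id`, building it on the first request.
    pub fn get_or_build(
        &mut self,
        segment_id: u64,
        build: impl FnOnce() -> DepthMap,
    ) -> Rc<DepthMap> {
        if let Some(map) = self.maps.get(&segment_id) {
            return Rc::clone(map);
        }
        self.builds += 1;
        let map = Rc::new(build());
        self.maps.insert(segment_id, Rc::clone(&map));
        map
    }
}
```

With something like this, the second rule that asks for the same segment gets a cheap `Rc` clone instead of rebuilding the map.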

gvozdvmozgu commented 3 months ago

Using const fn is unlikely at the moment. We need at least compile-time HashMaps. There is an attempt to implement this in this PR, but it requires unstable features, and I’m not sure how well it actually works. The PR appears to be abandoned.
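
Until compile-time HashMaps are usable on stable Rust, one possible stand-in (a different technique, not what the PR mentioned above does) is a sorted static table searched with binary search. The table lives in the binary's read-only data, so nothing is built at startup. The `KEYWORDS` entries and `keyword_id` below are made up for illustration.

```rust
/// Hypothetical keyword table. Entries MUST stay sorted by key,
/// otherwise the binary search below returns wrong results.
static KEYWORDS: &[(&str, u8)] = &[
    ("FROM", 1),
    ("GROUP", 2),
    ("SELECT", 3),
    ("WHERE", 4),
];

/// Looks up `word` in O(log n) without constructing any map at runtime.
pub fn keyword_id(word: &str) -> Option<u8> {
    KEYWORDS
        .binary_search_by(|&(key, _)| key.cmp(word))
        .ok()
        .map(|idx| KEYWORDS[idx].1)
}
```

The trade-off is that lookups are logarithmic rather than constant time, and the sort order has to be guarded by a test.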

benfdking commented 3 months ago

I was also looking at some of the flame graphs, and to me, there may be two potential areas of improvement:

- A lot of time is spent in descendant_type_set. Is there a way to speed it up? We use AHashSet, but since we are dealing with very small sets, I wonder if a plain linear comparison could be faster.
- A surprising amount of time is spent in DepthMap from_config. Could we "parse" the config into a clean, strongly typed structure once, so that downstream code no longer has to "re-parse" a more flexible structure on every call?
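
For the from_config point, the idea could look roughly like the sketch below: validate the flexible config once and hand a small typed struct to everything downstream. `RawConfig`, `IndentConfig`, the two keys, and the defaults are all hypothetical and chosen only to show the pattern.

```rust
use std::collections::HashMap;

/// Flexible, stringly typed configuration, as it might arrive from a file.
pub type RawConfig = HashMap<String, String>;

/// Hypothetical typed view, built once and then passed to every consumer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndentConfig {
    pub tab_space_size: usize,
    pub indented_joins: bool,
}

impl IndentConfig {
    /// Parses and validates the raw values a single time, falling back to
    /// illustrative defaults when a key is absent.
    pub fn parse(raw: &RawConfig) -> Result<Self, String> {
        let tab_space_size = match raw.get("tab_space_size") {
            Some(v) => v
                .parse::<usize>()
                .map_err(|e| format!("tab_space_size: {}", e))?,
            None => 4,
        };
        let indented_joins = match raw.get("indented_joins") {
            Some(v) => v
                .parse::<bool>()
                .map_err(|e| format!("indented_joins: {}", e))?,
            None => false,
        };
        Ok(Self { tab_space_size, indented_joins })
    }
}
```

Downstream code then reads plain fields, and malformed values are reported once at load time rather than on every call.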

gvozdvmozgu commented 3 months ago

Will building with --features=jemalloc improve performance?

gvozdvmozgu commented 3 months ago

The descendant_type_set should ideally be cached; at least, that's how it's done in the original implementation.
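
A minimal sketch of what that caching might look like, using `std::cell::OnceCell` (stable since Rust 1.70). The `Segment` type is a simplified stand-in, and std's `HashSet` replaces `AHashSet` to keep the example dependency-free. It assumes the tree is not mutated after the set has been computed.

```rust
use std::cell::OnceCell;
use std::collections::HashSet;

/// Simplified stand-in for a parse-tree segment.
pub struct Segment {
    pub kind: &'static str,
    pub children: Vec<Segment>,
    /// Filled lazily on the first call to `descendant_type_set`.
    descendant_types: OnceCell<HashSet<&'static str>>,
}

impl Segment {
    pub fn new(kind: &'static str, children: Vec<Segment>) -> Self {
        Self { kind, children, descendant_types: OnceCell::new() }
    }

    /// Computes the set on the first call and returns the cached value afterwards.
    pub fn descendant_type_set(&self) -> &HashSet<&'static str> {
        self.descendant_types.get_or_init(|| {
            let mut set = HashSet::new();
            for child in &self.children {
                set.insert(child.kind);
                // Each child caches its own set, so subtrees are walked only once.
                set.extend(child.descendant_type_set().iter().copied());
            }
            set
        })
    }
}
```

`OnceCell` is not `Sync`; if segments were shared across threads, `std::sync::OnceLock` would be the analogous choice.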

gvozdvmozgu commented 3 months ago

We use .to_uppercase (which allocates a String) in some places for case-insensitive comparison. Instead, we could use something like UniCase<Cow<'static, str>>. It appears we already use UniCase in some parts of the code. Accordingly, we could change the signature of Matchable::simple to, for example, Option<(AHashSet<UniCase<Cow<'static, str>>>, AHashSet<&'static str>)>.
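
To show the idea without pulling in a crate, here is a std-only sketch of a case-insensitive key that never allocates. `AsciiCaseless` and `is_keyword` are hypothetical names, and unlike UniCase this stand-in only folds ASCII letters.

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Minimal ASCII-only stand-in for `UniCase`: compares and hashes without
/// regard to ASCII case, and never allocates.
#[derive(Debug, Clone, Copy)]
pub struct AsciiCaseless<'a>(pub &'a str);

impl PartialEq for AsciiCaseless<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}

impl Eq for AsciiCaseless<'_> {}

impl Hash for AsciiCaseless<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the case-folded bytes so that equal keys produce equal hashes.
        for byte in self.0.bytes() {
            state.write_u8(byte.to_ascii_lowercase());
        }
        // Terminator byte, so the end of the key is part of the hash input.
        state.write_u8(0xff);
    }
}

/// Checks membership without building an uppercased `String` for `word`.
pub fn is_keyword<'a>(keywords: &HashSet<AsciiCaseless<'a>>, word: &'a str) -> bool {
    keywords.contains(&AsciiCaseless(word))
}
```

The lookup wraps the borrowed `&str` directly, so the per-comparison allocation from `.to_uppercase` disappears.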

benfdking commented 3 months ago

Just thought I'd upload the latest flamegraph.

[Image: flamegraph]