Open benfdking opened 3 months ago
It seems that creating a DepthMap is currently quite expensive and is not cached, meaning different rules generate their own DepthMap for the same segments.
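A minimal sketch of what sharing a DepthMap between rules could look like. The `DepthMap` struct, the `DepthMapCache` type, and the `u64` segment id used as the key are all hypothetical stand-ins, not the project's actual types:

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for the real `DepthMap`; only here so the sketch compiles.
#[derive(Debug)]
pub struct DepthMap {
    pub depths: Vec<usize>,
}

/// Hypothetical cache shared by all rules during a single lint pass.
#[derive(Default)]
pub struct DepthMapCache {
    maps: HashMap<u64, Rc<DepthMap>>,
    /// Counts how many maps were actually built (handy for checking cache hits).
    pub builds: usize,
}

impl DepthMapCache {
    /// Returns the cached map for `segment_id`, building it only on the first request.
    pub fn get_or_build(
        &mut self,
        segment_id: u64,
        build: impl FnOnce() -> DepthMap,
    ) -> Rc<DepthMap> {
        if let Some(map) = self.maps.get(&segment_id) {
            return Rc::clone(map);
        }
        self.builds += 1;
        let map = Rc::new(build());
        self.maps.insert(segment_id, Rc::clone(&map));
        map
    }
}
```

With something like this, the second rule asking for the same segment would get a cheap `Rc` clone instead of rebuilding the map. Whether a stable per-segment key is available is an assumption that would need checking.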
Using const fn is unlikely at the moment. We need at least compile-time HashMaps. There is an attempt to implement this in this PR, but it requires unstable features, and I’m not sure how well it actually works. The PR appears to be abandoned.
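For small tables, a plain array plus a `const fn` lookup already works on stable Rust and might cover some cases without a compile-time HashMap. This is only a sketch of that swapped-in technique (a linear scan, not a hash map); the `KEYWORDS` table and the `keyword_id` function are made up for illustration:

```rust
/// Hypothetical keyword table; the entries are illustrative, not taken from any dialect.
const KEYWORDS: [(&str, u8); 3] = [("FROM", 1), ("SELECT", 2), ("WHERE", 3)];

/// Byte-wise equality that is usable in const contexts.
const fn bytes_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut i = 0;
    while i < a.len() {
        if a[i] != b[i] {
            return false;
        }
        i += 1;
    }
    true
}

/// Linear lookup over the table; callable at compile time and at runtime.
const fn keyword_id(word: &str) -> Option<u8> {
    let mut i = 0;
    while i < KEYWORDS.len() {
        if bytes_eq(KEYWORDS[i].0.as_bytes(), word.as_bytes()) {
            return Some(KEYWORDS[i].1);
        }
        i += 1;
    }
    None
}

/// Evaluated during compilation, so no lookup happens at runtime for this constant.
const SELECT_ID: Option<u8> = keyword_id("SELECT");
```

A linear scan like this stops being attractive once tables grow, which is where a real compile-time map or a perfect hash would come in.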
I was also looking at some of the flame graphs, and to me, there may be two potential areas of improvement:
- `descendant_type_set` is a function where a lot of time is spent. Is there a way to speed up `descendant_type_set`? We use `AHashSet`, but since we are dealing with very small sets, I wonder if just doing the comparison directly could be faster?
- `DepthMap` `from_config`: could we "parse" the config into a super clean type first, so that the downstream "re-parsing" that currently happens every time from a slightly more flexible structure becomes more efficient?

Will building with --features=jemalloc improve performance?
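On the small-set question for `descendant_type_set`, the two approaches could be compared with something like the sketch below. It uses `std::collections::HashSet` as a stand-in for `AHashSet`, and both function names are hypothetical. Which one wins for realistic set sizes would have to be measured:

```rust
use std::collections::HashSet;

/// Returns true if the two small slices share at least one element,
/// using a plain nested scan with no hashing and no allocation.
pub fn intersects_linear(a: &[&str], b: &[&str]) -> bool {
    a.iter().any(|x| b.contains(x))
}

/// The hash-based equivalent, for comparison.
pub fn intersects_hashed(a: &HashSet<&str>, b: &HashSet<&str>) -> bool {
    !a.is_disjoint(b)
}
```

For a handful of elements the nested scan does only a few string comparisons, while the hashed version pays for hashing on every probe.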
The `descendant_type_set` result should ideally be cached; at least that's how it's done in the original implementation.
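A rough sketch of per-node caching with `std::cell::OnceCell` (stable since Rust 1.70). The `Segment` struct here is a simplified, hypothetical stand-in and does not mirror the real segment type:

```rust
use std::cell::OnceCell;
use std::collections::HashSet;

/// Hypothetical, simplified segment node.
pub struct Segment {
    pub kind: &'static str,
    pub children: Vec<Segment>,
    /// Filled lazily on the first call to `descendant_type_set`.
    descendant_types: OnceCell<HashSet<&'static str>>,
}

impl Segment {
    pub fn new(kind: &'static str, children: Vec<Segment>) -> Self {
        Self { kind, children, descendant_types: OnceCell::new() }
    }

    /// Computes the set on the first call, then returns the cached value.
    pub fn descendant_type_set(&self) -> &HashSet<&'static str> {
        self.descendant_types.get_or_init(|| {
            let mut set = HashSet::new();
            for child in &self.children {
                set.insert(child.kind);
                // Each child caches its own set, so subtrees are walked only once.
                set.extend(child.descendant_type_set().iter().copied());
            }
            set
        })
    }
}
```

This assumes segments are immutable after construction. If they can be edited in place, the cache would need to be invalidated.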
We use `.to_uppercase` (which allocates a `String`) in some places for case-insensitive comparison. Instead, we could use something like `UniCase<Cow<'static, str>>`. It appears we already use `UniCase` in some parts of the code. Accordingly, we could change the signature of `Matchable::simple` to `Option<(AHashSet<UniCase<Cow<'static, str>>>, AHashSet<&'static str>)>`, for example.
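To illustrate the idea without pulling in a crate, here is a std-only sketch of an allocation-free case-insensitive key. `AsciiCaseless` and `is_keyword` are hypothetical names, and unlike `UniCase` this wrapper only folds ASCII case:

```rust
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical std-only stand-in for a `UniCase`-style key (ASCII folding only).
#[derive(Debug, Clone, Copy)]
pub struct AsciiCaseless<'a>(pub &'a str);

impl PartialEq for AsciiCaseless<'_> {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(other.0)
    }
}

impl Eq for AsciiCaseless<'_> {}

impl Hash for AsciiCaseless<'_> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash the case-folded bytes so that keys that compare equal also hash equally.
        for byte in self.0.bytes() {
            state.write_u8(byte.to_ascii_uppercase());
        }
        // Terminator byte, so that adjacent strings cannot run together.
        state.write_u8(0xff);
    }
}

/// Case-insensitive membership test that never allocates a new `String`.
pub fn is_keyword(keywords: &HashSet<AsciiCaseless<'_>>, raw: &str) -> bool {
    keywords.contains(&AsciiCaseless(raw))
}
```

The lookup borrows the raw text directly, so the per-comparison `String` from `.to_uppercase` disappears.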
Just thought I'd upload the latest flamegraph.
This is just a list of potential optimizations that we think could work.

Maintained list of optimizations
- `const fn` more of the work upfront for building of dialects (we could potentially use perfect hashes)

@gvozdvmozgu feel free to add any comments that you have thought of.