BioJulia / Automa.jl

A Julia code generator for regular expressions

Slow compilation speed #111

Closed by jakobnissen 1 year ago

jakobnissen commented 2 years ago

Automa is a compiler that compiles regular expressions into Julia code. While it produces fast code, the compiler itself (i.e. Automa) is slow. Having worked on the code, I can tell you that it's absolutely not optimised for compilation speed. I'm not sure how big of a deal this is. Presumably all the heavy lifting is done in the precompilation stage. Nonetheless, it would be nice to profile codegen from a complex regex to see just where the bottlenecks are.
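A minimal sketch of how such a profile could be taken, using Julia's standard `Profile` library. The regex is a hypothetical stand-in rather than one taken from this issue, and the sketch assumes that `Automa.compile` accepts a regex built with the `re"..."` string macro from `Automa.RegExp`. Because the profiled call is the first one in the session, the samples will also include Julia's own JIT compilation of Automa's functions.

```julia
import Automa
using Automa.RegExp: @re_str
using Profile

# Hypothetical regex, chosen only for illustration.
regex = re"([A-Za-z_][A-Za-z0-9_]*|[0-9]+|,)*"

# Discard samples left over from earlier profiling runs.
Profile.clear()

# Sample the call stack while the regex is compiled to a machine.
@profile Automa.compile(regex)

# Flat listing sorted by sample count; the heaviest functions appear last.
Profile.print(format=:flat, sortedby=:count, mincount=10)
```

For a short-running call the profiler may collect few samples; wrapping the `compile` call in a loop is one way to gather more.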

jakobnissen commented 1 year ago

A quick profile of compiling a moderately complex regex to a machine:

Codegen is more than 10x faster for the same machine, even when using the goto-generator.

That's 97% of total run time. So, optimisation should focus on reduce_nodes and nfa2dfa.

jakobnissen commented 1 year ago

It turns out the large majority was latency. Precompiling made Automa more than 10x faster. I'll close this with v1 and reopen if we need more runtime perf in the future.
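One rough way to separate latency from runtime is to time the same call twice in a fresh session. This is only a sketch: the regex is a hypothetical example, and the numbers depend on the Julia and Automa versions and on whether the package was precompiled.

```julia
import Automa
using Automa.RegExp: @re_str

# Hypothetical regex, chosen only for illustration.
regex = re"([A-Za-z_][A-Za-z0-9_]*|[0-9]+|,)*"

# First call: includes any JIT compilation of Automa's own functions (latency).
first_call = @elapsed Automa.compile(regex)

# Second call: the same work with that code already compiled (runtime only).
second_call = @elapsed Automa.compile(regex)

println("first call:  ", first_call, " s")
println("second call: ", second_call, " s")
```

A large gap between the two timings suggests that most of the cost is latency rather than the regex-to-machine algorithms themselves.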