Closed 414owen closed 2 years ago
$ ./script/parse-example effects |& head -n 100
Successfully parsed 100.00% of 'effects' in 0.09s (38 of 38 files)
Original time: 1.09s
1.09 / 0.09 = 12.11
Nice work on this.
Having removed the use of closures, would it be straightforward to get this compiling without C++14 features? That would make this parser a bit easier to consume in downstream applications, and develop on a Mac.
awesome!!
Having removed the use of closures, would it be straightforward to get this compiling without C++14 features?
Yes I plan to remove the need for c++14, and maybe even convert the scanner to pure C :)
Oh, that’s interesting. I’d think that C++ would be useful just for std::vector. are you thinking that a dynamically sized array isn’t needed in the scanner?
are you thinking that a dynamically sized array isn’t needed in the scanner?
I think we will need one, but I'm happy to implement my own if it means we can compile this as C.
While we have you here, would we be able to get your opinion on https://github.com/tree-sitter/tree-sitter-haskell/issues/41#issuecomment-1003353274? We think the scanner's probably running over the whole file (or a lot of it) every edit, but we're not sure if it should.
@414owen Amazing work! How far are you from hitting C++03? Once you do, I can check if the wasm can build successfully via the standard pathways now, and perhaps contribute some tests for that from #55.
@414owen Amazing work! How far are you from hitting C++03?
Thanks :smile: . I'm actually pretty close to removing the last closure. I've removed the Parser
type completely, now I just have to give the same treatment to Condition
s and Peek
s, which are way easier.
I'd say I'll be done later today.
@wenkokke I've removed all the closures. It now compiles with -std=c++11
but not -std=c++03
.
I'll see what I can do.
$ ./script/parse-example effects |& head -n 100
Successfully parsed 100.00% of 'effects' in 0.0297s (38 of 38 files)
Original time: 1.09s
1.09 / 0.0297 = 36.7
There might be one or two micro-optimizations left (PEEKing before taking some code paths?) As well as maybe decreasing the necessary re-computations (see https://github.com/tree-sitter/tree-sitter-haskell/issues/41#issuecomment-950424044)
But this will in any case let us edit 36x larger files than before :tada:
@wenkokke I've added c++03 support here: https://github.com/414owen/tree-sitter-haskell/pull/1
With your current faster-scanner branch, you've got the patch to tree-sitter down to:
--- a/lib/binding_web/exports.json
+++ b/lib/binding_web/exports.json
@@ -3,6 +3,9 @@
"_free",
"_malloc",
+ "__ZNSt3__25ctypeIcE2idE",
+ "__ZNSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE25__init_copy_ctor_externalEPKcm",
+
"__ZNKSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE4copyEPcmm",
"__ZNKSt3__220__vector_base_commonILb1EE20__throw_length_errorEv",
"__ZNSt3__212basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE6__initEPKcm",
See https://github.com/wenkokke/tree-sitter-haskell/tree/tree-sitter-haskell-wasm-faster-scanner.
@tek This is ready for review, when you have a chance.
All tests pass. Any further changes will probably be cosmetic (moving things around, rather than changing what they do).
@tek I have an updated version of #55 for when this pull request is merged
awesome!
@luc-tielen I appreciate the comments, but I think a lot of them are out of scope, either because I haven't changed the code from what it was doing before, or because they're microoptimizations that the compiler may or may not already be doing.
I'd be interested to see if there's a performance win by implementing your constexpr
hints. Would you be willing to open a PR for them? Same goes with two peeks here
comments. Not sure if they're already optimized out, but I'd love to see some numbers :)
@414owen That's ok. Your PR was fine, so I could only point out the nitpicky stuff.
@414owen it appears to get stuck in an infinite loop when parsing HLS for me, can you confirm?
@414owen it appears to get stuck in an infinite loop when parsing HLS for me, can you confirm?
I'm building helix atm. I'll run it in gdb and see where :)
Okay I think I need to check for an eof
in inline-comment
. I'll PR in a few mins.
old | new | speedup | |
---|---|---|---|
effects | 1.07s | 0.034s | 31.5x |
postgrest | stackoverflow | 0.083s | ∞x |
ivory | 10.16s | 0.225s | 45.16x |
polysemy | 3.71s | 0.089s | 41.7x |
semantic | 50.62s | 1.089s | 46.5x |
haskell-language-server | 24.28s | 0.504s | 48.2x |
crazy
See part 1 for more info.
Overall speedup: around 45x.