elixir-lang / tree-sitter-elixir

Elixir grammar for tree-sitter
https://elixir-lang.org/tree-sitter-elixir
Apache License 2.0

TS is slow when parsing Elixir files. #13

Closed (qwexvf closed 2 years ago)

qwexvf commented 2 years ago

When I open an Elixir file, I get the message "TS took 356ms to parse document, will be disabled for current file". It happens with both small and large files.

Is this because I'm on an M1 Mac?

jonatanklosko commented 2 years ago

I don't think it's a parsing issue, but rather the performance of the highlighting queries. For me, parsing the whole Elixir repo takes ~2s, so this belongs in nvim-treesitter/nvim-treesitter instead.

@connorlay please correct me if I'm missing something!

connorlay commented 2 years ago

@jonatanklosko I agree, this is an issue with highlighting performance in nvim-treesitter, not an issue with the parser itself.

jonatanklosko commented 2 years ago

@connorlay Thanks for confirming! I know you already optimised the queries for performance, so I'm not sure how much more we can do. If you have any ideas, feel free to open an issue on nvim-treesitter.

connorlay commented 2 years ago

> If you have any ideas feel free to open an issue on nvim-treesitter.

It is worth an investigation at some point. I don't know how to profile neovim internals yet, but I'm sure there are improvements to be made.

Related to this topic: compared to other language parsers I've seen, the Elixir parser (both this one and the previous one) produces a fairly abstract tree of nodes. I wonder if it would make sense to introduce new node types for common scenarios, such as function definitions or module attributes? I know this would deviate from how Elixir actually represents these constructs internally (it's functions and macros all the way down 😉 ), but it would greatly simplify the queries used by neovim and other editor integrations.
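As a rough illustration of the "functions and macros all the way down" point, here is a minimal sketch using Elixir's own quoted AST (an analogy only, not the tree-sitter tree). It uses the standard `Code.string_to_quoted!/1`; the snippet strings and names like `add` and `answer` are made up for the example.

```elixir
# A function definition is represented as a plain call to the `def` macro.
def_ast = Code.string_to_quoted!("def add(a, b), do: a + b")
{:def, _meta, [{:add, _, [_a, _b]}, [do: _body]]} = def_ast

# A module attribute is represented as a call to the unary `@` operator.
attr_ast = Code.string_to_quoted!("@answer 42")
{:@, _, [{:answer, _, [42]}]} = attr_ast

IO.inspect(def_ast, label: "def")
IO.inspect(attr_ast, label: "attribute")
```

Neither shape has a dedicated "function definition" or "attribute" node, which is why a query has to match on the call target name instead.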

jonatanklosko commented 2 years ago

Yeah, that's something I was wondering about initially, but in the end we need to parse any valid syntax (so if someone uses quote around code that is not valid Elixir, but is valid syntax, we still need to parse it). So if we had specific tokens in some scenarios, it would be hard to draw the line between the approaches, and I'm pretty sure it would make the AST really confusing.
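A small sketch of what "valid syntax, but not valid Elixir" can look like, assuming a bare `when` outside of any guard position as the example (my choice, not one taken from the thread). `quote` accepts it because it only needs the code to parse; evaluating it with `Code.eval_quoted/1` is expected to fail at compile time.

```elixir
# `1 when 2` parses fine, so `quote` can hold it as an AST...
ast =
  quote do
    1 when 2
  end

{:when, _meta, [1, 2]} = ast

# ...but it is not meaningful Elixir: there is no `when/2` to call,
# so compiling/evaluating the quoted expression raises.
result =
  try do
    Code.eval_quoted(ast)
    :ok
  rescue
    _error -> :error
  end

IO.inspect(result, label: "eval result")
```

A syntax-only parser has to produce a tree for the quoted body either way, since it cannot know whether a macro will later give it meaning.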

connorlay commented 2 years ago

> so if someone uses quote around code that is not valid Elixir, but is valid syntax, we still need to parse it

Ah good point, so the parser cannot make any assumptions about the semantics of Elixir, just the syntax?

jonatanklosko commented 2 years ago

Yup, I also documented that briefly here.

And even in some particular cases where I was tempted to make the AST more specific, I quickly came up with scenarios where it would break. For example the "when" operator in stab clauses could be a more specific guards node, but due to macros it can really be used anywhere and it's hard to tell, like in match?(x when x == 1, 1) (and it would be hard to unify with defs anyway).
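To make the `when` example concrete, here is a hedged sketch in terms of Elixir's quoted AST (again an analogy for the syntax tree, not tree-sitter output). It checks that `match?/2` accepts a guard, and that `when` shows up as the same binary operator node both inside a stab clause and inside an ordinary call argument. The helper name `check` and the snippet strings are hypothetical.

```elixir
# `when` works as a guard inside match?/2, which is a macro call,
# not a stab clause.
check = fn value -> match?(x when x == 1, value) end
true = check.(1)
false = check.(2)

# In a stab clause, the guard is a `when` operator node on the left of `->`.
stab_ast = Code.string_to_quoted!("fn x when x == 1 -> :one end")
{:fn, _, [{:->, _, [[{:when, _, [_pattern, _guard]}], :one]}]} = stab_ast

# In the match?/2 call, the very same `when` node is just the first argument.
call_ast = Code.string_to_quoted!("match?(x when x == 1, 1)")
{:match?, _, [{:when, _, [_pattern2, _guard2]}, 1]} = call_ast
```

Since both positions yield the same operator shape, a dedicated guards node would need knowledge of which macros treat `when` as a guard.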

connorlay commented 2 years ago

Thanks for the explanation, @jonatanklosko!