helix-editor / helix

A post-modern modal text editor.
https://helix-editor.com
Mozilla Public License 2.0

support combined injections #1378

Closed the-mikedavis closed 2 years ago

the-mikedavis commented 2 years ago

I think we chatted about this on the matrix a while back. I wanted to make an issue just to make sure it doesn't get lost.

There are a few grammars that would benefit from being able to combine the injections. Mostly it's templating languages like heex (#881) or tree-sitter-embedded-template (erb and some js templates), but also now git-diff (#1373) (which I wrote as a line-based grammar so it would work out-of-the-box without combined injections).

What are combined injections?

From the tree-sitter docs:

> `injection.combined` - indicates that all of the matching nodes in the tree should have their content parsed as one nested document

So you can write a query in a language's `injections.scm` that does a `(#set! injection.combined)`, and all matching nodes will be parsed together. As a practical example, the git-commit grammar would parse a document like this one:

```
foo
abc
# def
ghi
```

like so:

```
(source (subject) (message) (comment) (message))
```

And if you had an `injections.scm` like so:

```
((message) @injection.content
 (#set! injection.combined)
 (#set! injection.language "comment"))
```

Then the `message` nodes would be parsed as if they were one continuous `message` node spanning multiple lines. This ends up being important for template languages which usually have control flow spanning multiple nodes, like so:

```heex
<%= if true do %>
Hello, combined injections!
<% end %>
```

Here the two directive nodes need to be combined for the contained do-end block to be parsed as a pair.
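To make the difference concrete, here is a rough sketch in plain Rust (no tree-sitter dependency) of what the nested parser gets to see with and without `injection.combined`. Everything in it is an assumption for illustration: `NodeRange`, `separate_documents`, and `combined_document` are hypothetical names, and joining the chunks with a newline is a simplification. A real implementation would more likely hand the parser the list of byte ranges over the original buffer than copy text around.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a node matched by an injection query:
/// only the byte range it covers in the host document.
pub type NodeRange = Range<usize>;

/// Without `injection.combined`: every matching node is treated as its own
/// nested document, so each chunk is parsed in isolation.
/// Panics if a range is out of bounds or not on a char boundary.
pub fn separate_documents<'a>(source: &'a str, nodes: &[NodeRange]) -> Vec<&'a str> {
    nodes.iter().map(|r| &source[r.clone()]).collect()
}

/// With `injection.combined`: the content of all matching nodes is treated
/// as one nested document. The host text between the nodes is skipped.
/// Joining with '\n' is a simplification made for this sketch.
pub fn combined_document(source: &str, nodes: &[NodeRange]) -> String {
    nodes
        .iter()
        .map(|r| &source[r.clone()])
        .collect::<Vec<&str>>()
        .join("\n")
}
```

Fed the two directive bodies from the heex example above (`if true do` and `end`), `separate_documents` yields two fragments that cannot be parsed as a complete do-end block on their own, while `combined_document` yields a single document in which the `do` and the `end` can pair up.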

I'm interested in taking a stab at this but I don't really know where to begin. I suspect I probably don't have the rust chops to take this on :P

archseer commented 2 years ago

This stems from a hack in syntax.rs: the code was based on tree-sitter-highlight, which did not do incremental tree parsing, so it was modified to incrementally parse and reuse the root layer, while injections are still parsed on the fly.

HighlightIterLayer::new() contains code that processes the combined injections: https://github.com/helix-editor/helix/blob/a4641a8613bcbe4ad01d28d3d2a6f4509fef96a9/helix-core/src/syntax.rs#L1111-L1157

But this is sidestepped for the root layer: https://github.com/helix-editor/helix/blob/a4641a8613bcbe4ad01d28d3d2a6f4509fef96a9/helix-core/src/syntax.rs#L492-L511

So combined injections only work on nested grammars.

archseer commented 2 years ago

I'm working on a rewrite that will be able to incrementally update all the layers. This should resolve this issue (and #1151)

archseer commented 2 years ago

(Work is ongoing in https://github.com/helix-editor/helix/tree/incremental)

archseer commented 2 years ago

Merged into master in https://github.com/helix-editor/helix/commit/7c9ebd05b83e90e55d032f65d9405ad265b82258 ! Would you be interested in adding https://github.com/tree-sitter/tree-sitter-embedded-template & eex now?

the-mikedavis commented 2 years ago

(H)eex still has some bugs in my local testing (unrelated to combined injections, needs a fix in the elixir grammar) but embedded-template should be good to go. The only weird thing is that it currently has injections for different languages, so the one grammar is used for erb and ejs with different injections queries between them: https://github.com/tree-sitter/tree-sitter-embedded-template/tree/d21df11b0ecc6fd211dbe11278e92ef67bd17e97/queries

Is there a way to specify a config like this in languages.toml? I think jsx and tsx would benefit from something like that too.

the-mikedavis commented 2 years ago

I should be able to add https://github.com/elixir-lang/tree-sitter-iex though, which is a perfect fit for combined injections.

the-mikedavis commented 2 years ago

The combined injections work quite well so I'm gonna close this out. Thanks @archseer!!