Closed jonatanklosko closed 2 years ago
I wonder if it makes sense to do this with injections instead? I.e. having a separate tree-sitter-iex
grammar that does a combined-injection of all the code on the right-hand-side of the >
as elixir?
I imagine using injections would make it a little less glitchy when there are big nested blocks as you say, but it already seems to work pretty well so idk if it's worth the overhead of maintaining separate grammars
As far as I understand the injection would need a single specific node, while the iex>
|...>
prefix nodes are interspersed with the elixir code. So we would have a number of nodes with Elixir fragments, but those fragments would only make sense parsed together. Let me know if I'm missing some other way of doing this :)
I think you can use combined injections to parse all nodes together
injection.combined
- indicates that all of the matching nodes in the tree should have their content parsed as one nested document.
(from the docs here)
This was something I was attempting to use with the heex grammar (see https://github.com/helix-editor/helix/pull/881, although the editor does not yet support combined injections), so if you had a snippet like
<main class="container">
<%= if true do %>
<p>Hello, tree-sitter!</p>
<% end %>
</main>
you could parse the nodes on lines 2 and 4 together. I think this also works with erb templates (for example the do
and end
pair here: https://github.com/joruri/zplugin4-content-docin/blob/18154c650033688d423391dc50a68d9f8c621f77/app/views/zplugin/content/docin/admin/front/index.html.erb#L13-L20)
edit: that heex pr has the combined injection query commented out here: https://github.com/helix-editor/helix/pull/881/files#diff-dfd9ae7d40dd28d7f3fe8d34cb496744db11e59a89c4a1a51b1e4e1e48d58f32
Ohh awesome, that's totally how embeddings work too. Sounds like the right solution then! @josevalim so I was wrong, we can actually do it :D
One thing to note is that we will need to annotate markdown snippets with iex
and not ex
/elixir
. This is a good thing, except that existing snippets won't automatically get the updated highlighting.
I think that’s fine. I believe makeup for Elixir handles both elixir
and iex
right?
@josevalim correct!
I tried out the combined-injections approach and it works pretty well!
And the grammar comes out to be super small, only 28 lines for now: https://github.com/the-mikedavis/tree-sitter-iex/blob/6fef34bc1d530de496bb527876bcc8d5ed997aab/grammar.js
The highlights above are in the helix editor but the highlights and injections queries in the repo work with tree-sitter highlight
too (and so I would assume with GitHub)
@the-mikedavis thanks for pushing this forward! Closing in favour of the new grammar :D
@jonatanklosko should tree-sitter-iex be moved under elixir-lang? I think it would make sense to have tree-sitter-{elixir,iex,eex} all under elixir-lang and tree-sitter-{heex,surface} under phoenixframework, but maybe there should be an elixir-trees organization instead since there are so many? sorry I think I'm rehashing some of the discussion in #2
@the-mikedavis moving sounds good to me! Either grouping works, I'll defer to @josevalim :)
Maybe it should be part of this repository instead?
I'd keep the repositories separate. This aligns with what tree-sitter does and I think editor integrations point to GitHub repos and they may not necessarily support sub-directories (I may be wrong here).
I've seen some other tree-sitters that have multiple grammars in one repo (like the ocaml grammar, see ocaml/ and interface/), so that's an option, but I think it's usually only done when the grammar.js files depend on one another and/or the grammars want to share highlight/injection queries
Plus as @jonatanklosko says, it's less straightforward to integrate with editors (although I think possible, like ocaml in helix)
Alright, let's move it in then. Can you please add me as admin to the project? Then I think can move it. :) Or try to transfer it to me or to the org. Thanks!
hmm I'm not actually sure how to elevate you to admin? I know where the buttons are for organization stuff but it seems to be different for personal repositories :thinking:
I think I can transfer ownership to you and then you can transfer it to elixir-lang? Shall we try that?
@the-mikedavis let's try that, yes! and thank you!
ok it's sent! thanks :rocket:
@the-mikedavis merged and I have invited you to the tree-sitter team! :heart: :100: However, if you prefer to not be involved, let me know or feel free to remove yourself. Thank you!
thanks @josevalim, I'm honored, and I'd love to be involved!
Code with IEx snippets is usually not a valid Elixir syntax (as soon as we introduce blocks), however tree-sitter parses most of code just fine. Consequently, with a few rules we can significantly improve highlighting.
IEx snippets are mostly used in Markdown like this:
Before
After
As seen on the last snippet, more complex (especially nested) code still has glitches, but that's expected given the syntax is invalid.
Currently the syntaxes above are invalid because
end
is parsed as identifier and the actual block end is missing. Once https://github.com/tree-sitter/tree-sitter/pull/246 is in place, we would be able to make sure reserved keywords are never parsed as identifiers, which would most likely make the blocks parse fine (the parsing error would be in a different node), and consequently the highlighting would also work for nested blocks.@patrickt it would be neat if there was a token type that gets CSS
user-select: none;
, so that the code can be copied without the prefixes, like here. Ruby has similar prefixes, so maybe it could be used there too. That said, this may be out of the scope given it is based on invalid syntax in the first place, but just throwing an idea :)cc @maxbrunsfeld in case you have any thoughts regarding this use case