elixir-lang / tree-sitter-elixir

Elixir grammar for tree-sitter
https://elixir-lang.org/tree-sitter-elixir
Apache License 2.0
245 stars 24 forks source link

Improve highlighting in basic IEx snippets #17

Closed jonatanklosko closed 2 years ago

jonatanklosko commented 2 years ago

Code with IEx snippets is usually not a valid Elixir syntax (as soon as we introduce blocks), however tree-sitter parses most of code just fine. Consequently, with a few rules we can significantly improve highlighting.

IEx snippets are mostly used in Markdown like this:

iex(1)> if true do
...(1)>   1
...(1)> end

Before

image

image

image

After

image

image

image

As seen on the last snippet, more complex (especially nested) code still has glitches, but that's expected given the syntax is invalid.

Currently the syntaxes above are invalid because end is parsed as identifier and the actual block end is missing. Once https://github.com/tree-sitter/tree-sitter/pull/246 is in place, we would be able to make sure reserved keywords are never parsed as identifiers, which would most likely make the blocks parse fine (the parsing error would be in a different node), and consequently the highlighting would also work for nested blocks.


@patrickt it would be neat if there was a token type that gets CSS user-select: none;, so that the code can be copied without the prefixes, like here. Ruby has similar prefixes, so maybe it could be used there too. That said, this may be out of the scope given it is based on invalid syntax in the first place, but just throwing an idea :)

cc @maxbrunsfeld in case you have any thoughts regarding this use case

the-mikedavis commented 2 years ago

I wonder if it makes sense to do this with injections instead? I.e. having a separate tree-sitter-iex grammar that does a combined-injection of all the code on the right-hand-side of the > as elixir?

I imagine using injections would make it a little less glitchy when there are big nested blocks as you say, but it already seems to work pretty well so idk if it's worth the overhead of maintaining separate grammars

jonatanklosko commented 2 years ago

As far as I understand the injection would need a single specific node, while the iex>|...> prefix nodes are interspersed with the elixir code. So we would have a number of nodes with Elixir fragments, but those fragments would only make sense parsed together. Let me know if I'm missing some other way of doing this :)

the-mikedavis commented 2 years ago

I think you can use combined injections to parse all nodes together

injection.combined - indicates that all of the matching nodes in the tree should have their content parsed as one nested document.

(from the docs here)

This was something I was attempting to use with the heex grammar (see https://github.com/helix-editor/helix/pull/881, although the editor does not yet support combined injections), so if you had a snippet like

<main class="container">
  <%= if true do %>
    <p>Hello, tree-sitter!</p>
  <% end %>
</main>

you could parse the nodes on lines 2 and 4 together. I think this also works with erb templates (for example the do and end pair here: https://github.com/joruri/zplugin4-content-docin/blob/18154c650033688d423391dc50a68d9f8c621f77/app/views/zplugin/content/docin/admin/front/index.html.erb#L13-L20)

edit: that heex pr has the combined injection query commented out here: https://github.com/helix-editor/helix/pull/881/files#diff-dfd9ae7d40dd28d7f3fe8d34cb496744db11e59a89c4a1a51b1e4e1e48d58f32

jonatanklosko commented 2 years ago

Ohh awesome, that's totally how embeddings work too. Sounds like the right solution then! @josevalim so I was wrong, we can actually do it :D

jonatanklosko commented 2 years ago

One thing to note is that we will need to annotate markdown snippets with iex and not ex/elixir. This is a good thing, except that existing snippets won't automatically get the updated highlighting.

josevalim commented 2 years ago

I think that’s fine. I believe makeup for Elixir handles both elixir and iex right?

jonatanklosko commented 2 years ago

@josevalim correct!

the-mikedavis commented 2 years ago

I tried out the combined-injections approach and it works pretty well!

And the grammar comes out to be super small, only 28 lines for now: https://github.com/the-mikedavis/tree-sitter-iex/blob/6fef34bc1d530de496bb527876bcc8d5ed997aab/grammar.js

The highlights above are in the helix editor but the highlights and injections queries in the repo work with tree-sitter highlight too (and so I would assume with GitHub)

jonatanklosko commented 2 years ago

@the-mikedavis thanks for pushing this forward! Closing in favour of the new grammar :D

the-mikedavis commented 2 years ago

@jonatanklosko should tree-sitter-iex be moved under elixir-lang? I think it would make sense to have tree-sitter-{elixir,iex,eex} all under elixir-lang and tree-sitter-{heex,surface} under phoenixframework, but maybe there should be an elixir-trees organization instead since there are so many? sorry I think I'm rehashing some of the discussion in #2

jonatanklosko commented 2 years ago

@the-mikedavis moving sounds good to me! Either grouping works, I'll defer to @josevalim :)

josevalim commented 2 years ago

Maybe it should be part of this repository instead?

jonatanklosko commented 2 years ago

I'd keep the repositories separate. This aligns with what tree-sitter does and I think editor integrations point to GitHub repos and they may not necessarily support sub-directories (I may be wrong here).

the-mikedavis commented 2 years ago

I've seen some other tree-sitters that have multiple grammars in one repo (like the ocaml grammar, see ocaml/ and interface/), so that's an option, but I think it's usually only done when the grammar.js files depend on one another and/or the grammars want to share highlight/injection queries

Plus as @jonatanklosko says, it's less straightforward to integrate with editors (although I think possible, like ocaml in helix)

josevalim commented 2 years ago

Alright, let's move it in then. Can you please add me as admin to the project? Then I think can move it. :) Or try to transfer it to me or to the org. Thanks!

the-mikedavis commented 2 years ago

hmm I'm not actually sure how to elevate you to admin? I know where the buttons are for organization stuff but it seems to be different for personal repositories :thinking:

the-mikedavis commented 2 years ago

I think I can transfer ownership to you and then you can transfer it to elixir-lang? Shall we try that?

josevalim commented 2 years ago

@the-mikedavis let's try that, yes! and thank you!

the-mikedavis commented 2 years ago

ok it's sent! thanks :rocket:

josevalim commented 2 years ago

@the-mikedavis merged and I have invited you to the tree-sitter team! :heart: :100: However, if you prefer to not be involved, let me know or feel free to remove yourself. Thank you!

the-mikedavis commented 2 years ago

thanks @josevalim, I'm honored, and I'd love to be involved!