Open AlbertMarashi opened 2 months ago
I'll keep typing my way through this until I figure out a better idea, because I don't even like any of the options I proposed. Option 1 has the best chance of shipping soon, but it couples things that shouldn't be coupled.
Things that would ideally be highlighted by the editor's separate TS/JS Tree-sitter parsers:
SCRIPT
tagsThings that would be highlighted by coupled, special-purpose TS/JS rules pulled into tree-sitter-svelte
:
<script generic="T extends string[]">
{#snippet foo({ bar, baz })}
#if
, :else if
, etc.Things that are in between, but are almost certainly better off being highlighted by the editor’s separate TS/JS parsers:
@render
, @each
, @await
The painful part here is:
tree-sitter-javascript
and tree-sitter-typescript
that may not agree with what the editor has installed…Let's take the snippet example:
{#snippet foo({ bar, baz })}
{/snippet}
How much of this should be owned by a JavaScript/TypeScript injection?
One option is foo({ bar, baz })
. In isolation, this gets interpreted as a function call with two shorthand properties. But that's wrong; what we want is to treat it like the foo({ bar, baz })
in…
function foo({ bar, baz }) { /* ... */ }
…in almost every way, since it's equivalent to a function definition. We want foo
to be recognized as a major document symbol and bar
and baz
to be treated like function parameter definitions. But there's no good way to do this with injections — unless you adopt option 2 from the ticket description, but that's still a major headache.
Another option is what this grammar currently does: it parses as much as can be inferred by tree-sitter-svelte
without it having to become a parser of arbitrary JS. It parses the foo
, "("
, and ")"
as their own nodes, and leaves only the { bar, baz }
as svelte_raw_text
to be parsed by the injection. (That example would be trivial to parse on its own, but it's better to leave it to a JS parser, since each parameter can have an arbitrary default value.)
This results in worse injection parsing. That's because the bare code { foo, bar }
in JS is not treated like an object literal — top-level braces are interpreted as statement blocks. JS takes this to mean that you are referencing two variables and doing nothing with them. If these parameters have default values, they get highlighted a bit better, but still not quite accurately.
The upside is that more of the structure of the snippet can be seen by tree-sitter-svelte
. I went with this option because it's the only practical way to get snippet names to be recognized as document symbols. I don't think this is implemented in Zed yet, but in tags.scm
this is expressed as…
(snippet_statement
(snippet_start
(snippet_name) @name)
) @definition.snippet
…and allows foo
to show up in a symbol list in Pulsar:
But you're not obligated to inject only into svelte_raw_text
! A third option is to keep the structure of option 2, but write the injection in such a way that it also includes the (
and )
adjacent to the svelte_raw_text
. This at least makes the JS recognize bar
and baz
as shorthand object literal properties. (Still wrong, but less so.)
This is such an awkward problem because it breaks some assumptions of injections. A typical injection is saying, “hey, come parse this thing, because you know how to parse it and I don't.” All you know about the thing is that it's Language X, and you don't know how to parse it.
This is us saying “hey, come parse this thing, but parse it as though you're in the following situation…” — and any way of describing that situation involves knowing implementation details of the parser. My ideal feature here would be a way of desginating an arbitrary “entrance rule” for an injection — in this case, it'd be formal_parameters
instead of program
— but now there's a translucent window in what used to be a black box.
I'm honestly curious about how the Svelte folks want editors to solve this. I hope they've thought about it at least a little bit.
You're right, this isn't a simple issue to solve, even if we go with option 1. we will still need to replicate the typescript highlights.scm
, which I'd ideally like to avoid for the reasons related to maintinance overhead.
Perhaps my issue in tree-sitter may address it
I said it before, I'll say it again: Tree-sitter is simply a poor fit for templating languages, and you'll be fighting the tool all the way. Since you're coming from Zed, you may have Max's ear so you could try to explain (very generally!) what is lacking from tree-sitter; maybe he'll see a way around it.
(For the record, 2. seems to be the most popular option -- see glimmer-javascript etc.)
@savetheclocktower just linking this issue comment here
Currently, typescript thinks that this is a function call expression, so the child
is being highlighted as a function.
https://github.com/zed-industries/zed/issues/18033#issuecomment-2359383981
So if i get this correctly, there's no way to make a query for functions, classes, comments etc inside script tags? Was just trying to make some queries for those but didn't understand why nodes under (raw_text) & (svelte_raw_text) were invalid. This issue is the first one to show up on google when looking this up.
Kinda sucks because treesitter queries is a massive part of my workflow.
So if i get this correctly, there's no way to make a query for functions, classes, comments etc inside script tags? Was just trying to make some queries for those but didn't understand why nodes under (raw_text) & (svelte_raw_text) were invalid. This issue is the first one to show up on google when looking this up.
Kinda sucks because treesitter queries is a massive part of my workflow.
No I believe that should be possible, however you would need to target the typescript injected syntax instead of svelte.
Did you check the tree-sitter docs?
Is your feature request related to a problem? Please describe.
Currently, we have a few places where language injections will fail due to the nature of how tree-sitter works.
As far as I am aware, it does not seem possible to have tree-sitter treat a specific injection as a subnode within a sublanguage.
Let's take the following example:
If we treat the attribute_value inside of the generics attribute as
typescript
, this will error in the typescript parser, sinceT extends string
is not a valid top-level typescript statement.This occurs in various other locations.
For example, we will want to treat the function in here as a function definition
or likewise here as a function call expression
This would bring the svelte tree-sitter more inline with how the
tsx
treesitter extends the typescript language definition to become a more precise and accurate parser that removes some of the complications around expression parsing.Describe the solution you'd like
Per @savetheclocktower's suggestions (https://github.com/zed-industries/zed/pull/17529#issuecomment-2347860071), there are a few ways to implement support this.
I'm not so sure whether something like
3.
is possible with treesitter unfortunately, therefore I think going with1.
will be best.Describe alternatives you've considered
No response
Additional context