Open NTaylorMullen opened 4 years ago
Assigning to @alexandrudima who works on the new API
The approach of having a token covering the entire range of the embedded language won’t work universally. For example, in XSLT text-value-templates the embedded expression is terminated by the `}` character. This character can also appear unescaped within the embedded XPath language; it is the XPath syntax tree that determines when the embedded expression has terminated.
For cases like this, there needs to be a way to hand off to the embedded language server, and then for the embedded language server to hand control back to the host language.
Sufficient context also needs to be passed to the embedded language server, for example to indicate whether the text-value-template occurs within an XML CDATA section, or to differentiate it from an attribute-value-template. This context should be passed back to the host language server when returning control.
Having said this, if it were found that an embedded language-range token covered 90% of cases, this could still be valuable.
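To make the handoff concrete, here is a minimal sketch of what such a round trip might look like. Everything in it is an assumption for illustration: `HostContext`, `HandBack` and `scanEmbeddedExpression` are hypothetical names, not part of any existing API, and the scanner is a heavily simplified stand-in for a real XPath parser (it only understands string literals and nested braces).

```typescript
// Hypothetical context the host (XSLT) side passes to the embedded (XPath) side.
interface HostContext {
  kind: "text-value-template" | "attribute-value-template";
  inCData: boolean;
}

// Hypothetical result handed back to the host: where the embedded expression
// ended, plus the context so the host can resume in the right state.
interface HandBack {
  end: number; // offset of the terminating `}`, or -1 if unterminated
  context: HostContext;
}

// Simplified stand-in for an XPath parser: skips string literals and nested
// braces so that only the matching `}` terminates the expression.
function scanEmbeddedExpression(text: string, start: number, context: HostContext): HandBack {
  let depth = 0;
  let quote = ""; // empty string means "not inside a string literal"
  for (let i = start; i < text.length; i++) {
    const ch = text.charAt(i);
    if (quote !== "") {
      if (ch === quote) quote = ""; // end of the string literal
    } else if (ch === '"' || ch === "'") {
      quote = ch; // start of a string literal
    } else if (ch === "{") {
      depth++;
    } else if (ch === "}") {
      if (depth === 0) return { end: i, context }; // hand control back
      depth--;
    }
  }
  return { end: -1, context };
}

// The first `}` sits inside a string literal, so it must not end the expression.
const source = "<p>{ concat('}', $name) } tail</p>";
const handBack = scanEmbeddedExpression(source, source.indexOf("{") + 1, {
  kind: "text-value-template",
  inCData: false,
});
console.log(handBack.end === source.lastIndexOf("}"));
```

The point of the design is that the host never guesses the end of the range; it resumes at whatever offset the embedded side reports.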
Given that the last comment was more than two years ago I'll risk a little +1 - I'm working on an extension to allow injecting pegjs syntaxes into VSCode (using the Semantic Highlight API as a kind of dynamic grammar), but the lack of embedded language support severely limits what I can do.
For example, PegJS syntaxes themselves allow snippets of JS to be registered; achieving this currently requires setting up the embedded language in the tmGrammar.json file, which prevents dynamic language embedding. I tried to work around the problem using `semanticTokenScopes` in an attempt to give an embedded language scope to my semantic tokens, but that didn't seem to work.
Piling on here, as I think the overall ask is for a way to enable semantic processing of embedded languages...
What I want to add is an extension that looks at comments and tries to determine if they're worth reading or not. E.g. how many times have you seen this:
/// <summary>Gets or sets the FrobKnocker for this instance.</summary>
public FrobKnocker FrobKnocker { get; set; }
It's a comment alright, but it adds little value.
The way to implement this that seems open would be to create an injected language for comments and then have that language inspect the comment and its immediate surroundings to determine if there's any real added information there or not.
There might be other ways to implement something like that, aside from language support (e.g. as a static analysis layer, perhaps?) But there are other aspects of comments that are truly language concepts and they're effectively an injection - e.g. you can use doxygen in any language, and there are several competitors in that space.
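As a rough illustration of the comment-inspection idea, the sketch below flags a comment as adding nothing when every word in it is either filler or already present in the declaration it documents. The function names and the filler list are hypothetical, and the heuristic is deliberately naive; a real extension would need something more robust.

```typescript
// Hypothetical filler words that carry no information in a doc comment.
const FILLER = new Set([
  "gets", "or", "sets", "the", "a", "an", "for", "this", "instance", "of", "and", "to",
]);

// Splits text into lowercase words, dropping XML doc tags and splitting PascalCase.
function words(text: string): string[] {
  return text
    .replace(/<[^>]+>/g, " ") // drop tags such as <summary>
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // FrobKnocker -> Frob Knocker
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w) => w.length > 0);
}

// True if the comment contains at least one word that is neither filler
// nor already present in the declaration.
function addsInformation(comment: string, declaration: string): boolean {
  const known = new Set(words(declaration));
  return words(comment).some((w) => !FILLER.has(w) && !known.has(w));
}

const declaration = "public FrobKnocker FrobKnocker { get; set; }";
console.log(
  addsInformation(
    "/// <summary>Gets or sets the FrobKnocker for this instance.</summary>",
    declaration,
  ),
);
```

Note that this needs the declaration next to the comment, which is why the comment's immediate surroundings have to be visible to whatever processes the injected range.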
Overview
Languages like Razor (and I imagine HTML for custom attributes) typically have scenarios where portions of the document are semantically a different language.
In Razor this happens frequently through the use of TagHelpers or in Blazor:
In this example we'd expect that `ViewBag.ShouldRenderAntiforgery` would be C#. The ways in which users can customize TagHelpers (things that apply to HTML and change the semantic language of the right-hand side of an attribute) are limitless, so we need full control over telling the IDE which things are C# and which aren't.
Ideas on how to implement
Over in the issue discussing general semantic colorization, the proposal was to have an API similar to:
The proposed approach can also be used to enable semantic language colorization without enabling an entire language's extensions for a subset of a document.
To do this one could expand on the `ThemeDefinition` and add a `language` parameter:
This would enable LanguageServers to mark a chunk of text in an editor as a Token that associates with a specific language. This would work similarly to how tooltips, completion descriptions, etc. work when specifying pieces of text that should be colorized as a certain language.
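As a sketch only, a token carrying a language could look something like the following. None of these names come from the proposal or from the VS Code / LSP APIs: `EmbeddedLanguageToken` and `tagAttributeValue` are hypothetical, the `"csharp"` language id is assumed, and the `<form>` element around the attribute is an assumption made to have something to run against.

```typescript
// Hypothetical shape: a range in the document plus the language it should be
// colorized as. This is not an existing VS Code or LSP type.
interface EmbeddedLanguageToken {
  line: number;
  startChar: number;
  length: number;
  language: string; // e.g. "csharp"
}

// Finds the value of one attribute on a single line and tags it with a language.
// A real language server would take this from its syntax tree, not a string search.
function tagAttributeValue(
  lineText: string,
  line: number,
  attribute: string,
  language: string,
): EmbeddedLanguageToken | undefined {
  const prefix = `${attribute}="`;
  const at = lineText.indexOf(prefix);
  if (at < 0) return undefined;
  const startChar = at + prefix.length;
  const end = lineText.indexOf('"', startChar);
  if (end < 0) return undefined;
  return { line, startChar, length: end - startChar, language };
}

// The <form> element is assumed for illustration.
const lineText = '<form asp-antiforgery="ViewBag.ShouldRenderAntiforgery">';
const token = tagAttributeValue(lineText, 0, "asp-antiforgery", "csharp");
if (token) {
  console.log(lineText.substr(token.startChar, token.length), token.language);
}
```

The editor would then be responsible for colorizing the tagged range using the named language, without the language server having to classify every token inside it.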
So in the first example, `asp-antiforgery="ViewBag.ShouldRenderAntiforgery"`, Razor's language server would indicate that the entire `ViewBag.ShouldRenderAntiforgery` was of a Razor-specific theme that had a token Style of: