Semantic input terminators for highlighting code in multiline strings

ermik commented 3 years ago

Current Version

0.21.0

Use-cases

When working with Terraform, it is sometimes beneficial to inline small pieces of configuration that could otherwise be referenced via local_file or generated from template_file; the simplest examples are a shell script or some simple YAML. However, inlining is discouraged in practice by the lack of highlighting for such content, which is plain text (a string) from the editor's perspective.

Attempted Solutions

The existing templating and file reference utilities are the workaround: moving the content into a separate file enables highlighting.

Proposal

Use "here doc" multiline input terminator strings to designate the target language/file extension, much like I would type ```tf to identify the syntax of a code block as the Terraform flavour of HashiCorp Configuration Language (which, coincidentally, will fail, because it's an alias for plain HCL and multiline string support isn't implemented in GitHub Markdown's reading of HCL):

  content=<<-YAML
    apiVersion: policy/v1
    kind: PodDisruptionBudget
    metadata:
      name: ${var.name}
    spec:
      minAvailable: 2
      selector:
        matchLabels:
          app: ${var.app}
  YAML

We could use the <<-YAML statement (which is already recognized and highlighted) to designate the contents of the multiline string as valid YAML by presenting it to VS Code as the content of a file with a .${lower(TERM)} (.yaml) extension. The editor could then spin up a secondary LS if available.
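The delimiter lookup described above can be sketched roughly as follows. This is only an illustration, not how terraform-ls or any editor does it: the `DELIMITER_LANGUAGES` table and the `find_heredocs` helper are hypothetical names, and the scanner is a plain line-based pass rather than a real HCL parser.

```python
import re

# Hypothetical mapping from a lowercased heredoc delimiter to an editor
# language ID. Unknown delimiters (e.g. EOT) map to None.
DELIMITER_LANGUAGES = {
    "yaml": "yaml",
    "yml": "yaml",
    "json": "json",
    "sh": "shellscript",
    "bash": "shellscript",
}

# Matches "<<DELIM" or "<<-DELIM" at the end of a line.
HEREDOC_START = re.compile(r"<<(-?)([A-Za-z_][A-Za-z0-9_]*)\s*$")


def find_heredocs(text):
    """Return a (delimiter, language_id, body) tuple for each heredoc."""
    results = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        match = HEREDOC_START.search(lines[i])
        if not match:
            i += 1
            continue
        delimiter = match.group(2)
        body = []
        i += 1
        # Simplification: any line consisting only of the delimiter
        # (ignoring surrounding whitespace) closes the heredoc, for both
        # the "<<" and "<<-" forms.
        while i < len(lines) and lines[i].strip() != delimiter:
            body.append(lines[i])
            i += 1
        language_id = DELIMITER_LANGUAGES.get(delimiter.lower())
        results.append((delimiter, language_id, "\n".join(body)))
        i += 1  # skip the closing delimiter line
    return results
```

A delimiter that is not in the table simply yields `None`, so the contents would stay a plain string as they are today.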

Oh, but what about template strings inlined in that string? Well, I'm glad you asked. Since most languages out there would be stumped by ${ appearing in a random place, I'd suggest stubbing those template strings with a best-effort rendering of the expected content, then looking for the stubbed values in the secondary LS output and adjusting it accordingly. This may be a case where highlighting clashes, making it difficult to visually identify an HCL template string, so it is a debatable issue.
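A minimal sketch of the stubbing step, under a simplifying assumption: instead of a best-effort rendering of the expected content, each ${...} sequence is replaced with filler of the same length, so character offsets reported by a secondary LS would still line up with the original string. `stub_interpolations` is a hypothetical helper, and braces inside quoted strings within an interpolation are not handled.

```python
def stub_interpolations(body, filler="x"):
    """Replace each ${...} sequence with filler text of the same length."""
    out = []
    i = 0
    n = len(body)
    while i < n:
        # "$${" is HCL's escape for a literal "${"; copy it unchanged.
        if body.startswith("$${", i):
            out.append("$${")
            i += 3
            continue
        if body.startswith("${", i):
            # Walk forward to the matching closing brace, tracking nesting.
            depth = 1
            j = i + 2
            while j < n and depth > 0:
                if body[j] == "{":
                    depth += 1
                elif body[j] == "}":
                    depth -= 1
                j += 1
            if depth == 0:
                out.append(filler * (j - i))
                i = j
                continue
            # Unterminated sequence: fall through and copy it literally.
        out.append(body[i])
        i += 1
    return "".join(out)
```

Because the length is preserved, mapping a position in the stubbed text back to the original is the identity; a smarter stub (for example a quoted string or a number, depending on context) would need an offset map instead.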

radeksimko commented 3 years ago

Using HEREDOC delimiters sounds like a good idea to me!

Unfortunately, the language indication is just one of the problems we need to tackle to enable formatting of embedded configs.

LSP doesn't currently have a documented mechanism or best practices for tackling this. In other words, there isn't any way for the server to tell the client that a particular range within a particular document should be formatted as YAML (or any other language).

In LSP, documents are generally treated as having a single language ID: https://microsoft.github.io/language-server-protocol/specifications/specification-3-17/#textDocumentItem

There are however some early ideas/attempts in https://github.com/microsoft/language-server-protocol/issues/1252

radeksimko commented 3 years ago

Actually, there may be some concepts at least documented here: https://code.visualstudio.com/api/language-extensions/embedded-languages. But I'm not sure how this would work in other editors, or whether it's standardized at all at the LSP level.
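One of the approaches that guide describes is request forwarding: the embedded region is handed to another language service as a "virtual document" in which everything outside the region is blanked out, so that line and column positions are unchanged. A rough sketch of just the masking step, assuming the regions are already known as (start, end) character offsets; `virtual_document` is a hypothetical name:

```python
def virtual_document(text, regions):
    """Blank out everything outside the given (start, end) offset ranges.

    Newlines are always kept, so every line and column in the result
    matches the original document.
    """
    keep = [False] * len(text)
    for start, end in regions:
        for k in range(start, end):
            keep[k] = True
    return "".join(
        ch if keep[k] or ch == "\n" else " "
        for k, ch in enumerate(text)
    )
```

The result could then be presented to a YAML service as if it were a standalone file, and any positions in its responses would apply directly to the Terraform document.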

radeksimko commented 3 years ago

Aside from LSP support, though, I do think we can have the planned static HCL grammar recognize the HEREDOC delimiters as you proposed, which would enable the highlighting at least.

TextMate grammars specifically seem to have a mechanism for embedding: https://macromates.com/manual/en/language_grammars
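As an illustration of what such an embedding rule might look like, the sketch below builds one begin/end rule per delimiter and prints it as JSON. The rule shape (begin, end, contentName, patterns with an include) follows the TextMate manual linked above, but the scope names, the `heredoc_rule` helper, and the regular expressions are assumptions, not part of any existing HCL grammar.

```python
import json


def heredoc_rule(delimiter, embedded_scope):
    """Build a TextMate-style rule that embeds another grammar in a heredoc.

    Assumes the delimiter is a plain identifier such as YAML or JSON.
    """
    return {
        # Opening marker: "<<YAML" or "<<-YAML" at the end of a line.
        "begin": rf"(<<-?)\s*({delimiter})\s*$",
        # Closing marker: the same delimiter alone on a line; \2 refers
        # back to the second capture group of the begin pattern.
        "end": r"^\s*\2\s*$",
        # Hypothetical scope name applied to the heredoc contents.
        "contentName": f"meta.embedded.block.{delimiter.lower()}",
        # Delegate the contents to the embedded language's grammar.
        "patterns": [{"include": embedded_scope}],
    }


print(json.dumps(heredoc_rule("YAML", "source.yaml"), indent=2))
```

HCL's own ${...} interpolation would presumably need its own pattern listed ahead of the include, so that template strings keep their HCL highlighting inside the embedded block.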

ermik commented 3 years ago

I've looked at the implementation of Markdown fenced block highlighting, and I'm sorry to report that its solution to this problem seems to be a continuation of the beautiful hack that the Markdown library they depend on (markdown-it) pioneered. They simply delegate the syntax inside the "fenced" block to highlight.js.

Now, I pass no judgement. It is a decent workaround and certainly makes for a head start on any (functional) proof of concept that can be iterated on.

hashicorp / terraform-ls