codemirror / dev

Development repository for the CodeMirror editor project
https://codemirror.net/
Other
5.94k stars 376 forks source link

`indent` function of legacy StreamParser is ignored when it's nested within `@codemirror/lang-html` using `nestedLanguages` #1474

Closed josephrocca closed 2 days ago

josephrocca commented 2 days ago

Describe the issue

Following advice from this issue:

I'm using the legacy JavaScript parser in order to get local/global variables highlighted differently. I've integrated it with @codemirror/lang-html like you can see in the code below. This works great except it doesn't use the indent function of @codemirror/legacy-modes/mode/javascript, so the indentation is incorrect when pressing enter to create a new line.

import {basicSetup, EditorView} from "codemirror"
import {StreamLanguage} from "@codemirror/language";
import {javascript} from "@codemirror/legacy-modes/mode/javascript";
import {html} from "@codemirror/lang-html";

let javascriptStreamLanguage = StreamLanguage.define(javascript);
let eventAttributeNames = [...new Set([...Object.getOwnPropertyNames(document), ...Object.getOwnPropertyNames(Object.getPrototypeOf(Object.getPrototypeOf(document))), ...Object.getOwnPropertyNames(Object.getPrototypeOf(window))].filter(k => k.startsWith("on") && (document[k] == null || typeof document[k] == "function")))];
let htmlLanguageExtension = html({
  nestedLanguages: [{tag:"script", parser:javascriptStreamLanguage.parser}],
  nestedAttributes: eventAttributeNames.map(n => ({name:n, parser:javascriptStreamLanguage.parser})),
});

let view = new EditorView({
  doc: `<script>\n  function test() {\n    "press enter when your cursor is at end of this line"\n  }\n</script>`,
  extensions: [
    basicSetup,
    htmlLanguageExtension,
  ],
  parent: document.body,
});

As you can see, the indenting is incorrect in the above example. If I put a console.log within the indent function here in the legacy JavaScript parser, then I can see that there aren't any logs, so it's not being used/called.

You can compare the above example to the case where we just use the @codemirror/lang-html extension without any nested language overrides:

In this case indenting is correct, of course.

I'm wondering if it is a bug or expected behavior that indentation does not follow the nested language parser's indent function?

Browser and platform

Chrome v129, Ubuntu 22.04

Reproduction link

https://codemirror.net/try/?c=aW1wb3J0IHtiYXNpY1NldHVwLCBFZGl0b3JWaWV3fSBmcm9tICJjb2RlbWlycm9yIgppbXBvcnQge1N0cmVhbUxhbmd1YWdlfSBmcm9tICJAY29kZW1pcnJvci9sYW5ndWFnZSI7CmltcG9ydCB7amF2YXNjcmlwdH0gZnJvbSAiQGNvZGVtaXJyb3IvbGVnYWN5LW1vZGVzL21vZGUvamF2YXNjcmlwdCI7CmltcG9ydCB7aHRtbH0gZnJvbSAiQGNvZGVtaXJyb3IvbGFuZy1odG1sIjsKCmxldCBqYXZhc2NyaXB0U3RyZWFtTGFuZ3VhZ2UgPSBTdHJlYW1MYW5ndWFnZS5kZWZpbmUoamF2YXNjcmlwdCk7CmxldCBldmVudEF0dHJpYnV0ZU5hbWVzID0gWy4uLm5ldyBTZXQoWy4uLk9iamVjdC5nZXRPd25Qcm9wZXJ0eU5hbWVzKGRvY3VtZW50KSwgLi4uT2JqZWN0LmdldE93blByb3BlcnR5TmFtZXMoT2JqZWN0LmdldFByb3RvdHlwZU9mKE9iamVjdC5nZXRQcm90b3R5cGVPZihkb2N1bWVudCkpKSwgLi4uT2JqZWN0LmdldE93blByb3BlcnR5TmFtZXMoT2JqZWN0LmdldFByb3RvdHlwZU9mKHdpbmRvdykpXS5maWx0ZXIoayA9PiBrLnN0YXJ0c1dpdGgoIm9uIikgJiYgKGRvY3VtZW50W2tdID09IG51bGwgfHwgdHlwZW9mIGRvY3VtZW50W2tdID09ICJmdW5jdGlvbiIpKSldOwpsZXQgaHRtbExhbmd1YWdlRXh0ZW5zaW9uID0gaHRtbCh7CiAgbmVzdGVkTGFuZ3VhZ2VzOiBbe3RhZzoic2NyaXB0IiwgcGFyc2VyOmphdmFzY3JpcHRTdHJlYW1MYW5ndWFnZS5wYXJzZXJ9XSwKICBuZXN0ZWRBdHRyaWJ1dGVzOiBldmVudEF0dHJpYnV0ZU5hbWVzLm1hcChuID0+ICh7bmFtZTpuLCBwYXJzZXI6amF2YXNjcmlwdFN0cmVhbUxhbmd1YWdlLnBhcnNlcn0pKSwKfSk7CgpsZXQgdmlldyA9IG5ldyBFZGl0b3JWaWV3KHsKICBkb2M6IGA8c2NyaXB0PlxuICBmdW5jdGlvbiB0ZXN0KCkge1xuICAgICJwcmVzcyBlbnRlciB3aGVuIHlvdXIgY3Vyc29yIGlzIGF0IGVuZCBvZiB0aGlzIGxpbmUiXG4gIH1cbjwvc2NyaXB0PmAsCiAgZXh0ZW5zaW9uczogWwogICAgYmFzaWNTZXR1cCwKICAgIGh0bWxMYW5ndWFnZUV4dGVuc2lvbiwKICBdLAogIHBhcmVudDogZG9jdW1lbnQuYm9keSwKfSk7Cg==

marijnh commented 2 days ago

Attached patch changes the way StreamLanguage registers its indentation function to also work in a nested parser context. However, stream languages are not defined in a way that allows them to be aware of the parent indentation context around it, so though this is an improvement, it'll still sometimes produce an unexpected indentation, especially at the top level.

josephrocca commented 1 day ago

Thank you, Marijn! 🙏

For others using the JavaScript legacy StreamLanguage, a hacky fix the top-level indent issue mentioned by Marijn is to change this line from this:

else return lexical.indented + (closing ? 0 : cx.unit);

to this:

else return lexical.column === 0 ? null : Math.max(0, lexical.indented) + (closing ? 0 : cx.unit);

I haven't extensively tested this, and it was kind of a guess-and-check solution, but it seems to solve indenting in the common top-level cases, at least. I'll update this comment if I find issues with it as I continue to test with more real-world code.