codemirror / dev

Development repository for the CodeMirror editor project
https://codemirror.net/
Other
5.46k stars 343 forks source link

Wrong comment format in JSX blocks #1075

Closed sergeichestakov closed 1 year ago

sergeichestakov commented 1 year ago

Describe the issue

It's nice that you can add JSX support for JS files with javascript({jsx: true}) but one papercut that we've been running into when using it is that the wrong comment format is used with the toggle comment command/shortcut in JSX blocks. In this context, it still uses the default // line comment in js, but it should be using the block comment format wrapped in braces like so: {/* ... */}. Otherwise, there will be a compiler error as having // in JSX blocks is not valid but block comments wrapped in braces is.

See demo:

cm-comment-jsx

Also happy to take a stab at it myself and open a PR if you're busy! I think the solution here involves using an extension that sets the languageDataAt while the cursor is in a JSX block, the hard part is figuring out when you're in that context (still a bit of a Lezer noob and JSX blocks can get quite complex) and also to know when the line is already commented with that format, so any tips would be appreciated!

Browser and platform

Firefox 109.0.1, MacOS Ventura 13.0.1 (but should repro on all browsers/devices)

Reproduction link

https://cm-jsx-comment.sergeichestakov.repl.co/

marijnh commented 1 year ago

The main obstacle to this is that comment tokens (and language-data in general) are currently per-language, but the JavaScript language, with JSX, includes two different contexts with their own commenting style.

We could use the languageData facet directly to define more complicated logic in this case, but that's going to be rather inefficient, since those providers are called without knowing anything about the query, and thus will run their logic (including resolving the position in the syntax tree, which the default behavior already does) for every call to languageDataAt, even if unrelated to comment tokens.

Maybe we can extend the language-data interface to make this kind of thing better to express, I'll have to think about it a bit more.

marijnh commented 1 year ago

Attached patches add support for overriding language data within a language to @codemirror/language and use it in @codemirror/lang-javascript to improve commenting of JSX. Let me know if this works for you.

sergeichestakov commented 1 year ago

Oh wow looks like that helps! Awesome fix. Only thing is it appears untoggling doesn't quite work as expected in most cases. It looks like it fails to recognize that it's still in a JSX context once the comment is actually inserted.

See repro (via the link in my PR description):

jsx-toggle-comment

marijnh commented 1 year ago

I think that required another fix which wasn't released yet. Try with @codemirror/commands 6.2.1

sergeichestakov commented 1 year ago

hmm just tried and it seems that regressed the initial fix 😕 from above Repl:

cm-comment-demo

marijnh commented 1 year ago

In that context, JavaScript-style comments are valid, aren't they? The start of the line isn't in JSX context yet, so you can comment it using //.

sergeichestakov commented 1 year ago

No if it's within the parenthesis, it's not valid since that indicates the bounds of the JSX context. If it were inline (e.g. return <div></div>;) then it would be.

sergeichestakov commented 1 year ago

Other than that case though, it seems to be working well!

marijnh commented 1 year ago

The parentheses aren't part of JSX as far as I can see—people for some reason put them around JSX expressions, but those are just regular JS parentheses, no?

sergeichestakov commented 1 year ago

No when returning JSX inside a functional component in React, everything in the parenthesis is parsed as JSX which means // style comments are invalid.

Here's an example of what I mean, including the inline case when // comments are valid:

jsx-comment-test

Easy to repro if you fork this Repl (assuming you have a Replit account 😄)

marijnh commented 1 year ago

No when returning JSX inside a functional component in React, everything in the parenthesis is parsed as JSX which means // style comments are invalid.

How would that work, syntactically speaking? The parser doesn't know you're in a React component, so parentheses are parsed as normal.

The reason that becomes invalid is that it produces an empty pair of parentheses. There's no guarantee that commenting out lines will leave your code syntactically valid.

sergeichestakov commented 1 year ago

Hmm fair enough. I guess the only way to differentiate is if you have JSX on the same line inside the parenthesis.

I guess I was confused bc my local editor (Neovim) handles that case as I described (although un-toggling still inserts //) but it looks like VS Code and other Monaco based editors do not so I suppose there's some ambiguity there and this can be considered WAI in which case we can close this out!

sergeichestakov commented 1 year ago

as an aside, do you think it's worth exporting the jsxSublanguage object? we define our own custom JS languge provider (rather than using javascript({jsx: true}) so this does not work out of the box even with the dialect set to jsx (although maybe that should be fixed directly). obv easy to recreate but might make it easier for others who do the same thing.

sergeichestakov commented 1 year ago

Also I do find the inconsistency here kinda weird (notice the last div is correctly commented but the first is not):

cm-jsx-comment-inconsistency

Perhaps this case is worth addressing?

marijnh commented 1 year ago

we define our own custom JS languge provider (rather than using javascript({jsx: true})

I think in that case you're responsible for configuring it as well. The jsxSublanguage object seems a bit too obscure to export, and as you mentioned, easy enough to duplicate.

Also I do find the inconsistency here kinda weird (notice the last div is correctly commented but the first is not):

Linewise commenting looks up the language context at the start of the line, because that's where it is going to be inserting comment tokens. This can still go wrong of course (say, if you end up inserting block comment tokens on a line that's a language boundary), but having custom intelligent commenting routines for every kind of context is more than I'm willing to commit to—and more generally, a 'comment this line' feature that never introduces syntax errors seems tricky to design.

sergeichestakov commented 1 year ago

Looks like this breaks commenting in large selections that include both JSX and regular JS. I've opened a separate issue here: https://github.com/codemirror/dev/issues/1105