tree-sitter / tree-sitter-javascript

Javascript grammar for tree-sitter
MIT License

Support HTML entities in JSX text/attributes #284

Closed cpmsmith closed 3 months ago

cpmsmith commented 4 months ago

JSX text and attributes support HTML character references (a.k.a. entities), and don't support ECMAScript string escape sequences.
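To make the difference concrete, here is a minimal sketch of how a JSX compiler treats the raw text of a string attribute: character references are decoded, while a backslash is just a backslash. The `decodeJsxAttribute` helper and its `NAMED_ENTITIES` table are hypothetical, written only for illustration (the table is a tiny subset of the real HTML entity list); they are not part of tree-sitter-javascript or any compiler's API.

```javascript
// Hypothetical helper for illustration only.
// A deliberately tiny subset of the HTML named character references.
const NAMED_ENTITIES = { nbsp: "\u00A0", amp: "&", lt: "<", gt: ">", quot: '"' };

// Convert a numeric code point to a string, or return null if out of range.
function fromCodePointOrNull(codePoint) {
  return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : null;
}

// Decode the raw source text of a JSX string attribute (without its quotes).
// Only character references are interpreted; backslashes are left untouched.
function decodeJsxAttribute(raw) {
  const reference = /&(?:#(\d+)|#[xX]([0-9a-fA-F]+)|([A-Za-z][A-Za-z0-9]*));/g;
  return raw.replace(reference, (match, dec, hex, name) => {
    if (dec !== undefined) return fromCodePointOrNull(parseInt(dec, 10)) ?? match;
    if (hex !== undefined) return fromCodePointOrNull(parseInt(hex, 16)) ?? match;
    // Unknown names are left exactly as written.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, name)
      ? NAMED_ENTITIES[name]
      : match;
  });
}
```

With this sketch, the source text `a&nbsp;b` decodes to `a`, a no-break space, and `b`, whereas the source text `a\nb` stays as four characters: `a`, a backslash, `n`, `b`.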

Although the spec calls it "historical" and threatens to change it, it is in the spec, and the spec is pretty stable at this point.

In changing this, I landed back on an idea that @maxbrunsfeld suggested in a PR review some time ago: having separate string and jsx_string nodes, and aliasing jsx_string to string for consumers' convenience. At that time, having two different node types was deemed unnecessary, but this change adds a second, more substantive difference between the two, so I've brought the idea back. As part of the split, JS string literals no longer accept raw newlines, which are invalid in both JS and TS.
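As a rough illustration of why the two kinds of string differ lexically, the sketch below models them with plain regular expressions rather than the tree-sitter grammar DSL. Every name here (`JS_DOUBLE_QUOTED`, `JSX_DOUBLE_QUOTED`, `CHARACTER_REFERENCE`, `tokenizeJsxString`, and the token type labels) is an assumption made for this example and is not taken from the PR's grammar. It also does not model the aliasing step, which only affects the node name consumers see.

```javascript
// Illustrative patterns only; the real grammar rules may differ.
// JS double-quoted string: backslash escapes allowed, raw newlines rejected.
const JS_DOUBLE_QUOTED = /^"(?:[^"\\\r\n]|\\(?:\r\n|[\s\S]))*"$/;
// JSX double-quoted attribute string: no escapes, raw newlines allowed.
const JSX_DOUBLE_QUOTED = /^"[^"]*"$/;
// HTML character reference: decimal, hexadecimal, or named.
const CHARACTER_REFERENCE = /&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);/g;

// Split a JSX string literal into plain fragments and character references.
function tokenizeJsxString(literal) {
  if (!JSX_DOUBLE_QUOTED.test(literal)) {
    throw new SyntaxError("not a double-quoted JSX string");
  }
  const body = literal.slice(1, -1); // drop the surrounding quotes
  const tokens = [];
  let last = 0;
  for (const m of body.matchAll(CHARACTER_REFERENCE)) {
    if (m.index > last) {
      tokens.push({ type: "fragment", text: body.slice(last, m.index) });
    }
    tokens.push({ type: "character_reference", text: m[0] });
    last = m.index + m[0].length;
  }
  if (last < body.length) {
    tokens.push({ type: "fragment", text: body.slice(last) });
  }
  return tokens;
}
```

Under these assumptions, a literal containing a raw newline matches `JSX_DOUBLE_QUOTED` but not `JS_DOUBLE_QUOTED`, and a `\n` inside a JSX string ends up in an ordinary fragment instead of being marked as an escape sequence.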

TL;DR

Here is some JSX highlighted in Neovim using tree-sitter:

image

And here it is in VSCode, not using tree-sitter:

image

VSCode, correctly, does not highlight the \n in the JSX attribute, and does highlight the two valid &nbsp;s, which tree-sitter-javascript doesn't currently parse. This PR fixes both things.


amaanq commented 3 months ago

I'm going to build on top of this with a fix for jsx vs js strings as well, thank you for the PR!

amaanq commented 3 months ago

Hey again @cpmsmith, I cherry-picked your changes onto #291, thank you for the PR!