googlearchive / code-prettify

An embeddable script that makes source-code snippets in HTML prettier.
Apache License 2.0

Support for JSX #497

Open rbiggs opened 7 years ago

rbiggs commented 7 years ago

I've used prettify for years. But now, with the prevalence of JSX in code, I need it to highlight JSX properly online. Prettify currently breaks badly with JSX. The React/Preact/Inferno community is huge, and AngularJS and Vue.js developers can also use JSX. It's everywhere now. I'm hoping this can be added soon.

mikesamuel commented 7 years ago

Is JSX significantly different syntactically from E4X?

rbiggs commented 7 years ago

There are some differences, but JSX is a dialect of XML, according to Facebook. However, it's usually just in the code as a return value, like this:

const render = (data) => <div id="main">
   <h3>{ data }</h3>
</div>

or inside parens:


const render = (data) => (
   <div id="main">
      <h3>{ data }</h3>
   </div>
)

mikesamuel commented 7 years ago

Do you have a description of the rules that describe when a < character starts an XML token?

The following JS, not JSX,

var a = 3;
var disabled, href, foo;

var x = 1
<a
href="foo"
disabled>
foo || alert(1)

is equivalent by virtue of semicolon insertion to

var a = 3;
var disabled, href, foo;

var x = 1 < a;

href = "foo";

if (!(disabled > foo)) {
  alert(1);
}

But if you replace the newlines with spaces, that script contains <a href="foo" disabled>foo || alert(1), which is a perfectly valid start of a link tag. So JSX must have some way to determine where an XML section starts that doesn't require scanning the entire remainder of the input.
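
The equivalence claimed above can be checked by evaluating the snippet as plain JavaScript. The following is a minimal sketch, assuming a Node.js or browser environment where `new Function` is allowed. The trailing `return` line and the stubbed `alert` parameter are additions for inspection only and are not part of the original snippet.

```javascript
// Source text of the snippet above, kept as a string so it is evaluated
// as plain JavaScript (no JSX transform involved).
const snippet = [
  'var a = 3;',
  'var disabled, href, foo;',
  '',
  'var x = 1',
  '<a',
  'href="foo"',
  'disabled>',
  'foo || alert(1)',
  // Added for this sketch only, so the results can be inspected.
  'return { x: x, href: href };',
].join('\n');

// `alert` is passed in as a parameter so the sketch also runs outside a browser.
const alerts = [];
const run = new Function('alert', snippet);
const asiResult = run((value) => { alerts.push(value); });

console.log(asiResult.x);    // true: the line parsed as `var x = 1 < a;`
console.log(asiResult.href); // "foo"
console.log(alerts);         // [ 1 ]: `disabled > foo` is false, so alert(1) runs
```

The source parses without error and behaves like the semicolon-inserted version, which is why a highlighter cannot treat every `<` followed by an identifier as the start of a tag.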

rbiggs commented 7 years ago

Not really. Either you return some JSX directly or you enclose it in parens. The Babel JSX transformer plugin identifies the JSX and converts it into function calls that frameworks use to create the actual markup, but only after a build script has run. So your source file would look like my previous examples. The general practice is to enclose JSX in parens. By the way, here are links to Facebook's documentation on JSX: https://facebook.github.io/react/docs/introducing-jsx.html, https://facebook.github.io/react/docs/jsx-in-depth.html
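
For readers unfamiliar with the transform step, here is a rough sketch of what the earlier `<div id="main">` example looks like after a classic JSX transform. The factory `h` is a hypothetical stand-in defined here so the sketch runs on its own. In React the classic equivalent is `React.createElement`, and the exact output depends on the Babel configuration.

```javascript
// Hypothetical stand-in for a framework's element factory (an assumption
// for this sketch, not Babel's or React's actual implementation).
function h(type, props, ...children) {
  return { type, props: props || {}, children };
}

// Roughly what a classic JSX transform emits for:
//   <div id="main">
//      <h3>{ data }</h3>
//   </div>
const renderCompiled = (data) =>
  h('div', { id: 'main' }, h('h3', null, data));

const tree = renderCompiled('hello');
console.log(tree.type);                 // "div"
console.log(tree.children[0].type);     // "h3"
console.log(tree.children[0].children); // [ "hello" ]
```

Since the transform only happens at build time, a highlighter such as prettify sees the untransformed source and has to recognize the markup itself.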

NealEhardt commented 6 years ago

Here's the coverage report for a JSX file, generated by code-prettify as bundled with istanbul.

[Screenshot, 2017-08-08 17:26:05: coverage report]

Code is covered by this test file (everything is run through Babel):

import React from 'react';
import { shallow } from 'enzyme';
import Example from '../../../../src/common/Example';

describe('Example', () => {
  it('renders without error', () => {
    const wrapper = shallow(<Example />);
    console.log(wrapper.html());
  });
});

...its console output shows that it rendered this HTML:

<div class="plain">Free text</div>

There are a few issues with the way code-prettify reports on Example.js:

  1. The yellow "branch not covered" span for 'buffalo' should end at :. Instead, it extends all the way to the end of the file.
  2. Lines 6, 7, and 12 are covered and deserve their own labels in the left gutter.
  3. Lines 8, 9, and 10 are not covered. The figure 100% Statements 10/10 is wrong.

Further reading on JSX:

jrunning commented 4 years ago

@mikesamuel

Do you have a description of the rules that describe when a < character starts an XML token?

Here is some relevant discussion in the acorn-jsx repo that begins with an example very similar to yours: acornjs/acorn-jsx#78.

In this comment @RReverser draws an analogy to parsing regular expression literals with respect to ASI (emphasis mine):

Although JSX does not have explicit ASI semantics for its own operators, normal ECMAScript grammar still applies, which means ASI should kick in only when without it a syntax error would happen.

The case above with < is similar to / ambiguity at the beginning of the line, and so I believe this behaviour is actually correct. Consider the following example:

var foo = {};
/TodoList/

vs

var foo = {}
/TodoList/

In both cases - whether with < or with /, only explicit semicolon can turn a character that is normally an operator into a beginning of an expression, which makes the following content to be parsed differently - division vs regex or less-than vs start element.
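
The `/` half of that analogy can be observed directly in plain JavaScript. This is a small sketch adapted from the example above. Numbers replace the object literal and a trailing `g` is added so that the division produces a visible result. Both changes, and the `evalBody` helper, are assumptions of this sketch.

```javascript
// Evaluate a list of source lines as a function body and return its result.
const evalBody = (lines) => new Function(lines.join('\n'))();

// With an explicit semicolon, the next line is a regex literal statement
// and `foo` keeps its value.
const withSemicolon = evalBody([
  'var TodoList = 2, g = 1;',
  'var foo = 8;',
  '/TodoList/g',
  'return foo;',
]);

// Without it, no semicolon is inserted, so the same characters are
// parsed as division: 8 / TodoList / g.
const withoutSemicolon = evalBody([
  'var TodoList = 2, g = 1;',
  'var foo = 8',
  '/TodoList/g',
  'return foo;',
]);

console.log(withSemicolon);    // 8
console.log(withoutSemicolon); // 4
```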

The JSX repo has some additional discussion here: facebook/jsx#87.

Finally, I don't have the expertise to tell if this is useful, but FWIW the draft JSX specification says:

JSX extends the PrimaryExpression in the ECMAScript 6th Edition (ECMA-262) grammar:

PrimaryExpression :

  • JSXElement
  • JSXFragment

PrimaryExpression is defined as:

PrimaryExpression :
this
IdentifierReference
Literal
ArrayLiteral
ObjectLiteral
FunctionExpression
ClassExpression
GeneratorExpression
AsyncFunctionExpression
RegularExpressionLiteral
TemplateLiteral
CoverParenthesizedExpressionAndArrowParameterList

In other words, PrimaryExpression covers the basic building blocks of a JavaScript expression (identifiers, literals, parenthesized expressions, and so on), and a JSX element or fragment (<>...</>) can appear anywhere a PrimaryExpression can.
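
For a token-based highlighter, one way to approximate this rule is to look at the token before `<`, much like the usual heuristic for telling a regex literal from division. The sketch below is a hypothetical illustration only. The function name, the token set, and the assumption that a tokenizer supplies the previous significant token as a string are all inventions for this sketch, not code-prettify's or acorn-jsx's actual logic.

```javascript
// Tokens after which an expression (and so a JSX element) may begin.
// Deliberately conservative; `)` and `}` are ambiguous and left out.
const EXPRESSION_START_PRECEDERS = new Set([
  '(', '[', '{', ',', ';', '=', '=>', ':', '?', '&&', '||', '!',
  'return', 'yield', 'default',
]);

// previousToken: the last significant token before `<`, or null at the
// start of input. Returns true if `<` may open a JSX element, false if
// it should be treated as the less-than operator.
function lessThanMayStartJsx(previousToken) {
  if (previousToken === null) return true;
  return EXPRESSION_START_PRECEDERS.has(previousToken);
}

console.log(lessThanMayStartJsx('=>'));     // true:  (data) => <div ...>
console.log(lessThanMayStartJsx('return')); // true:  return <div ...>
console.log(lessThanMayStartJsx('1'));      // false: var x = 1 <a ...
```

Under this heuristic the `var x = 1` / `<a` example from earlier in the thread stays a comparison, which matches the acorn-jsx discussion quoted above.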

I hope that's helpful.