Open boghyon opened 5 years ago
You are right, <Person AgeCategory=">3" ></Person> is valid XML/HTML. So this is a bug.
FYI, here's the part that handles HTML/XML markup:
And the offending regular expression that matches tags is this one:
['lang-in.tag', /^(<\/?[a-z][^<>]*>)/i]
The text captured by this pattern is then forwarded to the 'lang-in.tag' handler, which in turn runs on the captured text and decorates the tokens inside it by its own rules, such as:
[PR_ATTRIB_VALUE, /^(?:\"[^\"]*\"?|\'[^\']*\'?)/, null, '\"\'']
[PR_TAG, /^^<\/?[a-z](?:[\w.:-]*\w)?|\/?>$/i]
[PR_ATTRIB_NAME, /^(?!style[\s=]|on)[a-z](?:[\w:-]*\w)?/i]
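Given the first regexp above, /^(<\/?[a-z][^<>]*>)/i, you can see how it would correctly match something like <tag name="val">, but breaks for something like <tag name=">val">. The [^<>]* part stops at the first > it meets, even when that > sits inside a quoted attribute value, so the match ends too early.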
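To make the difference concrete, here is a minimal sketch that runs the quoted pattern against both inputs. Only the regular expression comes from the snippet above; the matchTag helper and the variable names are made up for this illustration and are not part of code-prettify.

```javascript
// The tag-matching pattern quoted above, copied verbatim.
const tagPattern = /^(<\/?[a-z][^<>]*>)/i;

// Hypothetical helper for this sketch only: returns the text that
// the pattern would capture and hand over to the tag handler.
function matchTag(source) {
  const m = tagPattern.exec(source);
  return m ? m[1] : null;
}

const plainTag = matchTag('<tag name="val">');
const brokenTag = matchTag('<tag name=">val">');

console.log(plainTag);  // <tag name="val">
console.log(brokenTag); // <tag name=">
```

In the second case the capture ends at the > inside the quotes, so the remaining val"> never reaches the tag handler as part of the tag.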
Hence why you must escape < and > inside attribute values, really for code-prettify's sake, not the W3C specs :)
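As a rough check of that workaround, the sketch below feeds the example from this issue to the same pattern, once as written and once with the > written as &gt;. The variable names are invented for this illustration, and the snippet only exercises the regular expression, not prettify's full pipeline.

```javascript
// The tag-matching pattern quoted earlier in this comment.
const tagPattern = /^(<\/?[a-z][^<>]*>)/i;

// The example from this issue, unescaped and escaped.
const unescapedSource = '<Person AgeCategory=">3" ></Person>';
const escapedSource = '<Person AgeCategory="&gt;3" ></Person>';

const unescapedMatch = tagPattern.exec(unescapedSource)[1];
const escapedMatch = tagPattern.exec(escapedSource)[1];

console.log(unescapedMatch); // <Person AgeCategory=">
console.log(escapedMatch);   // <Person AgeCategory="&gt;3" >
```

With the entity in place there is no bare > before the real end of the tag, so the whole opening tag is captured.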
I feel like this should be mentioned in a FAQ somewhere; code-prettify does not implement a full-blown parser, it simply attempts to do syntax highlighting using regular expressions. I say "attempt" because it cannot correctly highlight every piece of code using only regexps. But on the web and for the purpose of presenting snippets of code, small highlighting errors are usually acceptable given the speed and small-size gains compared to implementing a full parser for every language supported.
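To illustrate the trade-off, here is a sketch of a quote-aware variant of the tag pattern. It is purely hypothetical: it is not the pattern code-prettify ships, and it has not been checked against the rest of prettify's lexer. It handles the case from this issue, but it also shows how a regexp fix shifts the failure elsewhere instead of removing it.

```javascript
// Pattern quoted earlier in this comment.
const originalTag = /^(<\/?[a-z][^<>]*>)/i;

// Hypothetical alternative, NOT part of code-prettify: inside the tag,
// accept either a character that is not a bracket or quote, or a
// complete quoted string (which may itself contain < or >).
const quoteAwareTag = /^(<\/?[a-z](?:[^<>"']|"[^"]*"|'[^']*')*>)/i;

const issueCase = '<tag name=">val">';
const unterminated = '<tag name="val>';

const issueResult = quoteAwareTag.exec(issueCase);
const unterminatedOriginal = originalTag.exec(unterminated);
const unterminatedQuoteAware = quoteAwareTag.exec(unterminated);

console.log(issueResult[1]);          // <tag name=">val">
console.log(unterminatedOriginal[1]); // <tag name="val>
console.log(unterminatedQuoteAware);  // null
```

The variant captures the full tag from this issue, but a snippet with an unterminated quote, which the original pattern still recognizes as a tag, no longer matches at all.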
This issue is similar to https://github.com/google/code-prettify/issues/339, but this time it's about > instead of <.
The left angle bracket (<) MUST be escaped. (no problem)
The right angle bracket (>), however, doesn't need to be.
Borrowing @amroamroamro's example, you can see here that this document is valid.
Using prettify, the highlighting unfortunately gets broken.