Open Boddlnagg opened 6 years ago
Note to myself: the XML element name rules can be found at http://www.xml.com/pub/a/2001/07/25/namingparts.html. Notably,
[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*
Our code in findTags.js currently looks for tags using a RegExp (reversed):
const TAG_PATTERN = />([0-9A-Za-z:\-_\.]*[A-Za-z\_])\/<|>\/[^\/]*?([0-9A-Za-z:\-_\.]*[A-Za-z\_])<|([0-9A-Za-z:\-_\.]*[A-Za-z\_])</g;
To allow the extended NameChar set, we might need to find tags without a RegExp, because the resulting pattern could be very large and unmaintainable.
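A minimal sketch of what a RegExp-free name scanner could look like. It is not the code in findTags.js: the function names (readTagName, isNameStartChar, isNameChar) are hypothetical, it scans forward rather than over reversed text, and it uses the code point ranges of NameStartChar and NameChar from XML 1.0 (Fifth Edition) as a stand-in for the Letter | Digit | CombiningChar | Extender classes quoted above.

```javascript
// Assumption: ranges taken from XML 1.0 (Fifth Edition) NameStartChar,
// used here in place of the older Letter | '_' | ':' production.
const NAME_START_RANGES = [
  [0x3A, 0x3A], [0x41, 0x5A], [0x5F, 0x5F], [0x61, 0x7A],
  [0xC0, 0xD6], [0xD8, 0xF6], [0xF8, 0x2FF], [0x370, 0x37D],
  [0x37F, 0x1FFF], [0x200C, 0x200D], [0x2070, 0x218F], [0x2C00, 0x2FEF],
  [0x3001, 0xD7FF], [0xF900, 0xFDCF], [0xFDF0, 0xFFFD], [0x10000, 0xEFFFF],
];

// Additional characters allowed after the first one: '-', '.', digits,
// the middle dot, combining marks, and two extender-like characters.
const NAME_EXTRA_RANGES = [
  [0x2D, 0x2E], [0x30, 0x39], [0xB7, 0xB7], [0x300, 0x36F], [0x203F, 0x2040],
];

function inRanges(codePoint, ranges) {
  return ranges.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

function isNameStartChar(codePoint) {
  return inRanges(codePoint, NAME_START_RANGES);
}

function isNameChar(codePoint) {
  return isNameStartChar(codePoint) || inRanges(codePoint, NAME_EXTRA_RANGES);
}

// Hypothetical helper: returns the XML name that begins at index `start`
// of `text`, or an empty string if no valid name starts there.
function readTagName(text, start) {
  let pos = start;
  let codePoint = text.codePointAt(pos);
  if (codePoint === undefined || !isNameStartChar(codePoint)) {
    return "";
  }
  while (codePoint !== undefined && isNameChar(codePoint)) {
    // Code points above U+FFFF occupy two UTF-16 code units.
    pos += codePoint > 0xFFFF ? 2 : 1;
    codePoint = text.codePointAt(pos);
  }
  return text.slice(start, pos);
}
```

For example, readTagName("<straße>foobar", 1) would be called with the index just after the opening "<". Keeping the ranges in plain tables means they can be checked against the spec line by line, which is the maintainability concern raised above.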
For this XML snippet:
<straße>foobar
... the extension creates the following closing tag:
<straße>foobar</stra>
The file is encoded as UTF-8, so a non-ASCII element name such as straße should work.
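A small reproduction sketch of where the truncated name likely comes from. It assumes the pattern is applied to the reversed document text, which is what the "(reversed)" note on TAG_PATTERN suggests. The helper names reverseText and lastOpenTagName are hypothetical and not taken from the extension.

```javascript
// Pattern copied from the note on findTags.js; it only accepts ASCII name characters.
const TAG_PATTERN = />([0-9A-Za-z:\-_\.]*[A-Za-z\_])\/<|>\/[^\/]*?([0-9A-Za-z:\-_\.]*[A-Za-z\_])<|([0-9A-Za-z:\-_\.]*[A-Za-z\_])</g;

// Hypothetical helper: reverse a string by code points.
function reverseText(text) {
  return [...text].reverse().join("");
}

// Hypothetical helper: run the pattern over the reversed text and return
// the name of the first opening tag it finds (capture group 3).
function lastOpenTagName(text) {
  TAG_PATTERN.lastIndex = 0; // reset state of the global RegExp
  const match = TAG_PATTERN.exec(reverseText(text));
  if (match === null || match[3] === undefined) {
    return null;
  }
  return reverseText(match[3]);
}

console.log(lastOpenTagName("<straße>foobar"));
```

Under that assumption, the ASCII-only character class stops at "ß", so only the part of the name between "<" and "ß" is captured.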