Shinmera closed 9 years ago
I'm looking into implementing this, but before I begin, let me ask: how XML-compatible do you actually want to be?
I think a reasonable amount of checking would be:
- `--` does not occur in a comment.
My thoughts are as follows:
- If `--` occurs in a comment, signal a warning with the same restart behaviour as before and drop it as default behaviour.
I don't really care about attribute name or tag name checking, to be honest. I only really care about the former because it's a sneaky issue that has bitten me a couple of times now when I innocently spliced text into a document, and the `--` for comments falls into a similar vein. Attributes and tags should not need a check, since those are usually not generated by user content, but instead controlled by the author, who should know better.
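To make the comment check concrete, here is a minimal sketch in Python, used only as a neutral illustration; it is not Plump's code or API. The names `comment_is_xml_safe` and `sanitize_comment` are hypothetical, and a plain Python warning stands in for the warning-with-restart behaviour mentioned above. The XML 1.0 comment grammar forbids `--` inside the comment body and also a body that ends in `-`.

```python
import warnings


def comment_is_xml_safe(text: str) -> bool:
    """Return True if text may appear between '<!--' and '-->' in XML 1.0."""
    # The Comment production forbids '--' anywhere in the body and a body
    # ending in '-' (which would produce '--->').
    return "--" not in text and not text.endswith("-")


def sanitize_comment(text: str) -> str:
    """Warn and drop offending hyphens (hypothetical default behaviour)."""
    if comment_is_xml_safe(text):
        return text
    warnings.warn("comment text is not valid XML comment content; dropping hyphens")
    # Remove each '--' pair; a hyphen run of odd length keeps a single '-'.
    text = text.replace("--", "")
    # A single trailing '-' may remain and is not allowed either.
    return text.rstrip("-")
```

Keeping the predicate separate from the sanitizer would let callers check their own text before it ever reaches the serializer.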
I'm also not sure if there are spec differences between HTML5 and XML with regard to tag and attribute names. If they are compatible, then adding a check regardless could be worthwhile, but if one of them is much more lenient than the other I wouldn't bother.
Also: The checks for character range validity should be encapsulated as separate, exported functions (from the plump-dom package) to allow users to easily check or trim the DOM ahead of time, or something like that.
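As a rough illustration of what such standalone checks could look like, here is a sketch in Python rather than Common Lisp, so none of these names exist in plump-dom. `is_xml_char`, `find_invalid_xml_chars`, and `strip_invalid_xml_chars` are hypothetical helpers. The ranges follow the `Char` production of XML 1.0 (Fifth Edition).

```python
# XML 1.0 production [2] Char, as inclusive code point ranges.
XML_CHAR_RANGES = (
    (0x9, 0xA),          # tab, line feed
    (0xD, 0xD),          # carriage return
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),    # excludes the surrogate block and U+FFFE/U+FFFF
    (0x10000, 0x10FFFF),
)


def is_xml_char(ch: str) -> bool:
    """Return True if the single character ch is allowed in an XML 1.0 document."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in XML_CHAR_RANGES)


def find_invalid_xml_chars(text: str) -> list:
    """Return (index, character) pairs for every disallowed character in text."""
    return [(i, ch) for i, ch in enumerate(text) if not is_xml_char(ch)]


def strip_invalid_xml_chars(text: str) -> str:
    """Return text with every disallowed character removed."""
    return "".join(ch for ch in text if is_xml_char(ch))
```

With helpers like these, a user could call `find_invalid_xml_chars` on text before splicing it into a document, or `strip_invalid_xml_chars` to trim it ahead of serialization.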
The XML Specification disallows the usage of certain Unicode characters in an output document. Plump should recognise these and properly encode them when serializing the output.