Open rbri opened 5 months ago
I'm afraid it's pretty much a non-starter. There is a very high probability that it would break a large number of users without warning.
An HTML parser parsing an XML document won't always create the same DOM as an XML parser parsing an XML document.
Making it easy to swap out the XMLReader used globally sounds like a good idea though.
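To make the "swap the XMLReader globally" idea concrete, here is a rough sketch using only JDK APIs. `XmlReaderRegistry`, `setReaderSupplier`, and `parse` are hypothetical names invented for illustration; this is not Flying Saucer's actual API, just one shape such a hook might take.

```java
import java.io.StringReader;
import java.util.function.Supplier;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of Flying Saucer.
public class XmlReaderRegistry {
    // Global, replaceable source of XMLReader instances.
    private static volatile Supplier<XMLReader> readerSupplier = XmlReaderRegistry::defaultReader;

    /** Creates a namespace-aware XMLReader from the JDK's default SAX parser. */
    public static XMLReader defaultReader() {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("Could not create default XMLReader", e);
        }
    }

    /** Replaces the reader used by every later call to parse(). */
    public static void setReaderSupplier(Supplier<XMLReader> supplier) {
        readerSupplier = supplier;
    }

    /** Builds a W3C DOM by piping the current XMLReader's SAX events into a DOMResult. */
    public static Document parse(String markup) {
        try {
            XMLReader reader = readerSupplier.get();
            DOMResult result = new DOMResult();
            SAXSource source = new SAXSource(reader, new InputSource(new StringReader(markup)));
            // An identity transform copies the SAX event stream into a new Document.
            TransformerFactory.newInstance().newTransformer().transform(source, result);
            return (Document) result.getNode();
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
    }
}
```

With something like this, the default behavior stays strict XML, and a user who wants lenient HTML parsing could opt in by passing a supplier for any parser that implements the SAX `XMLReader` interface.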
@pbrant My guess is that Flying Saucer is about parsing XHTML, not arbitrary XML. Maybe you can provide some samples that help me understand your point.
@rbri That's not quite accurate. I'd describe Flying Saucer as a W3C DOM renderer that, by default, parses input as XML (not XHTML).
For an example of how the parsing rules differ, consider this HTML5/XHTML document, which is also valid XML:
<html>
<body>
<p>
one
<div>two</div>
three
</p>
</body>
</html>
An HTML5/XHTML parser will produce the DOM equivalent of the following (taken from DevTools):
<html><head></head><body>
<p>
one
</p><div>two</div>
three
<p></p>
</body></html>
These two DOMs won't render the same in Flying Saucer, even with the default stylesheet. And since their internal structure differs, user stylesheets might also match differently.
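The XML side of this comparison can be checked with the JDK's built-in parser. This is only a minimal sketch: `XmlParseDemo` and its helper methods are names made up for the example, and the sample document is collapsed onto one line.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Flying Saucer.
public class XmlParseDemo {
    public static final String SOURCE =
        "<html><body><p>one<div>two</div>three</p></body></html>";

    /** Parses the markup with the JDK's XML parser and returns the DOM. */
    public static Document parseAsXml(String markup) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(markup)));
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    /** Returns the tag name of the parent of the first <div> element. */
    public static String parentOfFirstDiv(Document doc) {
        Node div = doc.getElementsByTagName("div").item(0);
        return div.getParentNode().getNodeName();
    }

    public static void main(String[] args) {
        Document doc = parseAsXml(SOURCE);
        // The XML parser keeps the tree exactly as written.
        System.out.println("parent of <div>: " + parentOfFirstDiv(doc));                        // p
        System.out.println("<p> count: " + doc.getElementsByTagName("p").getLength());          // 1
        System.out.println("<head> count: " + doc.getElementsByTagName("head").getLength());    // 0
    }
}
```

In the DevTools output above, by contrast, the `<div>` is a child of `<body>`, there are two `<p>` elements, and a `<head>` has been inserted, so selectors such as `p > div` or `p:first-child` would match differently between the two trees.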
Note these two forks, which have taken steps toward supporting HTML by default. I think FS should move in the same direction.
A fork starting with zero users has a lot more flexibility than a project with hundreds of thousands of downloads a month.
There seems to be no dynamic way to add another try, so I did this simple hack.
Hope someone with more knowledge about this lib comes up with some better ideas....