validator / htmlparser

The Validator.nu HTML parser https://about.validator.nu/htmlparser/

Remove XML 1.0 mappability warning for XmlViolationPolicy.ALLOW #3

Closed zcorpan closed 3 years ago

zcorpan commented 7 years ago

See http://logs.glob.uno/?c=freenode%23whatwg&s=5+Dec+2016&e=5+Dec+2016#c1013996 for discussion.

hsivonen commented 7 years ago

The discussion link doesn't show any discussion, at least not for me.

Despite being unhappy about it, I can see the rationale for making it conforming to have a double hyphen in a comment. However, removing the ability to report the XML incompatibility as a non-error seems not OK. While XML is no longer cool, the body of XML processing infrastructure is still vast.

I don't think we should optimize for authors being able to see message-free results at the expense of suppressing information that still has relevance in some cases. (The same goes for the removal of the information about putting IE in the quirks mode, too, IMO.)
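
For readers unfamiliar with the XML incompatibility being discussed: XML 1.0 does not allow two consecutive hyphens inside a comment, so a comment that an HTML parser accepts can be rejected by XML tooling. The following is a minimal sketch using only the JDK's built-in XML parser. The class and method names (`XmlCommentCheck`, `isWellFormedXml`) are made up for illustration and are not part of htmlparser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not part of htmlparser: asks the JDK's XML parser
// whether a string is well-formed XML.
final class XmlCommentCheck {
    static boolean isWellFormedXml(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness errors, such as "--" inside a comment, end up here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormedXml("<r><!-- a - b --></r>"));
        System.out.println(isWellFormedXml("<r><!-- a -- b --></r>"));
    }
}
```

The second input differs from the first only by the extra hyphen in the comment, which is the situation the warning describes.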

hsivonen commented 7 years ago

> (The same goes for the removal of the information about putting IE in the quirks mode, too, IMO.)

Well, maybe less so for the IE thing, since the relevant IE versions are at 1.5% global usage share according to StatCounter.

zcorpan commented 7 years ago

Sorry, fixed the link.

hsivonen commented 7 years ago

The discussion cites the level of usage of the XML syntax on the Web, but the warning is not about the XML serialization. It's about using app-internal representation that has been designed for XML with the text/html serialization.

zcorpan commented 7 years ago

Yeah. The only indication about such usage I know of is

10:05 MikeSmith as maybe does the “Doctype with「SYSTEM "about:legacy-compat"」found” figure at validator.w3.org/nu/stats.html
10:05 MikeSmith which is 0.06%

which likely both includes cargo-cult use of that doctype without XML tools and misses cases of people who use XML tools with text/html serialization but do not use that doctype.

In any case, I think these warnings are useful when working with XML tools, but useless for everybody else. If one works with XML tools, why should the warnings be in the conformance checker, as opposed to in the tools themselves?

When parsing HTML into an XML-centric system, it is often better to use ALTER_INFOSET, which still has the warnings. "ALLOW" I think indicates that one is not affected by or does not care about XML violations.
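
To make the proposed split concrete, here is a minimal, self-contained sketch of how the three policies could treat a comment containing a double hyphen. It is not the parser's actual code: `CommentPolicySketch`, its nested `Policy` enum, the message text, and the helper methods are all hypothetical, and the `--` to `- -` rewrite under ALTER_INFOSET is an assumption made for illustration. The ALLOW branch models the behavior this issue proposes (no warning), not necessarily what any released version does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the policies discussed above; not htmlparser's code.
final class CommentPolicySketch {
    enum Policy { ALLOW, ALTER_INFOSET, FATAL }

    // Illustrative message text (assumption).
    static final String MESSAGE =
            "Not mappable to XML 1.0: two consecutive hyphens in a comment.";

    /** Returns the comment text to store; appends any warning to the list. */
    static String handleComment(String text, Policy policy, List<String> warnings) {
        if (!text.contains("--")) {
            return text; // Nothing XML-incompatible to report.
        }
        switch (policy) {
            case ALLOW:
                // Proposed behavior: the caller does not care about XML
                // violations, so the text passes through with no warning.
                return text;
            case ALTER_INFOSET:
                // Still warn, then change the text so it fits XML 1.0.
                warnings.add(MESSAGE);
                return breakHyphenRuns(text);
            default:
                // FATAL: refuse to continue.
                throw new IllegalStateException(MESSAGE);
        }
    }

    /** Inserts a space between adjacent hyphens (assumed rewrite). */
    static String breakHyphenRuns(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '-' && sb.length() > 0 && sb.charAt(sb.length() - 1) == '-') {
                sb.append(' ');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        System.out.println(handleComment("a--b", Policy.ALLOW, warnings));
        System.out.println(handleComment("a--b", Policy.ALTER_INFOSET, warnings));
        System.out.println(warnings.size());
    }
}
```

In this sketch an XML-centric consumer would pick `ALTER_INFOSET` and keep receiving the warning, while a consumer that picks `ALLOW` gets the original text and an empty warning list.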

zcorpan commented 3 years ago

@sideshowbarker did you mean to close this?

sideshowbarker commented 3 years ago

> @sideshowbarker did you mean to close this?

It seems it got closed automatically when I deleted the validator-nu branch — because for some reason it ended up only on the validator-nu branch, and not in any other PR branch.

https://github.com/zcorpan/htmlparser/commit/731883831e181e260a1106f4586a0c541e4b49b1?w=1 is the specific change. I think it probably needs review from @hsivonen. It may be that we discussed it already, but at this point, I’ve forgotten.