Closed: georgecrawford closed this issue 3 years ago
Aah, I've just found https://github.com/apostrophecms/sanitize-html/issues/79 and https://github.com/fb55/htmlparser2/issues/156#issuecomment-590067942, which together led me to upgrade `sanitize-html`, and this is now fixed. Sorry not to have tested with the latest version!
Hi,
Forgive my ignorance, but I'm using this library (via `sanitize-html`, but that's irrelevant) and I was surprised to see that a string of `<1 people affected` was 'sanitized' to an empty string. The issue lies in htmlparser2 as far as I can see, in that no events are fired except `onend` when parsing this string, so no text can be captured.

If I set the HTML of a document to `<1 people affected`, browsers treat it as invalid HTML and display the same string as plain text. What is the expected behaviour of htmlparser2 in this case? If it's not designed to work with invalid HTML, is there a way that I can determine that the HTML is invalid, or in some other way reproduce what browsers tend to do?

Since this is not a dangerous string to display in a browser, I would like `sanitize-html` to return the original string with no changes, but it can't do that if htmlparser2 doesn't call any events.

I'm very willing to admit that I've missed the point of one or more of these libraries, so please feel free to help me understand!