I'm not sure if this is a bug or working as intended according to the HTML5 parsing
algorithm, but it seems at least problematic from a user's perspective.
When parsing an HTML document that contains <script> tags, writing out the tokens
received double-escapes any entities they contain, so <script> tags don't
round-trip through the tokenizer. See the attached patch, which adds two tests:
<script>&quot;</script>, which leads to &amp;quot; as the contents, and
<script>"</script>, which leads to &#34;.
That means re-parsing the output of tokenization adds more and more double escaping.
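To illustrate the compounding effect, here is a minimal sketch that models the round trip with the standard library only. It does not call go.net/html itself. It assumes that parsing hands back the <script> text untouched and that writing the token escapes it the way html.EscapeString does. The helper name reserialize is hypothetical.

```go
package main

import (
	"fmt"
	"html"
)

// reserialize models n parse-and-write cycles over the text inside a
// <script> element. Assumption: parsing leaves the raw text untouched,
// and writing the token escapes it the way html.EscapeString does.
func reserialize(scriptText string, n int) string {
	for i := 0; i < n; i++ {
		scriptText = html.EscapeString(scriptText)
	}
	return scriptText
}

func main() {
	// The two inputs mirror the two test cases described in this report.
	for _, in := range []string{`&quot;`, `"`} {
		for n := 0; n <= 3; n++ {
			fmt.Printf("%d cycles: %s\n", n, reserialize(in, n))
		}
	}
}
```

Under that assumption, one cycle turns &quot; into &amp;quot; and a bare " into &#34;, and every further cycle prepends another amp; after the leading ampersand.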
There is a test for <style> just below the one I added that makes this look
intentional. But this is a real problem: using go.net/html to parse and re-serialize
documents breaks the documents.
by martin@probst.io: