Closed galenhuntington closed 1 year ago
Thank you for the report.
It's an htmlparser2 issue. I opened the issue upstream: https://github.com/fb55/htmlparser2/issues/1426
For now, it should be possible to set decodeEntities to false and deal with entities afterwards.
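A minimal sketch of the "deal with entities afterwards" step, assuming the text was produced with decodeEntities set to false. The helper name decodeBasicEntities and its small entity table are hypothetical and not part of html-to-text. It covers only a handful of named entities plus numeric references, so a full entity decoder may be a better fit for real input.

```javascript
// Hypothetical helper (not part of html-to-text): decodes a few named
// entities and numeric character references in already-converted text.
const NAMED_ENTITIES = {
  amp: '&',
  lt: '<',
  gt: '>',
  quot: '"',
  apos: "'",
  nbsp: '\u00A0',
};

function decodeBasicEntities(text) {
  // A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
  return text.replace(
    /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g,
    (match, body) => {
      if (body[0] === '#') {
        const isHex = body[1] === 'x' || body[1] === 'X';
        const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
        // Leave out-of-range references untouched rather than throwing.
        return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
      }
      // Unknown names such as "&blah;" are kept as-is.
      return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
        ? NAMED_ENTITIES[body]
        : match;
    }
  );
}

// Assumed usage with html-to-text (requires the package to be installed):
//   const { convert } = require('html-to-text');
//   const text = decodeBasicEntities(convert(html, { decodeEntities: false }));
```

Decoding after conversion keeps the entity handling out of the parser, which is where the problem occurs.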
html-to-text before version 9.0.0 should also be unaffected.
The upstream issue is fixed in htmlparser2 8.0.2.
And I published html-to-text 9.0.5 with updated dependencies.
Minimal HTML example
Observed output
Expected output
Version information
The problem seems to arise with either a <script> or <style> block, followed by text with an entity (&blah;). HTML afterwards gets copied over literally without any further text conversion.