taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Broken HTML isn't parsed correctly #241

Closed by bennbollay 1 year ago

bennbollay commented 1 year ago

node-html-parser v6.1.5, Node v18.12.1

This may be a Works As Designed ticket, in which case feel free to close as appropriate.

The following HTML does not parse correctly:

const nhp = require('node-html-parser');

// The <div> is never closed before </body>.
console.log(nhp.parse('<html><body id="main"><div></body></html>').toString());
// Output: '<html></html>'
// Expected: '<html><body id="main"><div></body></html>'

nonara commented 1 year ago

Thanks for the report. This is working as designed: it is a tradeoff for speed.

Per the readme:

For this reason, some malformatted HTML may not be able to parse correctly, but most usual errors are covered (eg. HTML4 style no closing <li>, <td> etc).
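
Since the parser does not try to recover from every kind of broken markup, one option is to check the input for unclosed tags before parsing it. The following is only a minimal sketch of that idea, not something the library provides: `findUnclosedTags` and `VOID_TAGS` are hypothetical names introduced here, and the regex-based scan assumes no `>` characters inside attribute values, comments, or script bodies.

```javascript
// Hypothetical pre-check, NOT part of node-html-parser.
// Elements that never take a closing tag.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr',
]);

// Returns the names of tags that were opened but never closed.
function findUnclosedTags(html) {
  // Groups: leading "/", tag name, trailing "/" (self-closing).
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9-]*)\b[^>]*?(\/?)>/g;
  const open = [];     // stack of currently open tag names
  const unclosed = []; // tags skipped over by a later closing tag
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const [, closing, rawName, selfClosing] = match;
    const name = rawName.toLowerCase();
    if (VOID_TAGS.has(name) || selfClosing) continue;
    if (!closing) {
      open.push(name);
      continue;
    }
    const idx = open.lastIndexOf(name);
    if (idx === -1) continue; // stray closing tag, ignored in this sketch
    // Everything opened after the matching tag was never closed.
    unclosed.push(...open.splice(idx).slice(1));
  }
  // Whatever is still on the stack at the end was never closed either.
  return unclosed.concat(open);
}

console.log(findUnclosedTags('<html><body id="main"><div></body></html>'));
// [ 'div' ]
```

A non-empty result could be used to reject the input or to route it to a slower, more forgiving parser. Note that this sketch would also flag the HTML4-style omitted closing tags that the readme says the parser already handles.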