taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Configuration option to disable auto-closing tags #246

Closed (wcauchois closed this issue 1 year ago)

wcauchois commented 1 year ago

Would it be possible to add a configuration option to disable the "auto closing tags" functionality that's provided for HTML4 compatibility?

For example, in the current version of the library, the HTML

<hr>Hello, world</hr>

parses as an hr node followed by a text node. I would like it to be parsed as a text node nested inside the hr node.

Maybe we just have to change something here?

https://github.com/taoqf/node-html-parser/blob/d3980c5fb7744d6fda4b270619bb581a19a4cb18/src/nodes/html.ts#L1151

taoqf commented 1 year ago

Removing lines 924 and 925 will do:

    // hr: true,
    // HR: true,
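
To make the difference between the two parse results concrete, here is a minimal, self-contained sketch. It is not node-html-parser's code: `parseSketch`, `DEFAULT_VOID_TAGS`, and the node shapes are hypothetical names made up for illustration. It only assumes that a parser keeps a set of void (auto-closed) tags, and shows how dropping hr from that set changes the tree.

```typescript
// Hypothetical sketch, NOT node-html-parser's implementation.
type TextNode = { kind: "text"; text: string };
type ElementNode = { kind: "element"; tag: string; children: TreeNode[] };
type TreeNode = TextNode | ElementNode;

// Assumed default set of void (auto-closed) tags; hr is included,
// mirroring the behaviour described in the issue.
const DEFAULT_VOID_TAGS: ReadonlySet<string> = new Set(["br", "hr", "img", "input"]);

function parseSketch(
  html: string,
  voidTags: ReadonlySet<string> = DEFAULT_VOID_TAGS
): TreeNode[] {
  const root: ElementNode = { kind: "element", tag: "#root", children: [] };
  const stack: ElementNode[] = [root];
  // Matches an opening tag, a closing tag, or a run of text.
  const token = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>|([^<]+)/g;
  let m: RegExpExecArray | null;
  while ((m = token.exec(html)) !== null) {
    const parent = stack[stack.length - 1];
    if (m[3] !== undefined) {
      parent.children.push({ kind: "text", text: m[3] });
      continue;
    }
    const tag = m[2].toLowerCase();
    if (m[1] === "/") {
      // Closing tag: pop only if it matches the open element, otherwise ignore it.
      if (stack.length > 1 && parent.tag === tag) stack.pop();
      continue;
    }
    const el: ElementNode = { kind: "element", tag, children: [] };
    parent.children.push(el);
    // Void tags never become the current parent, so what follows is a sibling.
    if (!voidTags.has(tag)) stack.push(el);
  }
  return root.children;
}

// With hr treated as void: an empty hr element followed by a text node.
const withAutoClose = parseSketch("<hr>Hello, world</hr>");

// With hr removed from the void set: the text node is nested inside hr.
const withoutAutoClose = parseSketch(
  "<hr>Hello, world</hr>",
  new Set(["br", "img", "input"])
);
```

In this sketch the void set is a parameter, which is roughly what the requested configuration option would amount to; commenting out the hr entries as suggested above has the same effect, but for every caller.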