taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Kind of HTML errors that break the library #105

Closed: ghost closed this issue 3 years ago

ghost commented 3 years ago

The README mentions that malformed HTML might break the library. Can you give examples of HTML that might cause issues?

taoqf commented 3 years ago

You may need this: https://github.com/taoqf/node-html-parser/discussions/102

ghost commented 3 years ago

@taoqf I was hoping for an example of an HTML file that would break it.

taoqf commented 3 years ago

I'm sorry, I don't have much time to spend on this.

nonara commented 3 years ago

Hi! Just wanted to add a quick answer on this.

There is currently one type of error that can cause improper output: malformed HTML in which tags are not closed.

For example:

<tagA>
<tagB>
<tagC>
<tagD>
</tagB>
</tagA>

In the example above, tagC and tagD are never closed. Not properly closing a tag like this can cause issues in the parse output. This is something that I hope to address in the near future.
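
Until that is addressed, one possible workaround is to check input for unclosed tags before handing it to the parser. The following is only a minimal sketch: `findUnclosedTags` and `VOID_ELEMENTS` are hypothetical names, not part of node-html-parser, and the regex-based scan assumes there are no tags inside comments, scripts, or attribute values.

```typescript
// Hypothetical helper (not part of node-html-parser): reports tag names
// that are opened but never closed, using a simple stack of open tags.

// Elements that never take a closing tag, so they are not tracked.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img", "input",
  "link", "meta", "source", "track", "wbr",
]);

function findUnclosedTags(html: string): string[] {
  // Assumption: a naive tag pattern is good enough for a quick check.
  const tagPattern = /<(\/?)([a-zA-Z][\w-]*)([^>]*)>/g;
  const open: string[] = [];      // stack of currently open tag names
  const unclosed: string[] = [];  // tags found to be missing a close
  let match: RegExpExecArray | null;

  while ((match = tagPattern.exec(html)) !== null) {
    const isClosing = match[1] === "/";
    const name = match[2].toLowerCase();
    const selfClosing = /\/\s*$/.test(match[3]);

    if (!isClosing) {
      if (!selfClosing && !VOID_ELEMENTS.has(name)) open.push(name);
      continue;
    }

    const idx = open.lastIndexOf(name);
    if (idx === -1) continue; // stray closing tag: ignored in this sketch

    // Everything opened after the matching tag was never closed.
    unclosed.push(...open.splice(idx + 1));
    open.pop(); // remove the matching open tag itself
  }

  // Tags still open at the end of the input are unclosed as well.
  unclosed.push(...open);
  return unclosed;
}

// Usage with the example from this thread.
const sample = "<tagA>\n<tagB>\n<tagC>\n<tagD>\n</tagB>\n</tagA>";
console.log(findUnclosedTags(sample)); // logs the two unclosed names: tagc, tagd
```

Names are lowercased for comparison because HTML tag names are case-insensitive. An empty result does not guarantee that the library will parse the input as expected; it only rules out the unclosed-tag case described above.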