taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

Unordered lists don't parse correctly #154

Closed: santsys closed this issue 3 years ago

santsys commented 3 years ago

Given the HTML:

<ul><li><p>Before Strong <strong>Test 1:<strong> After Strong</p><ul><li>test 2</li></ul></li></ul>

the parsed output is:

<ul>Before Strong <strong>Test 1:</strong> After Strong<ul><li>test 2</li></ul></ul>

I would expect the li elements to be properly parsed and included...

Sample code:

const parser = require('node-html-parser')
const html = '<ul><li><p>Before Strong <strong>Test 1:<strong> After Strong</p><ul><li>test 2</li></ul></li></ul>'
const parsedContent = parser.parse(html, {})
console.log(parsedContent.toString())

nonara commented 3 years ago

Looks like your HTML is invalid: the <strong> element isn't closed (the second tag is <strong> where </strong> was intended). Please try with valid HTML and let us know if it's still not working.
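
An unclosed tag like this can be spotted before the markup is handed to the parser. The following is only a rough sketch, not part of node-html-parser: findUnclosedTags and VOID_TAGS are hypothetical names, and the check is a simple stack walk that ignores the implied end tags valid HTML allows (for example an omitted </li>) and assumes attribute values contain no ">" character.

```javascript
// Hypothetical helper, not part of node-html-parser.
// Void elements never take a closing tag, so they are skipped.
const VOID_TAGS = new Set([
  'area', 'base', 'br', 'col', 'embed', 'hr', 'img',
  'input', 'link', 'meta', 'source', 'track', 'wbr'
]);

// Returns the names of tags that were opened but never closed.
// Assumption: a plain regex scan is good enough for a quick sanity check.
function findUnclosedTags(html) {
  const stack = [];
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/g;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const [, closing, rawName, selfClosing] = match;
    const name = rawName.toLowerCase();
    if (VOID_TAGS.has(name) || selfClosing) continue;
    if (!closing) {
      stack.push(name);
    } else {
      // Close only the nearest matching open tag; anything opened
      // after it stays on the stack and is reported as unclosed.
      const index = stack.lastIndexOf(name);
      if (index !== -1) stack.splice(index, 1);
    }
  }
  return stack;
}

const html = '<ul><li><p>Before Strong <strong>Test 1:<strong> After Strong</p><ul><li>test 2</li></ul></li></ul>';
console.log(findUnclosedTags(html)); // [ 'strong', 'strong' ]
```

With the second <strong> changed to </strong>, the same call returns an empty array, which suggests the markup is balanced enough to retry the parse.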

santsys commented 3 years ago

@nonara - :( - Good catch! My mistake there!