taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

querySelectorAll does not work as expected #259

Open PacoVK opened 10 months ago

PacoVK commented 10 months ago

Description

Given the input

<div class="sect2">
    <ul>
        <li>
            <p>Update </p>
            <ul>
                <li>
                    <p>login with credentials</p></li>
                <li>
                    <p>do something</p>
                    <p>
                    <div class="content">
                   <pre class="highlightjs highlight">
                       <code class="language-bash hljs" data-lang="bash">systemctl stop service
            systemctl start service</code>
                   </pre>
                    </div></p></li>
            </ul></li>
    </ul>
</div>

When I call

const content = parse(inputGivenAbove)
const codeElements = content.querySelectorAll('pre > code') 

I would expect

codeElements.length === 1

but it actually is

codeElements.length === 0

Context

Node 21.2.0

"devDependencies": {
  "@jest/globals": "^29.7.0",
  "@types/node": "^20.10.0",
  "jest": "^29.7.0",
  "ts-jest": "^29.1.1",
  "ts-node": "^10.9.1",
  "typescript": "^5.3.2"
},
"dependencies": {
  "node-html-parser": "^6.1.11"
}

PacoVK commented 10 months ago

It seems the parser needs further configuration here. After using

const content = parse(inputGivenAbove, {blockTextElements: {code: true}})
const codeElements = content.querySelectorAll('pre > code') 

the output is now as expected: codeElements.length === 1.

JeffML commented 1 month ago

const content = parse(inputGivenAbove, {blockTextElements: {code: true}})
const codeElements = content.querySelectorAll('pre > code')

The option {code: true} does not appear in the README.