taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License

node.d.ts in npm is old or broken? #210

Closed: IAkumaI closed this issue 2 years ago

IAkumaI commented 2 years ago
import {parse} from 'node-html-parser';

const root = parse('<div class="item">Some text</div>');

root.getAttribute('class'); // undefined. Is root some kind of virtual wrapper node? OK.
root.firstChild.getAttribute('class'); // Type error: Node does not have a getAttribute method
root.querySelector('.item').getAttribute('class'); // Works, but I have to know the selector. This is not a solution.

Hi. How can I get an attribute from the first root element? I found https://github.com/taoqf/node-html-parser/blob/main/test/tests/quoteattributes.js#L26 in the tests, but root.firstChild is a Node, so it does not have a getAttribute method.

I am using version 5.3.3 (the latest from npm).

What am I doing wrong?

IAkumaI commented 2 years ago

Oh, I see. getAttribute does exist on the node at runtime, but TypeScript thinks it doesn't, because of node.d.ts:

export default abstract class Node {
    parentNode: HTMLElement;
    abstract nodeType: NodeType;
    childNodes: Node[];
    range: readonly [number, number];
    abstract text: string;
    abstract rawText: string;
    abstract toString(): string;
    abstract clone(): Node;
    constructor(parentNode?: HTMLElement, range?: [number, number]);
    /**
     * Remove current node
     */
    remove(): this;
    get innerText(): string;
    get textContent(): string;
    set textContent(val: string);
}

Is this a bug, an old version, or something else?

IAkumaI commented 2 years ago

Well, it's just because childNodes can contain either a Node or an HTMLElement, and there is no way to know in advance which one you get. So this is not an issue. Closed.

taoqf commented 2 years ago

By the way,

const root = parse('<div class="item">Some text</div>');
(root.firstChild as HTMLElement).getAttribute('class');  // should be `item`

should work.

IAkumaI commented 2 years ago

Yeah, it works. It's just TypeScript showing the error "getAttribute does not exist on Node" unless you cast to HTMLElement.

But in production you have to check that firstChild is a real HTMLElement and not a text node.
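The check described above can be done with an `instanceof` test instead of an unchecked cast. The following is a minimal sketch, not the library's code: `ParsedNode`, `ParsedElement`, `ParsedText`, and `firstChildAttribute` are hypothetical stand-ins that mimic the Node / HTMLElement / text-node split discussed in this thread, so the example runs without the package installed. With the real package, the same pattern would presumably use the `HTMLElement` class exported by node-html-parser.

```typescript
// Hypothetical stand-ins for the library's Node, HTMLElement and text node classes.
abstract class ParsedNode {
  childNodes: ParsedNode[] = [];

  get firstChild(): ParsedNode | undefined {
    return this.childNodes[0];
  }
}

class ParsedText extends ParsedNode {
  rawText: string;

  constructor(rawText: string) {
    super();
    this.rawText = rawText;
  }
}

class ParsedElement extends ParsedNode {
  private attrs: Record<string, string>;

  constructor(attrs: Record<string, string> = {}) {
    super();
    this.attrs = attrs;
  }

  getAttribute(name: string): string | undefined {
    return this.attrs[name];
  }
}

// Read an attribute from the first child only if that child is really an element.
function firstChildAttribute(parent: ParsedNode, name: string): string | undefined {
  const child = parent.firstChild;
  // instanceof narrows `child` from ParsedNode to ParsedElement inside this branch,
  // so getAttribute type-checks without a cast.
  if (child instanceof ParsedElement) {
    return child.getAttribute(name);
  }
  return undefined; // text node, or no child at all
}

// Hand-built tree equivalent to: <div class="item">Some text</div> under a root wrapper.
const root = new ParsedElement();
const item = new ParsedElement({ class: 'item' });
item.childNodes.push(new ParsedText('Some text'));
root.childNodes.push(item);

const fromRoot = firstChildAttribute(root, 'class'); // first child is the div
const fromItem = firstChildAttribute(item, 'class'); // first child is a text node
```

Unlike `as HTMLElement`, the guard is checked at runtime, so a leading text node yields `undefined` instead of a crash on a missing method.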

taoqf commented 2 years ago

Indeed.

But in production you have to check that firstChild is a real HTMLElement and not a text node.

benwoodward commented 1 year ago

Well, it's just because childNodes can contain either a Node or an HTMLElement, and there is no way to know in advance which one you get. So this is not an issue. Closed.

If childNodes can be either Node or HTMLElement, does that not mean the typing for .firstChild (and others) is incorrect?

    /**
     * Get first child node
     * @return {Node} first child node <- should this not be `Node | HTMLElement`?
     */
    public get firstChild() {
        return this.childNodes[0];
    }

https://github.com/taoqf/node-html-parser/blob/a439a96d3b7e934f13e5795f5d16ad4a2a10da3c/src/nodes/html.ts#L641
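On the typing question: assuming HTMLElement is a subclass of Node, a return type of `Node | HTMLElement` would not by itself make `getAttribute` callable, because TypeScript only allows access to members that every member of a union has. The caller still needs to narrow. A rough sketch of one way to do that with a user-defined type guard follows; `SketchNode`, `SketchElement`, `SketchText`, `isElement`, and `firstElementOf` are all hypothetical names for illustration, and the numeric node types follow the DOM convention (1 for elements, 3 for text), which is an assumption about the library's values.

```typescript
// Hypothetical, simplified model of the class hierarchy discussed in this thread.
const ELEMENT_NODE = 1; // DOM convention, assumed here
const TEXT_NODE = 3;    // DOM convention, assumed here

abstract class SketchNode {
  abstract nodeType: number;
  childNodes: SketchNode[] = [];
}

class SketchText extends SketchNode {
  nodeType = TEXT_NODE;
  rawText: string;

  constructor(rawText: string) {
    super();
    this.rawText = rawText;
  }
}

class SketchElement extends SketchNode {
  nodeType = ELEMENT_NODE;
  attrs: Record<string, string>;

  constructor(attrs: Record<string, string> = {}) {
    super();
    this.attrs = attrs;
  }

  getAttribute(name: string): string | undefined {
    return this.attrs[name];
  }
}

// Type predicate: tells the compiler that a true result means `node` is an element.
function isElement(node: SketchNode): node is SketchElement {
  return node.nodeType === ELEMENT_NODE;
}

// Return the first child that is an element, skipping text nodes.
function firstElementOf(parent: SketchNode): SketchElement | undefined {
  return parent.childNodes.find(isElement);
}

// Tree equivalent to a root whose children are whitespace text followed by a div.
const leadingText = new SketchText('\n  ');
const div = new SketchElement({ class: 'item' });
const wrapper = new SketchElement();
wrapper.childNodes.push(leadingText, div);

const firstElement = firstElementOf(wrapper);
const cls = firstElement ? firstElement.getAttribute('class') : undefined;
```

A helper like this also covers the case where the markup starts with whitespace, in which the first child is a text node and the element comes second.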