capricorn86 / happy-dom

A JavaScript implementation of a web browser without its graphical user interface
MIT License

Constructor and attribute mismatch while parsing HTML #1386

Open lukasbash opened 3 months ago

Describe the bug

Given the following code for parsing a paragraph element:

import { Window, DOMParser } from "happy-dom";

function tryHappyDom() {
  console.log("\nTrying to parse HTML with HappyDOM");

  const window = new Window();
  const html = new DOMParser(window).parseFromString(
    '<html><body><p align="center">test</p></body></html>',
    "text/html"
  );
  const p = html.querySelector("body")!.childNodes[0];

  console.log("Element constructor name: ", p.constructor.name);
  console.log("Element align: ", p.align);
  console.log("Element align via getAttribute: ", p.getAttribute("align"));
}

tryHappyDom();

The output becomes:

Trying to parse HTML with HappyDOM
Element constructor name:  HTMLElement
Element align:  undefined
Element align via getAttribute:  center

Parsing the same markup with JSDOM gives notably different results. Given the script:

import { JSDOM } from "jsdom";

function tryJSDOM() {
  console.log("\nTrying to parse HTML with JSDOM");

  const jsd = new JSDOM('<html><body><p align="center">test</p></body></html>');
  const p = jsd.window.document.querySelector("body")!.childNodes[0];

  console.log("Element constructor name: ", p.constructor.name);
  console.log("Element align: ", p.align);
  console.log("Element align via getAttribute: ", p.getAttribute("align"));
}

tryJSDOM();

The output becomes:

Trying to parse HTML with JSDOM
Element constructor name:  HTMLParagraphElement
Element align:  center
Element align via getAttribute:  center
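
To compare the two libraries side by side, the three checks can be factored into one helper that accepts an element from either of them. This is only a sketch: ElementLike, ElementReport and describeElement are names made up here and are not part of happy-dom or JSDOM. The helper relies on nothing beyond constructor.name, a plain property lookup, and getAttribute.

```typescript
// Hypothetical structural type: the minimum this helper needs from an element.
interface ElementLike {
  getAttribute(name: string): string | null;
}

// Hypothetical report shape holding the three values logged in the snippets above.
interface ElementReport {
  constructorName: string;
  property: unknown;
  attribute: string | null;
}

// Collects constructor name, instance property, and content attribute for one name.
function describeElement(el: ElementLike, name: string): ElementReport {
  return {
    constructorName: el.constructor.name,
    // Plain property lookup; stays undefined when the library does not expose it.
    property: (el as unknown as Record<string, unknown>)[name],
    attribute: el.getAttribute(name),
  };
}
```

Calling describeElement(p, "align") on the node from each snippet should make the mismatch visible as a difference in the constructorName and property fields, while attribute is the same in both.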

To Reproduce

Steps to reproduce the behavior:

  1. Run both snippets above with the Bun runtime and compare their output.

Expected behavior

The constructor should match the element that was actually parsed (here HTMLParagraphElement rather than the generic HTMLElement). Attributes that specific element interfaces expose as properties according to MDN, even deprecated ones such as align, should be readable as properties on the instance and not only through getAttribute.
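
As a stopgap on the caller's side, a fallback along the following lines might bridge the gap. It is a sketch, not a fix for the library: AttributeSource and readReflected are hypothetical names, and the helper assumes the property has the same name as the content attribute and holds a string. That is true for align on a paragraph, but not for every attribute (className, for example, reflects class).

```typescript
// Hypothetical structural type: anything that can look up a content attribute.
interface AttributeSource {
  getAttribute(name: string): string | null;
}

// Reads a string property if the element exposes it, otherwise falls back to
// the content attribute. Returns "" when neither is present.
function readReflected(el: AttributeSource, name: string): string {
  const own = (el as unknown as Record<string, unknown>)[name];
  if (typeof own === "string") {
    return own;
  }
  return el.getAttribute(name) ?? "";
}
```

With this, readReflected(p, "align") would return the attribute value regardless of whether the element exposes an align property.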

Additional context

I am aware that the two libraries are not comparable in every respect, but the missing specific constructors and the omitted properties might be a showstopper for me. Is there another way of parsing the string so that everything works the way I expect? If not, it might be worth a note in the docs.