remarkablemark / html-dom-parser

📝 HTML to DOM parser.
https://b.remarkabl.org/html-dom-parser
MIT License

Add Deno support #181

Open Stephan-C opened 2 years ago

Stephan-C commented 2 years ago

Expected Behavior

On Deno, html-dom-parser should use a server-side DOM parser implementation, as it does on Node.js, with guards in place for the case where `document` is not defined.

Actual Behavior

It throws an exception saying that document.implementation doesn't exist, because no document is defined on Deno.
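To illustrate the kind of guard described under Expected Behavior, here is a minimal sketch. The names `hasDocument` and `selectParser` are hypothetical and are not part of html-dom-parser. The sketch only assumes that the client parser needs `document.implementation` and that the server parser does not.

```javascript
// Hypothetical helper, not part of html-dom-parser: reports whether a
// browser-style DOM (document.implementation) exists on the given global.
function hasDocument(globalObject = globalThis) {
  const doc = globalObject.document;
  return (
    typeof doc === 'object' &&
    doc !== null &&
    typeof doc.implementation === 'object' &&
    doc.implementation !== null
  );
}

// Hypothetical dispatcher: choose a parser by runtime capability instead of
// relying on which bundle the CDN happened to serve.
function selectParser(parsers, globalObject = globalThis) {
  return hasDocument(globalObject) ? parsers.client : parsers.server;
}
```

With a guard like this, a runtime without `document` (such as Deno) would fall through to the server parser instead of throwing.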

Steps to Reproduce

Import html-dom-parser on Deno. In my case, I used a library that depends on html-react-parser, which uses html-dom-parser underneath.

Reproducible Demo

Save this to a file like test.js and then run it on deno: deno run test.js

import parse from 'https://cdn.skypack.dev/html-dom-parser';

console.log(parse('<p class="foo" style="color: #bada55">Hello, <em>world</em>!</p>'));

remarkablemark commented 2 years ago

@Stephan-C html-dom-parser should be loading htmlparser2 on the server-side, which doesn't use document.

Does Deno end up using the browser field from package.json?

Stephan-C commented 2 years ago

Interesting. Yeah, I guess it must be loading the browser field, which might be caused by how the package is imported into Deno. I have tried Skypack, esm.sh and jspm.

Stephan-C commented 2 years ago

I did some more investigation. It looks like Rollup treats the browser build as the default and the Node build as a separate target, which would explain why Deno loads the browser version. The only reference I could find to something similar is https://github.com/rollup/rollup/pull/3634#issuecomment-643811723
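The behavior described above can be sketched roughly as follows. This is only an illustration of "browser entry by default, Node entry on request", not how Rollup or any CDN actually resolves packages. The `manifest` object, its file paths, and `resolveEntry` are all hypothetical.

```javascript
// Hypothetical manifest: the paths are placeholders, not the real files
// shipped by html-dom-parser.
const manifest = {
  main: './server.js',
  browser: './client.js',
};

// Sketch of a bundler-like resolver: the browser entry wins unless the
// caller explicitly asks for a Node target.
function resolveEntry(pkg, target = 'browser') {
  if (target === 'node') {
    return pkg.main;
  }
  // Fall back to main when no browser entry is declared.
  return pkg.browser || pkg.main;
}
```

Under this model, a Deno import that does not state a target ends up with the browser entry, which matches the error reported here.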

I found a partial workaround, but I don't know how to apply it only to html-dom-parser when it is pulled in as a dependency of another library. esm.sh allows you to set a target. This works in Deno:

import parse from 'https://esm.sh/html-dom-parser?target=node';

console.log(parse('<p class="foo" style="color: #bada55">Hello, <em>world</em>!</p>'));

So the real question is: how can we configure Rollup to serve the server version, rather than the browser version, for Deno targets?

remarkablemark commented 2 years ago

I was able to reproduce the error in https://replit.com/@remarkablemark/html-dom-parser-181#index.ts

I believe a potential fix would be to add ES Module support for this package. See https://unpkg.com/html-dom-parser@1.0.4?module:

Package html-dom-parser@1.0.4 does not contain an ES module

The approach will be similar to:

remarkablemark commented 2 years ago

Added ESM support for html-dom-parser in https://github.com/remarkablemark/html-dom-parser/pull/203#issuecomment-1030494326

remarkablemark commented 2 years ago

Upgraded html-dom-parser in https://github.com/remarkablemark/html-react-parser/pull/444#issuecomment-1030529360

Stephan-C commented 2 years ago

Nice. Unfortunately, the new 1.1.0 version still loads the browser version in Deno.

remarkablemark commented 2 years ago

Got it, then I'm not exactly sure what can be done to resolve this.