taoqf / node-html-parser

A very fast HTML parser, generating a simplified DOM, with basic element query support.
MIT License
1.12k stars 112 forks source link

fix: support esbuild #166

Closed lisonge closed 3 years ago

lisonge commented 3 years ago

fix the following bug

npm i esbuild node-html-parser typescript
echo 'import { parse, HTMLElement } from "node-html-parser";console.log({ parse, HTMLElement });' > test.ts
npx esbuild test.ts --bundle --outfile=test.js
node test.js

will be output { parse:undefined, HTMLElement:undefined }

nonara commented 3 years ago

Thanks for the submission!

In order to use as esm your package.json file needs to have "type": "module" set. Please let me know if that solves the issue.

lisonge commented 3 years ago

your packge default export is https://github.com/taoqf/node-html-parser/blob/main/src/index.ts#L3 is a function

export { default as parse, default } from './parse';

but in https://github.com/taoqf/node-html-parser/blob/main/esm/index.js#L1

import nhp from '../dist/index.js'

and at dist/index.js#L5

Object.defineProperty(exports, "__esModule", { value: true });

and at #L13

Object.defineProperty(exports, "default", { enumerable: true, get: function () { return __importDefault(parse_1).default; } });

so esbuild will use module.exports.default as import nhp from '../dist/index.js'

so nhp is parse function, not module.exports

lisonge commented 3 years ago

Thanks for the submission!

In order to use as esm your package.json file needs to have "type": "module" set. Please let me know if that solves the issue.

I need run node-html-parser in cloudflare workers, without package.json, it is not node runtime so I use esbuild pack my code to a single file. currently I fix this with patches in my project @nonara

nonara commented 3 years ago

your packge default export is https://github.com/taoqf/node-html-parser/blob/main/src/index.ts#L3 is a function

The issue is that package exports are handled differently between ESM and CommonJS.

import nhp from 'node-html-parser' in CommonJS, means import module.exports.default and name it nhp.

In ESM, the same statement means: import the entire module namespace and name it nhp.

Our structure for ESM support follows the proper convention and is not buggy. You can see more detail in this PR:

I need run node-html-parser in cloudflare workers, without package.json

It looks like you don't actually need to be using ESM. If you build to commonJS, you should be fine.

I tried your exact reproduction, and I don't actually experience any issue:

npm i esbuild node-html-parser typescript
echo 'import { parse, HTMLElement } from "node-html-parser";console.log({ parse, HTMLElement });' > test.ts
npx esbuild test.ts --bundle --outfile=test.js
node test.js
{
  parse: [Function: parse],
  HTMLElement: [class HTMLElement extends Node]
}

Do you have a tsconfig.json you can share? If you can setup a repository which can reproduce the issue you're having, I'll take a look and see if I can help.

lisonge commented 3 years ago

Do you have a tsconfig.json you can share? If you can setup a repository which can reproduce the issue you're having, I'll take a look and see if I can help.

tsconfig.json in my project https://github.com/lisonge/visit-counter.git

this project will show a bug, If you don't modify node_modules/node-html-parser/esm/index.js

you see bug by following shell

npx esbuild test/test_import.ts --bundle --outfile=dist/test_import.js
node dist/test_import.js
lisonge commented 3 years ago

I tried your exact reproduction, and I don't actually experience any issue:

my shell Log

❯ cat typescript
Script started on 2021-10-20 10:26:37+08:00 [TERM="xterm-256color" TTY="/dev/pts/0" COLUMNS="120" LINES="30"]
❯ ls -a .
.  ..  install.sh  typescript
❯ cat install.sh
npm i esbuild node-html-parser typescript
echo 'import { parse, HTMLElement } from "node-html-parser";console.log({ parse, HTMLElement });' > test.ts
npx esbuild test.ts --bundle --outfile=test.js
node test.js

❯ ./install.sh

> esbuild@0.13.8 postinstall /home/lisonge/project/test/node_modules/esbuild
> node install.js

npm WARN saveError ENOENT: no such file or directory, open '/home/lisonge/project/test/package.json'
npm notice created a lockfile as package-lock.json. You should commit this file.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-freebsd-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-freebsd-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-freebsd-64@0.13.8: wanted {"os":"freebsd","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-freebsd-arm64@0.13.8 (node_modules/esbuild/node_modules/esbuild-freebsd-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-freebsd-arm64@0.13.8: wanted {"os":"freebsd","arch":"arm64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-linux-arm@0.13.8 (node_modules/esbuild/node_modules/esbuild-linux-arm):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-linux-arm@0.13.8: wanted {"os":"linux","arch":"arm"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-darwin-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-darwin-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-darwin-64@0.13.8: wanted {"os":"darwin","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-android-arm64@0.13.8 (node_modules/esbuild/node_modules/esbuild-android-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-android-arm64@0.13.8: wanted {"os":"android","arch":"arm64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-linux-32@0.13.8 (node_modules/esbuild/node_modules/esbuild-linux-32):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-linux-32@0.13.8: wanted {"os":"linux","arch":"ia32"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-linux-mips64le@0.13.8 (node_modules/esbuild/node_modules/esbuild-linux-mips64le):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-linux-mips64le@0.13.8: wanted {"os":"linux","arch":"mips64el"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-darwin-arm64@0.13.8 (node_modules/esbuild/node_modules/esbuild-darwin-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-darwin-arm64@0.13.8: wanted {"os":"darwin","arch":"arm64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-linux-arm64@0.13.8 (node_modules/esbuild/node_modules/esbuild-linux-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-linux-arm64@0.13.8: wanted {"os":"linux","arch":"arm64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-linux-ppc64le@0.13.8 (node_modules/esbuild/node_modules/esbuild-linux-ppc64le):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-linux-ppc64le@0.13.8: wanted {"os":"linux","arch":"ppc64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-netbsd-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-netbsd-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-netbsd-64@0.13.8: wanted {"os":"netbsd","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-openbsd-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-openbsd-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-openbsd-64@0.13.8: wanted {"os":"openbsd","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-sunos-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-sunos-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-sunos-64@0.13.8: wanted {"os":"sunos","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-windows-32@0.13.8 (node_modules/esbuild/node_modules/esbuild-windows-32):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-windows-32@0.13.8: wanted {"os":"win32","arch":"ia32"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-windows-64@0.13.8 (node_modules/esbuild/node_modules/esbuild-windows-64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-windows-64@0.13.8: wanted {"os":"win32","arch":"x64"} (current: {"os":"linux","arch":"x64"})
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: esbuild-windows-arm64@0.13.8 (node_modules/esbuild/node_modules/esbuild-windows-arm64):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for esbuild-windows-arm64@0.13.8: wanted {"os":"win32","arch":"arm64"} (current: {"os":"linux","arch":"x64"})
npm WARN enoent ENOENT: no such file or directory, open '/home/lisonge/project/test/package.json'
npm WARN test No description
npm WARN test No repository field.
npm WARN test No README data
npm WARN test No license field.

+ node-html-parser@5.0.0
+ typescript@4.4.4
+ esbuild@0.13.8
added 14 packages from 7 contributors and audited 30 packages in 2.238s

8 packages are looking for funding
  run `npm fund` for details

found 0 vulnerabilities

   ╭───────────────────────────────────────────────────────────────╮
   │                                                               │
   │      New major version of npm available! 6.14.15 → 8.1.0      │
   │   Changelog: https://github.com/npm/cli/releases/tag/v8.1.0   │
   │               Run npm install -g npm to update!               │
   │                                                               │
   ╰───────────────────────────────────────────────────────────────╯

  test.js  322.4kb

⚡ Done in 18ms
{ parse: undefined, HTMLElement: undefined }
❯ exit

Script done on 2021-10-20 10:27:04+08:00 [COMMAND_EXIT_CODE="0"]

test.js here https://gist.github.com/lisonge/bfb21198fa027fa285e7016c0672abbf#file-test-js-L5086

lisonge commented 3 years ago

It looks like you don't actually need to be using ESM. If you build to commonJS, you should be fine.

i try --platform=node, but same error

❯ npx esbuild test.ts --bundle --outfile=test.js --platform=node

  test.js  313.1kb

⚡ Done in 19ms
❯ node test.js
{ parse: undefined, HTMLElement: undefined }
nonara commented 3 years ago

Ok. I cloned your repo and had a look. A few observations:

  1. Something is going wrong with your build
  2. Looks like you don't need to use --platform=node

It works for me with both platform=node and otherwise, using your exact repository.

$ npx esbuild test/test_import.ts --bundle --outfile=dist/test_import.js

  dist\test_import.js  300.2kb

Done in 55ms

$ node dist/test_import.js
{
  parse: [Function: parse],
  HTMLElement: [class HTMLElement extends Node]
}

$ npx esbuild test/test_import.ts --bundle --outfile=dist/test_import.js --platform=node

  dist\test_import.js  310.9kb

Done in 115ms

$ node dist/test_import.js
{ parse: [Function: parse2], HTMLElement: [Function: HTMLElement3] }

I recommend the following:

  1. Uninstall your global version of esbuild (if one is installed)
  2. Ensure pnpm cache is cleared (global and local, if applicable)
  3. Delete pnpm-lock.yaml in your repo
  4. Make sure your package.json specifies the latest version of esbuild
  5. Run pnpm install
  6. Directly execute esbuild from node_modules (without npx) and check the output

If there is an issue beyond that, it's with esbuild. I've confirmed and it's pulling in the proper source code from the commonJS location, so this is not an issue with our ESM wrapper.

Good luck with it! I hope you get it sorted out.

lisonge commented 3 years ago

i try run shell code at github actions ubuntu-latest https://github.com/lisonge/test-cli/runs/3958824691?check_suite_focus=true#step:10:1 it still output

{ parse: undefined, HTMLElement: undefined }

can you show dist/test_import.js on your computer?

lisonge commented 3 years ago

@nonara hello, can you show dist/test_import.js on your computer to me?

nonara commented 3 years ago

Here's my file: https://gist.github.com/nonara/67ef1505a98a2c9ad02bf4ab06f0b183

I'd love to be able to help you track this out, but unfortunately I'm swamped. With that said, hopefully this will help point you in the right direction:

Given the discrepancy between your build failing and mine not, my best guess is that there is an issue between windows and unix.

Recommended steps:

Again, I've confirmed that this is not an issue with our esm wrapper, as we followed the proper convention, and it works in both commonjs and esm environments.

Good luck with it, and I hope you get it sorted out!

lisonge commented 3 years ago

I think the reason should be this

because https://github.com/taoqf/node-html-parser/blob/main/tsconfig.json#L24

"esModuleInterop": true,

generate https://cdn.jsdelivr.net/npm/node-html-parser@5.0.0/dist/index.js#L5

Object.defineProperty(exports, "__esModule", { value: true });

this line code mark current file is cjs that compiled from esm, it can help keep connection between es modules in cjs runtime

when esm import cjs in nodejs runtime, it doesn't depend on __esModule property, so it will use exports instead of exports['default']

but esbuild will depend on __esModule property , so it get exports['default']

and when i change Object.defineProperty(exports, "__esModule", { value: true }); to Object.defineProperty(exports, "__esModule", { value: false }); , the [code by esbuild pack] is work

for your this

Note that even though it works for me, the comment still says this: // ../node_modules/node-html-parser/dist/esm/nodes/node.js

I can't reproduce it at the moment, so i do not know

nonara commented 3 years ago

This is standard output. File an issue with esbuild.