highlightjs / highlight.js

JavaScript syntax highlighter with language auto-detection and zero dependencies.
https://highlightjs.org/
BSD 3-Clause "New" or "Revised" License
23.7k stars 3.6k forks source link

Discuss: Should Highlight.js version 11 npm release be ESM only? #2600

Closed aduh95 closed 3 years ago

aduh95 commented 4 years ago

Is your request related to a specific problem you're having?

I'm always frustrated when starting a project, I don't want to set up a whole build chain and just want to get started:

import highlight from './node_modules/highlight.js/lib/core.js'

And I get an error Error: 'default' is not exported by /node_modules/highlight.js/lib/core.js...

The solution you'd prefer / feature you'd like to see added...

I'd much rather prefer having standard ES6 modules in the npm package, and being able to use them as advertised in the README.md.

Any alternative solutions you considered...

s/module.exports =/export default/ on the node_modules/highlight.js/ files, but that's not very convenient.

Additional context...

Support for ESM has landed in Node.js current and LTS lines recently, with features such as Conditional Exports (useful in case we want to keep the CJS version around, still required for Node.js v10 support).

Having ESM files would also make the library compatible with Deno for free, which would be nice.

joshgoebel commented 4 years ago

Support for ESM has landed in Node.js current and LTS lines recently

Landed landed vs still experimental? Do you have a link/announcement?

Is there a graph or stats anywhere of who is still using Node v10 (it's popularity still)? Is it considered "very old"? We have very limited time - I'd really rather not maintain both CJS and ESM if we can avoid it. I'd much rather say drop CJS with the next major release and switch completely to ESM if it works for the majority, etc.

@egor-rogov @allejo Thoughts?


Also worth noting the internal libs are already ESM, but our npm packages (built for Node primarily) are packaged as CJS. If you started from just the GitHub repo source you'd already have ESM all the way down.

joshgoebel commented 4 years ago

@aduh95 Do you know if the built-in support can import from JSON files? We currently do this (via rollup) to pull in the version from the package.json file, though I suppose we could change that if we had to.

aduh95 commented 4 years ago

Do you have a link/announcement?

https://nodejs.org/en/blog/release/v12.17.0/

It used to be behind a flag and a warning, it's still officially experimental though. If you want to ditch CJS, it might be wiser to wait for the feature to be officially stable (the goal of Node.js is to have it stable by the time v14 reaches LTS in October).

Node.js v10 is in LTS maintenance mode until April 2021, so we can expect people still using it until its end of life. I don't think there are any official download stats for Node.js versions 😕

What is the timeframe for Highlight.js 11.x?

If you started from just the GitHub repo source you'd already have ESM all the way down.

That's a very good point, although it's worth noting that the source doesn't specify file extension in the imports (E.G.: import './file' instead of import './file.js'), which makes it not suitable for Node.js, Deno, or browser consumption without transformation. Maybe that something we can discuss refactoring?

Do you know if the built-in support can import from JSON files?

Importing JSON using ESM syntax is still experimental on current Node.js version. But there are workaround for this (example).

joshgoebel commented 4 years ago

What is the timeframe for Highlight.js 11.x?

Early 2021 prolly. I don't like to push breaking releases too often... so my idea is a roughly yearly cycle... but I'm also not in a rush if we don't NEED to break anything - 10 could live on longer.

it's still officially experimental though

Yeah, I'd say it's not a super high priority for us while it's still experimental.

it's worth noting that the source doesn't specify file extension in the imports (E.G.: import './file' instead of import './file.js')

If Rollup doesn't mind I'd have no issue with adding .js everywhere.

joshgoebel commented 4 years ago

Is anyone using .jsm? Or a special extension for modules vs normal CJS?

s/module.exports =/export default/ on the node_modules/highlight.js/ files, but that's not very convenient.

Perhaps I should have looked at this a bit more carefully. Is this really ALL you were looking for here? If we shipped those 2-3 build assets (or variants of) with ONLY this change... would that resolve this? I was imagining publishing all the "raw" source files as their own or alternative NPM package... which is a whole other level of complexity.

Just shipping ESM "entry points" for core and index sounds perhaps a lot more doable.

aduh95 commented 4 years ago

Is anyone using .jsm? Or a special extension for modules vs normal CJS?

.mjs is the one recognized by Node.js.

would that resolve this?

Yes, that would be totally fine! I think Rollup can handle that, right?

https://github.com/highlightjs/highlight.js/blob/b1bce6e3ada485b89696f554878b2ef44a73d94a/tools/build_config.js#L10

Haven't tried it, but I think you can replace it with:

output: [
  { format: "cjs", strict: false },
  { format: "esm" },
]

Would you interested in a PR?

joshgoebel commented 4 years ago

Actually that will mess with all the individual languages we require also, no? Those would then also need to also be modules, yes? That's what I was fearful of... I'm not sure we want to ship +180 additional/duplicate JS files just because of ESM... is that the only way to do this - or am I thinking about it wrong? I wonder if there are any other large modular libraries and how they are dealing with this problem?

Building a "monolith" (like we do for the browser) would solve some of the issues [for some people] but others really prefer to load just core and the languages they need one at a time, and it seems to allow that we'd still need 100s of separate modules.

joshgoebel commented 4 years ago

but I think you can replace it with:

I doubt it. We have a pretty custom build process. It'd probably be easier to build a list of the individual CJS files and then feed them thru a simple text transform... then spit out new files beside them.

But we'd also preferably want our CI/test suite to test these new "build products" and I'm not sure that's even even possible with Mocha. Do you know? Last time I looked at using modules with it things seemed pretty rocky... but perhaps that has improved?

joshgoebel commented 4 years ago

@aduh95 If your actual problem is truly just "get started fast" why not just use the browser build and change it's single CJS export to an ESM export instead?

aduh95 commented 4 years ago

Thanks for the detailed explanation, I think I got your point. This discussion would be more productive in a few months when Node.js ESM support has stabilised.

Maybe we can wait and see if the situation evolves on Node.js side, and come back to the "drop CJS with the next major release and switch completely to ESM" mindset when it makes sense?

But we'd also preferably want our CI/test suite to test these new "build products" and I'm not sure that's even even possible with Mocha. Do you know?

Yes Mocha is able to interact with ES modules, you just have import() them instead on require() them.

If your actual problem is truly just "get started fast" why not just use the browser build and change it's single CJS export to an ESM export instead?

I want the best of both world: get started fast and cherry-pick the languages I'm actually highlighting. I know I'm bad ^^

joshgoebel commented 4 years ago

This discussion would be more productive in a few months when Node.js ESM support has stabilised.

So be it.

joshgoebel commented 3 years ago

This discussion would be more productive in a few months when Node.js ESM support has stabilised.

How are things looking now? I'm curious if other packages (with lots of individual files) are shipping BOTH ESM and CJS versions via npm and if so how they are doing it...

I'm thinking of rolling v11 perhaps in April/May and seems a good time to revisit this...

aduh95 commented 3 years ago

Hey! Yes, there are a few packages that have started doing that, on the top of my head I can think of Rollup and Preact, I know there are more. I'm not sure they qualify for the lot of individual files part though.

On Node.js side, ES modules support is no longer experimental (https://github.com/nodejs/node/pull/35781) and is considered stable. Node.js v10.x is planned to go end-of-life at the end of April 2021, which would mean all maintained Node.js release lines (v12.x, v14.x, v16.x) would have support for ES modules.

IMHO, considering the time frame, I think we should go all-in with ESM and ditch CJS in the npm package:

I can work on a PR to start experimenting with that if that sounds like a plan to you.

joshgoebel commented 3 years ago

A user working with CJS modules can still use highlight.js using import('highligh.js').then().

Does this also then work perfectly in v12? Did the ESM support become non-experimental in the latest releases of v12 or do you still need to trigger a special runtime flag?

Users working on web apps could start using highlight.js without a bundler/transpiler.

You mean vs using our pre-built and minified distributable assets? (which we'll still create for users who need/prefer them) I'm all for giving people choices, but are there big advantages using ESM if the library isn't built with that in mind? Someone crazy enough to import everything would be loading/fetching over 180 distinct JS files. It's possible I don't understand the full ESM in-browser story yet since I've always bundled on every web app I've worked on.

That wouldn't make the npm package twice as heavy (which would inevitably happen if we were to ship both CJS & ESM).

I'm not super worried about the size, thought it is a consideration. More worried about maintaining dual build, and CI pipelines... but the end result is the same: A single final product be nicer than two. :)

I can work on a PR to start experimenting with that if that sounds like a plan to you.

Is there a reason the npm build process couldn't mostly do a cp **/*.js -> into proper places since we're already using ESM modules for our raw source? I also need to take another look at the mocha story for ESM now...

aduh95 commented 3 years ago

Does this also then work perfectly in v12? Did the ESM support become non-experimental in the latest releases of v12 or do you still need to trigger a special runtime flag?

Yes, v12.20.0 has the same implementation as current v15.x (with the exception of top-level await, but that's because it's using an older version of V8), no flag is needed, and the experimental warning has been removed.

Someone crazy enough to import everything would be loading/fetching over 180 distinct JS files.

In some projects of mine I was using highlight.js to highlight one specific language, so that would be only two files to fetch, correct? Otherwise, I agree you should use a bundler or the pre-built file if you want to include everything.

Is there a reason the npm build process couldn't mostly do a cp */.js -> into proper places since we're already using ESM modules for our raw source?

Yes probably, since a010c151328b8944a7a03e5335956de8baf63295 that should be possible. There some documentation and update to make to package.json (we need to add "type": "module" so Node.js and bundler know we're shipping ESM), but yeah, hopefully it should be pretty straightforward :)

joshgoebel commented 3 years ago

Looks like the modules themselves almost work as-is (except for our importing of package.json) with a small test utility I use a lot... Too bad this seems to require --experimental-json-modules https://stackoverflow.com/questions/59738999/does-the-es-modules-implementation-support-importing-json-files Is there some nice solution that I'm missing of if I want to magically import the version number from package.json into the library?

In some projects of mine I was using highlight.js to highlight one specific language, so that would be only two files to fetch, correct?

Well our raw source tree is broken up into separate files for organization. Just the library + one languages is maybe 7-8 different small files. Perhaps (for API privacy concerns as well) we still need to build cooked ESM assets for:

Thoughts?

aduh95 commented 3 years ago

Is there some nice solution that I'm missing of if I want to magically import the version number from package.json into the library?

On Node.js, you can do that:

import { readFile } from 'fs/promises';

const { version } = await readFile(new URL('../package.json', import.meta.url)).then(JSON.parse);
console.log(version); // Outputs '10.6.0'.

Doing so would not work on the browser, so that's probably not the way to go for the distributed package…

Given that plus the fact that the source files are broken up into several files, it seems the way forward would be to update the build_node script to output ESM, and keeping the current distributed file organization (except it would contain ESM instead of CJS).

aduh95 commented 3 years ago

Too bad this seems to require --experimental-json-modules

FYI it's expected to stop being necessary soon: https://github.com/nodejs/node/issues/37141.

joshgoebel commented 3 years ago

@aduh95 If you wanted to help further you could take a look at that PR and provide any thoughts. Also if you wanted to muck around with getting the test suite fully operational again that'd be awesome. Best case I'd guess it's changing a bunch of requires to imports, but I'm not really sure?

Looks pretty clean so far.

joshgoebel commented 3 years ago

Some addl discussion here (so it's not lost): https://github.com/joshgoebel/highlight.js/pull/2

joshgoebel commented 3 years ago

[Sorry for jumping around, trying to keep high-level notes on this thread]

we could decide that we supply only a bundle CJS file

Oh now I hadn't really considered the pros and cons of that... but that makes CJS a second class citizen, so that seems like a no go.

provide a CJS module for each language, but I thought you said that was not acceptable?

Did I say that, or did I say it was annoying? Either way, I'm learning as we go here and my thoughts are evolving.

I believe going ESM only at this point in time could be disruptive to users with CJS codebases [who cannot easily upgrade and] would therefore be "locked out" of the latest version... nor do I see immediate benefits worth justifying such a disruption. Our release strategy is that there is no reason users should NOT be on the latest release. If you have a bug or glitch, upgrade, period. Our grammars are constantly evolving and hence the ability to stay current is important. Leaving all the CJS people behind on version 10 (which seems to be the plan of many other JS libraries) is not really a good strategy.

So that means either we delay ESM to some point in the future... or that we support both. Hence my curiosity at exploring what supporting both might look like. But also if we're supporting both then I'm not sure this needs to be a breaking change at all and then it wouldn't be tied so tightly to our v11 release, but could be done whenever.

I'll post something on twitter and see if perhaps I can get a broader idea of what our Node.js usage looks like.

aduh95 commented 3 years ago

For completeness sake, I feel the need to clarify that moving to ESM only doesn't "lock out" CJS users to v10, they can still bundle themselves or use dynamic import – if their API is already async, or if they already bundle their code and dependencies, that would be a transparent change to them. But I agree dual package mode is certainly the most comfortable for everyone.

joshgoebel commented 3 years ago

Yes, I understand - that's what I was covering with "who cannot easily upgrade"... if they are not bundling and can't use dynamic import then they would be "locked out". I don't know why most using Node.js on the server-side would be bundling at all... I think it's far more common to just require what you need directly. So to me the real question is how many people we have on Node.js who aren't yet on ESM codebases and also can't use dynamic imports - which is what I asked on twitter.

aduh95 commented 3 years ago

For reference, the tweet is at https://twitter.com/highlightjs/status/1374069451444908034.

joshgoebel commented 3 years ago

Almost all this stuff (conditional exports, subpath exports, subpath imports, etc.) still seems to be flagged experimental in v12: https://nodejs.org/docs/latest-v12.x/api/packages.html

That's a bit concerning.

aduh95 commented 3 years ago

v12.20.0 will remove the experimental status: https://github.com/nodejs/node/pull/37797 Its expected release date is March 30th.

joshgoebel commented 3 years ago

Ah sorry, I forget we're still ahead of the curve. :-)

joshgoebel commented 3 years ago

Do you have any thoughts on dir placement? I really dislike having both a "cjs" and "esm" folder as I've seen someplaces... so to me that means either a "esm" or "cjs" folder (in addition to lib). Which one is lib vs named would likely depend on whether the core package ships as module or commonjs... if we bundle both I'm not sure what the advantages/disadvantages are of shipping module vs commonjs. I assume commonjs would be more compatible with older versions of Node.

Really I'm wondering where the cjs or esm folder goes though? Does it go inside lib or does it reside at the top level?

- /
  - /lib
  - /compat 

vs

- /
  - /lib
  - /lib/compat 
joshgoebel commented 3 years ago

Question. If we stay type: "commonjs" and add BOTH ESM and CJS via conditional exports... that would mean this is no longer a breaking change (for anyone), wouldn't it?

aduh95 commented 3 years ago

It would still be a breaking change for Node.js v10.x users, as it does not have support for "exports" field.

joshgoebel commented 3 years ago

Well, I don't mind dropping v10... BUT just for curiosity, why would it be a breaking change? If the CJS files were still in all the same places... wouldn't they continue to just work as they always have? I'd assume exports would just merely be ignored and not used?

aduh95 commented 3 years ago

I guess you're right, if we keep the same file structure for CJS module and add exports conditional exports to rewrite ESM resolution and opt-out of package encapsulation, I don't see why it would be breaking.

joshgoebel commented 3 years ago

And what would the downsides of such an approach be?

aduh95 commented 3 years ago

I don't think there is any downside, in the OP I was suggesting adding ESM as a non-breaking change using conditional exports so it seems we've just circled back :)

joshgoebel commented 3 years ago

Sometimes we take the long way around. :-) I'll take another swing at this in a bit with this goal in mind.

joshgoebel commented 3 years ago

Can you have a look at https://github.com/highlightjs/highlight.js/pull/3007 now?

wooorm commented 3 years ago

@joshgoebel FYI: I’ve started changing the modules that I maintain to ESM-only. Starting from the low-level projects, and reaching lowlight somewhere next month probably. Let me know if I can advise on something

joshgoebel commented 3 years ago

@wooorm ESM-only sounded too ambitious to me so I think we're likely to ship a dual library with conditional exports. If you know anything about that and wanted to take a glance at #3007...

wooorm commented 3 years ago

@joshgoebel I tried that with micromark in November and users ran into a ton of problems with tools, such as webpack 4 (still used in CRA and Next): https://github.com/remarkjs/react-markdown/issues/518. I’m not optimistic that you can push dual esm/cjs in a minor release. A way around it would be if you’d eliminate default exports. Maybe there’s other ways too

joshgoebel commented 3 years ago

How did you try it though? I was planning to leave the core library/package.json still CJS (type: commonjs)... we're adding ESM to the es folder... so any bundler who "wasn't aware" of the "new" things would just see the CJS and work as it always had - or that's my thinking. Now if a packager is mis-handling the new standards - then I'd say that's kind of a packager issue?

Looks like you said Webpack fixed their issues in version 5?

And we're gearing up for v11, so it doesn't have to be a minor release, I just thought if we could divorce it from v11 that'd be one less thing, but if it can't be, that's fine also.

wooorm commented 3 years ago

https://github.com/micromark/micromark/pull/36 is the main chunk of the work.

so any bundler who "wasn't aware" of the "new" things would just see the CJS and work as it always had - or that's my thinking.

I had the same idea. It turned out to be incorrect. Bundlers and other tools have been running faux-ESM for so long, that they never bothered to make them spec compliant / work.

Now if a packager is mis-handling the new standards - then I'd say that's kind of a packager issue?

Agreed.

Looks like you said Webpack fixed their issues in version 5?

Yep. But lots of users are still on broken tools.

joshgoebel commented 3 years ago

Bundlers and other tools have been running faux-ESM for so long

So if someone was literally using our canonical requires:

const hljs = require('highlight.js');
// ... or ...
const hljs = require('highlight.js/lib/core'); 

Would those still break even assuming index.js and core.js had 0 actual changes? Or is the breakage for people already using faux-ESM with bundlers and switching to REAL ESM?

wooorm commented 3 years ago

I believe that’s indeed the issue. Webpack 4 will “magically” start using ESM in your above example, even thought it’s clearly CJS expecting CJS, and it doesn’t understand that actual ESM 🤷‍♂️ (edit: to clarify, something in between your 2 cases) You might get around it by adding an __esModule: true and a .default: theDefaultExport field, but I’m not 100% sure off the top of my head. You should test that 😅

joshgoebel commented 3 years ago

How can you still support JS requires with file extensions using conditional exports subpath patterns?

const swift = require('highlight.js/lib/languages/swift.js')
Error: Cannot find module '/Users/jgoebel/work/tmp_app/node_modules/highlight.js/lib/languages/swift.js.js'

It always wants to re-add the extension, even if it's already there. And I think using/requiring extensions is a very common pattern, there is even a linter setting to require them.

wooorm commented 3 years ago

That depends on the exports? I haven’t used it really, but here’s the docs: https://nodejs.org/api/packages.html#packages_subpath_exports.

aduh95 commented 3 years ago

I think that would be a breaking change indeed. We could hack our way around it by adding a swift.js.js that contains

function emitWarning() {
  if (!emitWarning.warned) {
    emitWarning.warned = true;
    process.emitWarning(
      'Using file extension in specifier is deprecated, use "highlight.js/lib/languages/swift" instead'
    );
  }
}
emitWarning();
exports = require('./swift.js');
joshgoebel commented 3 years ago

That depends on the exports? I haven’t used it really, but here’s the docs:

I've read them. They don't seem to address this that I see other than acknowledging the problem at one point. I was posting it here hoping someone knows the solution. Optimally there would be a way to make it "just work" extension or not - without having to have a manual list of 180 languages in exports.

joshgoebel commented 3 years ago

We could hack our way around it by adding a swift.js.js that contains

Ha, indeed. Before we went there though I think I'd just bloat the package.json with 180 entries for every language and give up on *.

aduh95 commented 3 years ago

I don't think that fit your use-case, but I thought I should mention it anyway: you could also change package.json to have limited support for extension paths:

  "exports": {
    "./lib/languages/*": { "require": ["./lib/languages/*.js", "./lib/languages/*"], ... }
  },

That would still fail for Node.js (see https://github.com/nodejs/node/issues/37928#issuecomment-808833604), but would work for some bundlers – I know that works in Webpack 5, not sure for others. Depending on how breaking you are wanting to go with this, it may be an acceptable compromise.

joshgoebel commented 3 years ago
"./lib/languages/*": { "require": ["./lib/languages/*.js", "./lib/languages/*"], ... }

What does having require be an array do??? I haven't seen that.

Yes the "exports" field was designed for package encapsulation first, for package authors who want to be picky regarding what is exposed to their users.

I wonder if we shouldn't just forget exports then - we only publish things we want to expose in any case. I just thought it was nice that it made ESM "just work" with the same paths... (loading the ESM modules vs CJS)... if we remove exports then ESM users either have to change "lib" to "es"... or they'll just be importing the existing CJS modules, which actually works just fine if the environment is Node.