eemeli / yaml

YAML parser and stringifier for JavaScript
https://eemeli.org/yaml
ISC License
1.32k stars 116 forks source link

feat: provide explicit ESM pkg entry points #560

Open antongolub opened 5 months ago

antongolub commented 5 months ago

Despite that the cjs format is widely used in Nodejs, there are many reasons to use ESM today. Let's add the corresponding entry points.

  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./browser/index.js",
      "node": "./dist/index.js",
      "default": "./browser/index.js"
    },
antongolub commented 5 months ago

My fault, I was a bit hasty, trying to solve a couple of problems at once. Our practical cases:

1) We have subsystems on node.js, in which using of the require is prohibited due to ISEC policies (to avoid manipulation of module contents). 2) We apply esbuild for bundling, and we'd like to pick up the appropriate module versions when format set esm. As you rightly noted, the node directive has priority, so commonjs chunks appear instead.

eemeli commented 5 months ago

We have subsystems on node.js, in which using of the require is prohibited due to ISEC policies (to avoid manipulation of module contents).

Could you provide some links explaining how ESM is better than CJS for this? Searching for "ISEC policies" isn't providing anything useful for me, and I'm rather unconvinced about the differences being exploitable as an actually viable attack vector.

We apply esbuild for bundling, and we'd like to pick up the appropriate module versions when format set esm. As you rightly noted, the node directive has priority, so commonjs chunks appear instead.

What's the downside in this case? Presumably you're compiling with --platform=node, and CJS should work just as well?

antongolub commented 5 months ago

EMS format brings a portion immutability: you cannot just override module.exports entry or require.cache as for commonjs. This makes whole module mocking a problem, but increases security.

Our case is quite exotic: we want to resolve a deps tree as esm, but to convert it to cjs at the final step. This will significantly reduce the size of the bundle.

eemeli commented 5 months ago

The increase in security is wholly illusory. Consider the following, for instance:

// patch.mjs
import { Document } from 'yaml'
Document.prototype.toJS = () => 'foo'
// index.mjs
import './patch.mjs'
import { parse } from 'yaml'
parse('one: two') // returns 'foo'

If an attacker is able to execute code in your runtime environment, the above will work completely independently of the CJS/ESM packaging of this library. This is not a scenario for which it's reasonable to build protections; if an attacker gets that far, you've already lost the game. To protect against this, you should have wholly separate environments where you're running user & system code.

Our case is quite exotic: we want to resolve a deps tree as esm, but to convert it to cjs at the final step. This will significantly reduce the size of the bundle.

Where is that bundle size saving coming from, for something like yaml as a dependency?

antongolub commented 5 months ago

The increase in security is wholly illusory

Well, immutable references in the module cache are just the first step, but more are needed. warmup + deep Object.freeze might fit your snippet.

I'm not suggesting to abandon cjs, but just add another entry point.

Correct. We have a need to make static builds for some utilities. I'm not suggesting to abandon cjs, but just add another entry point.

eemeli commented 5 months ago

Well, immutable references in the module cache are just the first step, but more are needed. warmup + deep Object.freeze might fit your snippet.

Given that I'm not willing to deep-freeze all objects in yaml, if you want to do that as a user, how does the module system help you at all if you're already intercepting imports/requires or operating them through a whitelist of some sort?

This security policy sounds like a thing that you can easily measure, but which has no real-world impact.

I'm not suggesting to abandon cjs, but just add another entry point.

Adding separate import & require entry points for Node.js would probably need to be done in a new major version, given that it would break the following identity:

// barrel.cjs
module.exports = require('yaml')
// index.mjs
import { Document as ESMDocument } from 'yaml'
import { Document as CJSDocument } from './barrel.cjs'
ESMDocument === CJSDocument

So while adding an entrypoint sounds like a small change, it's still a breaking change. Which, to be clear, I'm quite willing to consider, but it needs to have really good reasons for it. That's why I'm asking for practical, real-world examples of cases where the current config can be shown to be suboptimal in some way.

antongolub commented 5 months ago

Reasonable. But module refs obtained via require and import are alway different, afair:

var y1 = require('yaml'), y2 = await import('yaml'); console.log(y1 === y2)

We can avoid any hypothetical br change by adding smth like:

{
   "export": {
      "esm": {
         "types": "./dist/index.d.ts",
         "default": "./dist/browser/index.js"
      },
      // ... rest
   }
}
eemeli commented 5 months ago

Reasonable. But module refs obtained via require and import are alway different, afair:

var y1 = require('yaml'), y2 = await import('yaml'); console.log(y1 === y2)

The top-level module, yes, but not the contents. Try this:

var y1 = require('yaml');
import('yaml').then(y2 => {
  console.log(y1.parse === y2.parse);
});
antongolub commented 5 months ago

Same trick with a separated esm entrypoint: https://github.com/jprichardson/node-fs-extra/blob/master/package.json#L57

eemeli commented 5 months ago

Ah, I'd missed that you'd changed your proposal to defining yaml/esm and, presumably, yaml/esm/util. That would indeed avoid making this a breaking change, but I'm still not at all convinced this is worthwhile given that importing from yaml and yaml/util already works just fine.

I don't buy the argument that ESM is more secure than CJS for this library, and I'm still waiting for a reply to this question from earlier:

Our case is quite exotic: we want to resolve a deps tree as esm, but to convert it to cjs at the final step. This will significantly reduce the size of the bundle.

Where is that bundle size saving coming from, for something like yaml as a dependency?

Tbh, I'm tempted to wait for Node's require(esm) support to mature enough to allow dropping the CJS build completely.

antongolub commented 5 months ago

I don't claim to have the truth. But we find the combination sufficient: immutable module cache plus frozen immutable modules. I want to believe that Object.freeze / {configurable: false, writable: false} effects cannot be reverted.