stackblitz / webcontainer-core

Dev environments. In your web app.
https://webcontainers.io
MIT License
3.84k stars 152 forks source link

RFC: Env flag for npm packages to conditionally load WASM binaries instead of native binaries #286

Open EricSimons opened 3 years ago

EricSimons commented 3 years ago

UPDATE

This proposal has changed and instead relies on import/export maps, and has officially landed in Node.js core.

I have kept the original (now outdated) proposal below for future reference:

view original (outdated) proposal ## The problem Web development toolchains are increasingly including binaries for compilers/bundlers as they provide substantially faster performance than JS equivalents. The ecosystem is also increasingly adopting WebAssembly for distributing & executing binaries as it provides far greater security guarantees, are directly inspectable, and "build once and run everywhere". Today `esbuild` and `swc` are the most popular of these new tools, and both release a WASM version in a separate package in lockstep with their native binaries (`esbuild-wasm` and `@swc/wasm`). As such, consumers of these packages need to determine at runtime whether they should use the native binaries or the WASM binary. Parcel v2 [currently uses a custom flag](https://unpkg.com/browse/@parcel/transformer-js@2.0.0-rc.0/native.js) `process.env.PARCEL_SWC_WASM` that indicates it should require the WASM binary and not the platform specific variant. Other toolchains that have adopted swc/esbuild (Vite, Next.js, Snowpack, etc) need a similar way of determining this as well, and custom flags for each package is not ideal. ## Proposed Solution: `process.env.REQUIRES_WASM` I propose we standardize on a new flag, `process.env.REQUIRES_WASM`, that instructs npm packages to use WASM binaries instead of unsandboxed platform specific binaries. This is important for environments that require high security guarantees (i.e. targets of supply chain attacks like enterprises and OSS devs), WebAssembly based containers like [WebContainer](https://blog.stackblitz.com/posts/introducing-webcontainers/) or kubelet, in-browser playgrounds, and other cases where native binaries cannot/should not be executed. ### Indicates WASM is _required_, not just preferred. Importantly, this flag does _not_ indicate that the host environment prefers WASM. Instead it is an instruction that the host _requires_ WASM binaries and cannot (or does not) allow execution of unsandboxed native binaries. ### Intended functionality If `process.env.REQUIRES_WASM` is true at runtime, npm packages should import WASM binaries (or a JS file generated via emscripten, etc) instead of corresponding platform specific binaries. Should a package not have a WASM binary available (and/or can't polyfill the functionality otherwise), the program should throw an error if `process.env.REQUIRES_WASM` is present.
sokra commented 3 years ago

I would propose to use an exports/imports field condition instead of a (runtime) environment variable.

An exports field condition has the advantage that it's statically known instead of at runtime. This allows to avoid using import() and instead import ... from "..." can be used.

see also

matthewp commented 3 years ago

Great idea!

Another angle to this problem is: many of these tools rely on the postinstall hook of npm. One common problem they encounter is that if you run npm install --ignore-scripts, they postinstall hook won't be called and the native binaries won't be installed. Some times users have this on without knowing it (such as in an .npmrc) and file issues about it.

If we could convince npm to make this option the default it would push these types of tools to prioritize wasm and make it the primary way to use their tools.

evanw commented 3 years ago

(I'm commenting here because I was pinged on Twitter as the author of esbuild, but I don't have any stake in this PR)

If process.env.REQUIRES_WASM is true at runtime, npm packages should import WASM binaries (or a JS file generated via emscripten, etc) instead of corresponding platform specific binaries.

Each binary executable is many megabytes. Solutions which require shipping binary executables for multiple platforms in the same package are very undesirable because then the weight of the package (installation time and disk size) would be huge. The reason esbuild's WebAssembly package is separate from the native package is to reduce the weight of the package. Having that be a run-time switch instead of an install-time switch would increase the weight of the package by a lot, and is likely not something I would ever consider doing for esbuild.

If we could convince npm to make this option the default it would push these types of tools to prioritize wasm and make it the primary way to use their tools.

This isn't ever going to happen for esbuild. WASM has a number of major drawbacks and I would never want it to be the primary way to interact with esbuild. If there were no drawbacks, there wouldn't be a reason to ship native executables. Some of the drawbacks:

In esbuild's case WASM can easily be 10x slower than native. WASM will never be the default implementation.

Another angle to this problem is: many of these tools rely on the postinstall hook of npm. One common problem they encounter is that if you run npm install --ignore-scripts, they postinstall hook won't be called and the native binaries won't be installed. Some times users have this on without knowing it (such as in an .npmrc) and file issues about it.

I'm currently experimenting with an alternative installation strategy for esbuild that uses optionalDependencies instead of a postinstall script. If it works, that would solve this particular problem without needing any new environment flags and without relying on WASM. See this post for more information: https://github.com/evanw/esbuild/issues/789#issuecomment-890596192. The only drawback is that optionalDependencies doesn't provide a way to have a default "fallback" dependency if none of the optional dependencies apply. There's another proposal from Yarn that attempts to address that need: https://github.com/yarnpkg/berry/issues/2751.

matthewp commented 3 years ago

Thanks @evanw, the optionalDependencies trick sounds interesting. Am I reading this right that it would install the wasm version for webcontainer (which needs to run in the browser)?

matthewp commented 3 years ago

Another idea, probably for the Node.js security team is, if there was a way to disable 3rd party packages from launching subprocesses that would help discourage the install-binary patterns. Deno has this with --allow-run (this applies to all code, not just 3rd party), and the result is that you see mostly wasm used there.

matthewp commented 3 years ago

@EricSimons Given the discussion here and elsewhere, I think standardizing on an export map condition could be the best immediate-impact solution here. stackblitz could run install with --condition=requires_wasm (bikeshed on name) and everything should just work.

EricSimons commented 3 years ago

@sokra @matthewp I think export maps definitely could be a good solution here. It does diverge from how folks are currently handling this issue atm, as both Next and Parcel load the appropriate binary using runtime conditions. I'd be curious to see if these teams would have any reservations on this approach (I imagine Next would be on board as Tobias works on it, but also cc'ing @devongovett for the Parcel team's take).

evanw commented 3 years ago

Thanks @evanw, the optionalDependencies trick sounds interesting. Am I reading this right that it would install the wasm version for webcontainer (which needs to run in the browser)?

No, it has nothing to do with WASM. I just mentioned it since it's a way of avoiding problems with --ignore-scripts and postinstall scripts. If you need to run esbuild in the browser, then you'll need to explicitly install the WASM version of esbuild instead of the native version.

EricSimons commented 3 years ago

@evanw thanks for taking the time to respond here- some extra perspective/info that might be relevant:

Solutions which require shipping binary executables for multiple platforms in the same package are very undesirable because then the weight of the package (installation time and disk size) would be huge

AFAIK many npm packages are starting to ship included binaries because it removes the CPU time (and often errors) that happen on install and during CI jobs. It also eliminates the need for postinstall, which is ideal from a security POV. I do understand why you don't want to go this route for esbuild though.

Depending on the scenario, WASM's 2gb address size limit can be prohibitive and lead to crashes or large slowdowns as available memory gets low. Native binaries do not have this restriction and generally have a better chance of working correctly.

FYI that this is now 4gb as of May 2020, and wasm64 just hit stage 3 in June which will remove this limit. v8 has already begun work on implementation.

In esbuild's case WASM can easily be 10x slower than native. WASM will never be the default implementation.

As you pointed out in a previous issue, this is in part because esbuild-wasm can currently only run single threaded whereas the native variant has full multithreading. WebAssembly Threads are now available in Chrome/Edge/Firefox and enables high performance multithreading. We use this for WebContainers and the performance has been able to exceed local execution speeds. This might be worth exploring for esbuild-wasm.

Big fan of esbuild and would love to see/help it adopt the latest improvements for the webassembly version!

devongovett commented 3 years ago

Isn't this what the "browser" field in package.json is for? It's widely supported by pretty much every bundler, and allows substituting the whole module or specific files for the browser. We already use it for some of our native packages, and I think that was the plan for the rest as well. The environment variable we currently use for other packages is just a temporary hack I believe.

EricSimons commented 3 years ago

@devongovett for bundling use cases definitely, but for environments that provide full Node runtimes requiring strong security guarantees (i.e. targets of supply chain attacks like enterprises and OSS devs, WebAssembly based containers like WebContainer or kubelet, etc) having the browser field affect Node's real resolution algorithm could be problematic for a few reasons:

So IMO given the advent and growth of WASM adoption in npm packages for local envs, It seems there's a bit of a middle ground we need to accomodate that the browser doesn't seem well suited for.

devongovett commented 3 years ago

Hmm interesting. Unfortunately, I think WASM has a couple drawbacks over native packages, which is why we distribute native binaries by default.

For those reasons, I don't think WASM would be the default way to use Parcel, but perhaps could be opt-in. The first issue is fine if you accept the perf hit, but the second I'm not sure how to solve.

EricSimons commented 3 years ago

I think opt-in would definitely be the right way to handle this. It seems like there's growing consensus that @sokra's idea of standardizing on an export map condition is the right path for a standardized opt-in, so if that sounds sensible I think I'm going to modify the original issue to reflect an export map condition proposal. Also happy to continue discussing alternative ideas otherwise though 👍

devongovett commented 3 years ago

Where do export conditions get standardized? In Node core? Looks like node only supports "import", "require", "node", and "default". One of the reasons Parcel has not yet implemented package exports support is that it's unclear to us how additional conditions get standardized so that they actually work the same way across tools.

IgorMinar commented 3 years ago

+1 to using standardized conditional exports, but that is just a part of the solution.

It would also be great to address the package payload size issue that @evanw brought up. We've struggled with this a bit and ended up using optionalDependencies as the workaround.

Ideally, we could codify/standardize this "non-js extension" part of the package system as well (and update clients not to waste bandwidth, and create noise in the terminal) by honoring the os value specified in package.json (or consider moving the os specifier info to be part of parent's package.json's optionalDependencies, or specified via a new package.json field like platformDependencies (and there are likely other solutions to consider).

The last missing bit is to standardize a wasm-specific os specifier in package.json, for example "os": "wasm".

This combo would ensure wide-applicability of the design (including for wasm environments), while solve the current hurdles that npm packages with native/non-js extensions face.

So to recap, an ideal solution would not only ensure that the (preferably static) module resolution occurs in a platform-aware way but that also the package installation is performed in platform-aware way to improve the efficiency and DX of the package delivery system.

matthewp commented 3 years ago

@devongovett Node.js wouldn't be involved in standardizing an export condition unless it planned on using it itself. I don't think this is one they would use. I think this would be a community standard, which like browser is just an informal agreement among tool makers.

matthewp commented 3 years ago

Actually I said that and then @bmeck said they might be interested: https://twitter.com/bradleymeck/status/1423375532503207942. Bradley, is nodejs/node the right place to propose a recognized condition?

bmeck commented 3 years ago

yes nodejs/node is the right place, just make an issue and/or PR. This seems like it wouldn't be too controversial except getting the right names for condition and how to enable the mode (flag vs config file vs API)

d3lm commented 3 years ago

I personally agree that using exports and imports can solve this in an elegant yet flexible way. Ideally there's a standard condition baked into Node.js that we can use for this which would also disable dlopen as @bmeck mentioned already on Twitter.

More specifically we could try to standardize a --no-addons or --no-native-addons (similar to --no-warnings or --no-force-async-hooks-checks) which disables process.dlopen and enables a no-addons or no-native-addons condition.

Open source maintainers could then provide a different entry point given the no-addons condition:

{
  "exports": {
    ".": {
      "no-addons": "./wasm-entry-point.js",
      "default": "./native-entry-point.js"
    }
  }
}

This would also address the performance concerns in a neat way because by default it uses the "native entry point" and only falls back to the WASM version if the condition is provided, either directly -C=no-addons or through a CLI option --no-addons which does a bit more than just providing a condition.

d3lm commented 3 years ago

It thought a bit more about this and wanted to propose another option which is a combination of an imports mapping and optionalDependencies.

If a package depends for example on esbuild then a maintainer could conditionally use the WASM versions with an imports mapping:

{
  "imports": {
    "#esbuild": {
      "no-addons": "esbuild-wasm",
      "default": "esbuild"
    }
  }
}

The library could then import #esbuild and it conditionally imports either the WASM or native package. This requires esbuild-wasm to be specified in optionalDependencies. The downside here is that the initial install gets a little slower because all optional dependencies will be installed.

arcanis commented 3 years ago

You might be interested in the following RFC for Yarn: https://github.com/yarnpkg/berry/issues/2751. Each approach attempts to solve different goals, but some of them seem shared.

padmaia commented 3 years ago

I like the idea of this issue being solved at the package manager level to avoid downloading unnecessary binaries. I'm still digesting the RFC that @arcanis shared, but will comment there if I have any thoughts.

d3lm commented 3 years ago

I agree that if this is handled on the package manager level that'd be ideal so we don't have to download packages which won't be used. I think this with a condition makes up a good solution which should cover most use cases.

Jamesernator commented 3 years ago

I agree that if this is handled on the package manager level that'd be ideal so we don't have to download packages which won't be used. I think this with a condition makes up a good solution which should cover most use cases.

One option would be for Node to allow bare specifiers in "exports", at current you can only refer to files in a package but if you could refer to bare specifiers you could refer to other packages i.e.:

// package.json
{
    "name": "esbuild",
    "exports": {
        "node": {
            "no-addons": "esbuild-wasm"
        },
        "browser": "esbuild-wasm",
        "default": "esbuild-node-native"
    },
    "peerExportedDependencies": {
        "esbuild": "99.x",
        "esbuild-wasm": "99.x"
    }
}

This would require npm to learn about --conditions like Node has though and add some way to specify versions (which I've done as "peerExportedDependencies" in this example).

Jamesernator commented 3 years ago

An alternative that might be simpler for package managers and still ensures package-lock.json is consistent across environments would be to have "peer dependency groups" i.e. give a list of options, and I need at least one.

In this sort've scheme the onus of specifying the multiple versions is on direct consumers of esbuild not on esbuild itself. i.e. something like snowpack would add into their package.json something like this:

// snowpack/package.json
{
    "peerDependencies": {
        "esbuild": "99.x",
        "esbuild-wasm": "99.x"
    },
    "peerDependenciesMeta": {
        "esbuild": { "optional": true },
        "esbuild-wasm": { "optional": true }
    },
    "peerDependencyGroups": [
        // Declare that at least one of these is required, if none is given
        // then npm would just download the first by default
        { "packages": ["esbuild", "esbuild-wasm"], "required": true }
    ]
}

The one thing is in environments like StackBlitz someone using snowpack would need to add "esbuild-wasm" as a peer dependency, but that might be doable automatically by reading the package.json for certain packages.

EDIT: I've opened an npm rfc for this idea