yarnpkg / berry

📦🐈 Active development trunk for Yarn ⚒
https://yarnpkg.com
BSD 2-Clause "Simplified" License
7.46k stars 1.12k forks source link

[Bug?]: PNP with ESM using Lambda Layers is broken with AWS SAM #6604

Open Downchuck opened 3 days ago

Downchuck commented 3 days ago

Self-service

Describe the bug

This is very specific to AWS SAM images - there may be some yarn bugs its uncovering, but to date this only comes up in a SAM image. As far as I can tell, AWS has expressed negative interest in Yarn PNP and further they have their own /var/runtime/index.mjs wrapper which complicates things.

While the .pnp.cjs file works well in AWS SAM, the ESM loader does not seem to work as desired when the loader or dependencies are outside of the project folder. Configuration variables such as the yarn cache folder are not used and it uses the current cwd, making it difficult to point to a .cjs file (which the .mjs loader uses).

Here is the fix I'm using - as the dependencies are in /opt/nodejs/ -e-g- ".yarn" for the cache folder and family, and ".pnp.cjs" for the cjs file which is used by the .pnp.loader.mjs file.

    "hot": "NODE_OPTIONS=\"--loader /var/task/loader.mjs\" sam local start-api --debug --disable-authorizer --skip-pull-image",

loader.mjs:

/*
- Our .pnp.loader.mjs uses the parentURL, to try to pickup .pnp.cjs which fails.
- It does not seem to use any of our env variables to set the yarn cache either.
- Pretend we are in /opt/nodejs/ so it picks up .pnp.cjs and the .yarn folder.
*/

const { default: pnp } = await import("/opt/nodejs/.pnp.cjs");
pnp.setup();

const fs = await import("node:fs");

const esmpnp = await import("/opt/nodejs/.pnp.loader.mjs");
export async function resolve(specifier, context, nextResolve) {
  const parentURL = context.parentURL?.replace("file:///var/task/", "file:///opt/nodejs/");
  const contexts = { ...context, parentURL };
  const resolve = await esmpnp.resolve(specifier, specifier.startsWith("./") ? context : contexts, nextResolve);
  // console.log({specifier, resolve});
  return resolve;
}

export async function load(urlString, context, nextLoad) {
  const load = await esmpnp.load(urlString, context, nextLoad);
  // Node ESM will wrap the CJS loader which does not use the monkey patched fs reader.
  if(load.format === 'commonjs') {
    if(urlString.startsWith("file:///opt/nodejs/.yarn/cache/")) {
      load.source = fs.readFileSync(urlString.slice("file://".length));
    }
  }
  return load;
}

To reproduce

Launch an ESM project putting the Yarn cache folder into a Layer Zip. This will wind up in /opt/nodejs/.yarn.

The ESM Loader will not work well with the /opt/nodejs folder as it is run from /var/task regardless of location. It may have a struggling finding the /opt/nodejs/.pnp.cjs file - which instead has to go into /var/task - but working around from that, the folder search path will still look in /var/task/.yarn and not /opt/nodejs/.yarn despite every attempt I could muster.

It's odd that yarn is not using any of the configuration variables for the .pnp.loader.mjs but with such a strict environment it's relatively unimportant as long as things just work.

Note in the exact same setup, the ".pnp.cjs" registration does work well for ".cjs" files.

          NODE_OPTIONS: -r /opt/nodejs/.pnp.cjs --loader /opt/nodejs/.pnp.loader.mjs
          # Switch to workaround
          # NODE_OPTIONS: --loader /var/task/hack.mjs
          YARN_CACHE_FOLDER: /opt/nodejs/.yarn/cache
          YARN_PNP_UNPLUGGED_FOLDER: /opt/nodejs/.yarn/unplugged
          YARN_VIRTUAL_FOLDER: /opt/nodejs/.yarn/cache
          YARN_INSTALL_STATE_PATH: /opt/nodejs/.yarn/install-state.gz
          YARN_PATCH_FOLDER: /opt/nodejs/.yarn/patches
          YARN_PNP_MODE: loose

Environment

This is using Yarn PNP 4.5.x  ESM and CJS loaders on the latest Lambda Node Image:

samcli/lambda-nodejs:20-x86_64-18095d962f12f8cd56fd3db91

Additional context

No response

Downchuck commented 2 days ago

EDIT: The following may be related to #5951 in handing back an empty source, but it seems intended to hand back an empty source to get picked back up from node's ESM CJS wrapper which should then go back into the .pnp.cjs zip handling but does not.

Updated the example loader to load CJS source as well; still broken if the CJS file has a require.

There seem to be some heavy caveats from Node.JS on how the loader should work for CJS: https://nodejs.org/api/module.html#loadurl-context-nextload

There also was discussion around challenges mixing CJS And ESM loader and registration here: https://github.com/nodejs/node/issues/47880

The thread shows as resolved -- but -- I don't think the registration applies when the entry-point is ESM.

It seems null and monkey patch should work, but are not, which may be related to how AWS SAM is loading things up... But that said, when providing source, resolve and other hooks should work, but perhaps they need register.


The .pnp.loader.mjs will still fall back into createCJSModuleWrap for sub-dependencies, which is in the node.js ESM loader and the file system patching doesn't seem to have taken. Populating the source attribute for commonjs files helped get through the following error:

{"errorType":"Error","errorMessage":"ENOTDIR: not a directory, open '/opt/nodejs/.yarn/cache/stream-chain-npm-3.2.0-44d181999c-03c71fbed6.zip/node_modules/stream-chain/src/index.js'","stackTrace":["Error: ENOTDIR: not a directory, open '/opt/nodejs/.yarn/cache/stream-chain-npm-3.2.0-44d181999c-03c71fbed6.zip/node_modules/stream-chain/src/index.js'","    at Object.openSync (node:fs:573:18)","    at readFileSync (node:fs:452:35)","    at getSourceSync (node:internal/modules/esm/load:85:14)","    at getSource (node:internal/modules/esm/translators:75:10)","    at createCJSModuleWrap (node:internal/modules/esm/translators:271:32)","    at ModuleLoader.commonjsStrategy (node:internal/modules/esm/translators:359:10)"],"errno":-20,"code":"ENOTDIR","syscall":"open","path":"/opt/nodejs/.yarn/cache/stream-chain-npm-3.2.0-44d181999c-03c71fbed6.zip/node_modules/stream-chain/src/index.js"}}

Unfortunately that was a temporary solution, as some CJS packages will have require("./file"); calls which are routed directly to the commonjs loader again in the Module._resolveFilename of the CJS loader. One would expect that this is already clobbered by pnp.setup(); appropriately, but I suspect the internal ModuleJob.run of node.js just isn't picking that up.

Downchuck commented 2 days ago

Ok this could have some redundancy, but I'm tired - so edits later:

This is using:

NODE_OPTIONS: -r /opt/nodejs/.pnp.cjs --loader /var/task/loader.mjs

Note -- the prepend has little to do with Yarn Berry and everything to do with getting in front of the AWS SAM loader.

Here is loader.mjs

/*
- Our .pnp.loader.mjs uses the parentURL, to try to pickup .pnp.cjs which fails.
- It does not seem to use any of our env variables to set the yarn cache either.
- Pretend we are in /opt/nodejs/ so it picks up .pnp.cjs and the .yarn folder.
*/
const esmpnp = await import("/opt/nodejs/.pnp.loader.mjs");

export async function resolve(specifier, context, nextResolve) {
  const parentURL = context.parentURL?.replace("file:///var/task/", "file:///opt/nodejs/");
  const contexts = { ...context, parentURL };
  const resolve = await esmpnp.resolve(specifier, specifier.startsWith("./") ? context : contexts, nextResolve);
  return resolve;
}

export async function load(urlString, context, nextLoad) {
  const load = await esmpnp.load(urlString, context, nextLoad);
  if(urlString === "file:///var/runtime/index.mjs") {
    const prepend = 'const { default: pnp } = await import("/opt/nodejs/.pnp.cjs"); pnp.setup(); const esmpnp = await import("/opt/nodejs/.pnp.loader.mjs");';   
    load.source = prepend + "\n" + load.source;
  }
  return load;
}