nodejs / import-in-the-middle

Like `require-in-the-middle`, but for ESM import
https://www.npmjs.com/package/import-in-the-middle
Apache License 2.0
72 stars 27 forks source link

Fails to resolve `exports` defined submodule when reexported by another package #63

Closed isaacs closed 5 months ago

isaacs commented 10 months ago

node_modules/has-submodule/package.json

{
  "name": "has-submodule",
  "type": "module",
  "exports": {
    "./sub": "./sub.js"
  }
}

node_modules/has-submodule/sub.js

export const foo = 'bar'
console.log('loaded submodule')
cat: node_modules/reexport-submodule/sub.js: No such file or directory

node_modules/reexport-submodule/package.json

{
  "name": "reexport-submodule",
  "exports": {
    ".": "./index.js"
  },
  "type": "module"
}

node_modules/reexport-submodule/index.js

export * from 'has-submodule/sub'

load-sub.mjs

import { foo } from 'reexport-submodule'
console.log(foo)

esm-loader.mjs

export * from 'import-in-the-middle/hook.mjs'

Expected Behavior

$ node load-sub.mjs
loaded submodule
bar
$ node --loader=./esm-loader.mjs load-sub.mjs
loaded submodule
bar

Actual Behavior

$ node --loader=./esm-loader.mjs load-sub.mjs
(node:36249) ExperimentalWarning: `--experimental-loader` may be removed in the future; instead use `register()`:
--import 'data:text/javascript,import { register } from "node:module"; import { pathToFileURL } from "node:url"; register("./esm-loader.mjs", pathToFileURL("./"));'
(Use `node --trace-warnings ...` to show where the warning was created)

node:internal/process/esm_loader:40
      internalBinding('errors').triggerUncaughtException(
                                ^
[Error: ENOENT: no such file or directory, open '/Users/isaacs/dev/tapjs/esm-tap-repro/node_modules/reexport-submodule/has-submodule/sub'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/Users/isaacs/dev/tapjs/esm-tap-repro/node_modules/reexport-submodule/has-submodule/sub'
}

Node.js v20.9.0

Steps to Reproduce the Problem

Shown above.

jsumners-nr commented 10 months ago

At https://github.com/DataDog/import-in-the-middle/blob/c3c2c52c1915b47994af59d507c59029c1f1fae9/hook.js#L156-L157

When modFile is has-submodule/sub, and srcUrl is file:///private/tmp/29/tap-issue/node_modules/reexport-submodule/index.js, then modUrl is defined as file:///private/tmp/29/tap-issue/node_modules/reexport-submodule/has-submodule/sub.

meyer9 commented 9 months ago

Similar issue trying to import langchain:

modUrl: file:///Users/meyer9/development/repro/node_modules/langchain/dist/embeddings/@langchain/core/embeddings
modFile: @langchain/core/embeddings
normalizedModName: embeddings
srcUrl: file:///Users/meyer9/development/repro/node_modules/langchain/dist/embeddings/base.js
node:internal/process/esm_loader:40
      internalBinding('errors').triggerUncaughtException(
                                ^
[Error: ENOENT: no such file or directory, open '/Users/meyer9/development/repro/node_modules/langchain/dist/embeddings/@langchain/core/embeddings'] {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/Users/meyer9/development/repro/node_modules/langchain/dist/embeddings/@langchain/core/embeddings'
}

The line in question:

export * from "@langchain/core/embeddings";

Seems like if it's an external package, this logic won't work either. Could this be used instead? https://nodejs.org/api/esm.html#importmetaresolvespecifier

Edit: oh, I see you can't do that because import-in-the-middle isn't type module.

I tried adding require resolution here and it sorta worked, but I got another issue with a duplicate identifier.

      if (!modFile.startsWith('.')) {
        modFile = require.resolve(modFile, {
          paths: [new URL(srcUrl).pathname]
        })
      }

Oh, looks like someone had the exact same thought process: https://github.com/DataDog/import-in-the-middle/issues/60. Looks like #59 could be a dupe.

dawnmist commented 9 months ago

62 is also potentially a dupe (with repro), displaying the same issue of re-exports of another module being incorrectly treated as subdirectories/files within the module doing the re-export.

ida613 commented 8 months ago

Not able to reproduce this issue, can you confirm the IITM version you are using?

khanayan123 commented 8 months ago

I was able to reproduce the issue on master, looking into it

chentsulin commented 6 months ago

@khanayan123 Is there any update? I'd like to help if there are some failed tests can be investigated.

timfish commented 6 months ago

I've been looking into this with the following reliable reproduction using Node v22.2.0:

import { register } from 'node:module';

register('import-in-the-middle/hook.mjs', import.meta.url);
await import ('@discordjs/builders');

Which results in:

node:internal/modules/run_main:125
    triggerUncaughtException(
    ^
Error: ENOENT: no such file or directory, open '/Users/tim/Documents/Repositories/repro/node_modules/@discordjs/builders/dist/@discordjs/formatters'
    at async open (node:internal/fs/promises:638:25)
    at async readFile (node:internal/fs/promises:1251:14)
    at async getSource (node:internal/modules/esm/load:46:14)
    at async defaultLoad (node:internal/modules/esm/load:137:34)
    at async nextLoad (node:internal/modules/esm/hooks:750:22)
    at async getExports (/Users/tim/Documents/Repositories/repro/node_modules/import-in-the-middle/lib/get-exports.js:64:21)
    at async processModule (/Users/tim/Documents/Repositories/repro/node_modules/import-in-the-middle/hook.js:136:23)
    at async processModule (/Users/tim/Documents/Repositories/repro/node_modules/import-in-the-middle/hook.js:164:20)
    at async getSource (/Users/tim/Documents/Repositories/repro/node_modules/import-in-the-middle/hook.js:274:60)
    at async load (/Users/tim/Documents/Repositories/repro/node_modules/import-in-the-middle/hook.js:340:26) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/Users/tim/Documents/Repositories/repro/node_modules/@discordjs/builders/dist/@discordjs/formatters'
}

It appears that this code expects the module it's loading to be relative to the current file: https://github.com/DataDog/import-in-the-middle/blob/00b01fff1f5b69dd25e307593ec54d1d8abb4844/hook.js#L154-L166

My proposal is that if it's a bare specifier, we call parentResolve to resolve the package URL. Using resolve is better than manually trying to find the correct files since there might be multiple levels of node_modules directories that need traversing and we need to respect package.json#exports when we find the package!

I'll attempt a PR!

timfish commented 6 months ago

This should have been resolved by #78