elastic / require-in-the-middle

Module to hook into the Node.js require function
MIT License

Use this module with bundled node modules (using webpack) #35

Closed mruoss closed 4 years ago

mruoss commented 4 years ago

Hi there

I'm coming over from DataDog/dd-trace-js#827, where I figured out that the issue actually lies in this library.

We are using webpack to bundle our application and all the required node modules into a single bundle. We have now installed dd-trace-js, which uses require-in-the-middle to hook up its plugins. However, for us require-in-the-middle does not work if the node modules are bundled. Is this a known issue? Is there a way to use require-in-the-middle with bundled node modules?

I worked around the issue using webpack-node-externals, but I would prefer to go back to including the node modules in the bundle.

Thanks, Michael

watson commented 4 years ago

Correct, this module doesn't work if all your modules have been bundled into a single JavaScript file. This module works as a hook into the require function in Node.js. When the require function isn't used (as is the case when you bundle all your modules), the hook will never fire.

To my knowledge, there's no way to really support this feature and it's not a bug. Normally this is never an issue as bundling of Node.js modules on the server-side isn't recommended in general.

Can I ask what you hope to achieve by bundling all your modules into one file?
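
The point about the hook never firing can be illustrated with a toy model. The sketch below does not use the real Node.js module API or anything from require-in-the-middle; `ToyLoader`, `bundledRequire`, and the sample registry are hypothetical names made up for illustration. It only shows that a hook installed on a loader sees nothing when the bundle reads its modules from a private table instead.

```typescript
// Toy model only: none of this is Node's real Module API.
type Exports = Record<string, unknown>
type Hook = (name: string, exports: Exports) => Exports

class ToyLoader {
  private hooks: Hook[] = []
  private readonly registry: Map<string, Exports>

  constructor(registry: Map<string, Exports>) {
    this.registry = registry
  }

  // Analogous to patching require: every lookup passes through the hooks.
  addHook(hook: Hook): void {
    this.hooks.push(hook)
  }

  require(name: string): Exports {
    const found = this.registry.get(name)
    if (!found) throw new Error(`Cannot find module '${name}'`)
    return this.hooks.reduce((exp, hook) => hook(name, exp), found)
  }
}

const loader = new ToyLoader(new Map<string, Exports>([['express', { kind: 'express' }]]))

const seen: string[] = []
loader.addHook((name, exports) => {
  seen.push(name)
  return { ...exports, instrumented: true }
})

// Unbundled: the application goes through the loader, so the hook fires.
const viaRequire = loader.require('express')

// Bundled: the module was inlined into a private table and is looked up by
// numeric id, so the loader (and its hooks) is never consulted.
const bundledModules: Exports[] = [{ kind: 'express' }]
function bundledRequire(id: number): Exports {
  const found = bundledModules[id]
  if (!found) throw new Error(`Unknown module id ${id}`)
  return found
}
const viaBundle = bundledRequire(0)
```

In this model `viaRequire` carries the `instrumented` flag and `viaBundle` does not, which mirrors the behaviour described above.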

mruoss commented 4 years ago

I see, thanks for the clarification. We're bundling all our modules in order to reduce our (docker image) build time. In that case I can close this issue: my question was answered, and it seems there will be no support anytime soon.

vitramir commented 4 years ago

reduce our (docker image) build time

Our team has the same reason. Reducing the final bundle size is also important for Docker images and for Lambdas.

@watson can you explain in more detail why you think it is impossible, please?

The compiled file has a __webpack_require__ function. I think we can replace it the same way you replace require.

watson commented 4 years ago

@vitramir I didn't know about __webpack_require__. If it's possible somehow to hook into that, then it might be possible to make this module work for this scenario as well. I don't have any time currently to dive into this, unfortunately, but if any of you would like to take a crack at it, I'd be happy to review ☺️

SergeNarhi commented 4 years ago

@watson @vitramir @mruoss I did some research and found that it is impossible to implement inside the library. Here the guys from Google mentioned similar results and explain why. Google's library is built on TypeScript but, in general, uses the same mechanism. The only way to add support for webpack is to create a plugin that wraps exports or __webpack_require__.

vitramir commented 4 years ago

@skanygin They only mentioned that it is impossible to support webpack with the same approach. Yes, I am also talking about some kind of plugin.

In the transpiled file, all require statements are replaced with __webpack_require__. At some point during execution, require-in-the-middle will try to replace the default require. We need to replace __webpack_require__ at that point instead of the default require. I made that work. My next issue is that webpack uses numeric indices as input to __webpack_require__, but require-in-the-middle needs the package name. So the hook executes for every require but doesn't inject any code. My next step will be to find a way to map webpack indices back to package names/paths.
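
The mapping problem described above can be sketched in isolation. This is only a toy, not the prototype's code and not webpack's real runtime: `toyWebpackRequire`, `idToName`, and `wrapRequire` are hypothetical names, and the id-to-name map is assumed to have been collected at build time.

```typescript
// Toy model only: a stand-in for a bundle whose modules are keyed by numeric id.
type ModuleExports = Record<string, unknown>
type OnRequire = (exports: ModuleExports, name: string) => ModuleExports

const bundled = new Map<number, ModuleExports>([
  [0, { name: 'app' }],
  [7, { name: 'pg' }],
])

function toyWebpackRequire(id: number): ModuleExports {
  const found = bundled.get(id)
  if (!found) throw new Error(`Unknown module id ${id}`)
  return found
}

// Assumed to be produced at build time: module id -> package name.
const idToName = new Map<number, string>([[7, 'pg']])

// Wrap the id-based require so the hook receives a package name, and only for
// the packages it asked for. Results are cached so each module is hooked once.
function wrapRequire(
  original: (id: number) => ModuleExports,
  names: Map<number, string>,
  wanted: Set<string>,
  onRequire: OnRequire
): (id: number) => ModuleExports {
  const cache = new Map<number, ModuleExports>()
  return (id) => {
    const cached = cache.get(id)
    if (cached) return cached
    const exports = original(id)
    const name = names.get(id)
    const result = name !== undefined && wanted.has(name) ? onRequire(exports, name) : exports
    cache.set(id, result)
    return result
  }
}

const patched: string[] = []
const hookedRequire = wrapRequire(toyWebpackRequire, idToName, new Set(['pg']), (exports, name) => {
  patched.push(name)
  return { ...exports, patched: true }
})

const pg = hookedRequire(7)
const pgAgain = hookedRequire(7)
const app = hookedRequire(0)
```

Without the `idToName` lookup the wrapper would still run for every id, but it would have no name to match against, which is the situation described above.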

vitramir commented 4 years ago

I finished the first prototype of a webpack plugin. It seems to work for my case.

https://github.com/vitramir/require-in-the-middle-webpack-example

There are 3 main steps:

  1. Collect paths of modules
  2. Update webpack runtime code
  3. Update require-in-the-middle source code

All steps are commented in code.

@watson Could you give some advice on the changes to the require-in-the-middle sources, please: https://github.com/vitramir/require-in-the-middle-webpack-example/blob/master/webpack.config.js#L112-L167

I would rather not make them from the plugin.

techmunk commented 3 years ago

For anyone interested in this, I've written a Webpack Plugin that can do this WITHOUT needing to patch or change code in require-in-the-middle.

What follows is TypeScript code, so make changes as needed for JS.

import { relative, sep } from 'path'
import { compilation, Compiler, Template } from 'webpack'

declare class CompilationModule extends compilation.Module {
  request?: string
  resource?: string
  rawRequest?: string
  external?: boolean
}

export class WebpackRequireInTheMiddlePlugin {
  public readonly name = 'WebpackRequireInTheMiddlePlugin'
  protected readonly modulesMap: Map<number | string | null, [string, boolean, string?]>
  protected readonly modules: string[]
  protected readonly internalModuleConditions: string[]
  protected addShims = true
  protected fsModuleId?: string | number | null
  protected resolveModuleId?: string | number | null
  protected moduleIds: Map<string, number | string | null | undefined>

  public constructor(modules?: string[], internalModules?: string[]) {
    this.modulesMap = new Map()
    this.moduleIds = new Map()
    this.modules = modules ?? []
    this.internalModuleConditions = internalModules ?? []
  }

  public apply(compiler: Compiler): void {
    compiler.hooks.compilation.tap(this.name, compilation => this.compilation(compilation))
  }

  protected compilation(compilation: compilation.Compilation): void {
    compilation.hooks.afterOptimizeModuleIds.tap(this.name, modules => this.mapModuleIds(modules))
    compilation.mainTemplate.hooks.localVars.tap(this.name, (source) => this.addLocalVarSources(source))
    compilation.mainTemplate.hooks.require.tap(this.name, (source) => this.addRequireSources(source))
  }

  protected getModuleName(filename?: string): string {
    if (filename) {
      const segments = filename.split(sep)
      const index = segments.lastIndexOf('node_modules')
      if (index !== -1 && segments[index + 1]) {
        return segments[index + 1][0] === '@' ? `${segments[index + 1]}/${segments[index + 2]}` : segments[index + 1]
      }
    }

    return ''
  }

  protected canSkipShimming(module: CompilationModule): boolean {
    if (module.external && module.request) {
      return this.internalModuleConditions.includes(module.request)
    }
    return false
  }

  protected includeModule(module: CompilationModule): boolean {
    const moduleName = this.getModuleName(module.resource)
    return this.modules.length === 0 || (moduleName !== '' && this.modules.includes(moduleName))
  }

  protected mapModuleIds(modules: CompilationModule[]): void {
    for (const module of modules) {
      if (this.canSkipShimming(module)) {
        break
      }
      if (!module.external && module.resource) {
        if (this.includeModule(module)) {
          this.modulesMap.set(module.id, [relative(`${process.cwd()}/node_modules`, module.resource), false])
          if (this.getModuleName(module.resource) === module.rawRequest) {
            this.moduleIds.set(module.rawRequest, module.id)
            // eslint-disable-next-line @typescript-eslint/no-var-requires
            const { version } = require(`${module.rawRequest}/package.json`)
            this.modulesMap.set(module.id, [relative(`${process.cwd()}/node_modules`, module.resource), false, version])
          }
        }
        if (module.resource.includes('resolve/index.js')) {
          this.resolveModuleId = module.id
        }
      }
      else if (module.request) {
        if (this.modules.includes(module.request)) {
          this.modulesMap.set(module.id, [module.request, true])
        }
        if (module.request === 'fs') {
          this.fsModuleId = module.id
        }
      }
    }
  }

  protected getRequireShim(): string[] {
    return [
      'const __ritm_require__ = __ritm_Module__.prototype.require',
      'const __ritm_require_shim__ = function (id) {',
      Template.indent([
        'return modules[id] ? __webpack_require__(id) : __ritm_require__.apply(this, arguments)'
      ]),
      '}',
      '__ritm_Module__.prototype.require = __ritm_require_shim__'
    ]
  }

  protected getResolveFilenameShim(): string[] {
    return [
      'const __ritm_resolve_filename__ = __ritm_Module__._resolveFilename',
      '__ritm_Module__._resolveFilename = function (id) {',
      Template.indent([
        'if (modules[id] && __ritm_modules_map__.has(id)) {',
        Template.indent([
          'const [filename, core] = __ritm_modules_map__.get(id)',
          // eslint-disable-next-line no-template-curly-in-string
          'return core ? filename : `${process.cwd()}${sep}node_modules${sep}${filename}`'
        ]),
        '}',
        'return __ritm_resolve_filename__.apply(this, arguments)'
      ]),
      '}'
    ]
  }

  protected addLocalVarSources(source: string): string {
    return !this.addShims ? source : Template.asString([
      source,
      'const { sep } = require("path")',
      `const __ritm_modules_map__ = new Map(${JSON.stringify(Array.from(this.modulesMap.entries()), null, 2)})`,
      `const __ritm_module_ids_map__ = new Map(${JSON.stringify(Array.from(this.moduleIds.entries()), null, 2)})`,
      'const __ritm_Module__ = module.require("module")',
      ...this.getRequireShim(),
      ...this.getResolveFilenameShim(),
      'const __ritm_shimmed__ = {}'
    ])
  }

  protected getFsShim(): string[] {
    if (this.fsModuleId) {
      return [
        `const __ritm_fs_readFileSync__ = __webpack_require__(${this.fsModuleId}).readFileSync`,
        `installedModules[${this.fsModuleId}].exports.readFileSync = function(path) {`,
        Template.indent([
          'const [module, file] = path.split(sep).slice(-2)',
          'if (file === "package.json" && __ritm_module_ids_map__.has(module)) {',
          Template.indent([
            'const version = __ritm_modules_map__.get(__ritm_module_ids_map__.get(module)).slice(-1)',
            // eslint-disable-next-line no-template-curly-in-string
            'return `{"version": "${version}"}`'
          ]),
          '}',
          'return __ritm_fs_readFileSync__.apply(this, arguments)'
        ]),
        '}'
      ]
    }
    return []
  }

  protected getResolveModuleShim(): string[] {
    if (this.resolveModuleId) {
      return [
        // resolve exports its async resolver with the sync variant attached as `.sync`, so keep a reference to `.sync`
        `const __ritm_resolve_sync__ = __webpack_require__(${this.resolveModuleId}).sync`,
        `installedModules[${this.resolveModuleId}].exports.sync = function(name) {`,
        Template.indent([
          'if (__ritm_module_ids_map__.has(name)) {',
          Template.indent([
            'const [filename, core] = __ritm_modules_map__.get(__ritm_module_ids_map__.get(name))',
            // eslint-disable-next-line no-template-curly-in-string
            'return core ? filename : `${process.cwd()}${sep}node_modules${sep}${filename}`'
          ]),
          '}',
          'return __ritm_resolve_sync__.apply(this, arguments)'
        ]),
        '}'
      ]
    }
    return []
  }

  protected getRequireResolveShim(): string[] {
    return [
      'const __ritm_require_resolve__ = require.resolve',
      'require.resolve = function(name) {',
      Template.indent([
        'if (__ritm_module_ids_map__.has(name)) {',
        Template.indent([
          'const [filename, core] = __ritm_modules_map__.get(__ritm_module_ids_map__.get(name))',
          // eslint-disable-next-line no-template-curly-in-string
          'return core ? filename : `${process.cwd()}${sep}node_modules${sep}${filename}`'
        ]),
        '}',
        'return __ritm_require_resolve__.apply(this, arguments)'
      ]),
      '}'
    ]
  }

  protected getShims(): string[] {
    return [
      ...this.getFsShim(),
      ...this.getResolveModuleShim(),
      ...this.getRequireResolveShim()
    ]
  }

  protected getResetShims(): string[] {
    let reset: string[] = []
    if (this.fsModuleId) {
      reset = [
        ...reset,
        `installedModules[${this.fsModuleId}].exports.readFileSync = __ritm_fs_readFileSync__`
      ]
    }
    if (this.resolveModuleId) {
      reset = [
        ...reset,
        // restore the property that was shimmed (`sync`, not `readFileSync`)
        `installedModules[${this.resolveModuleId}].exports.sync = __ritm_resolve_sync__`
      ]
    }

    return reset
  }

  protected addRequireSources(source: string): string {
    return !this.addShims ? source : Template.asString([
      'if (__ritm_Module__.prototype.require !== __ritm_require_shim__ && !__ritm_shimmed__[moduleId]) {',
      Template.indent([
        '__ritm_shimmed__[moduleId] = true',
        'if (__ritm_modules_map__.has(moduleId)) {',
        Template.indent([
          ...this.getShims(),
          'const exports = __ritm_Module__.prototype.require(moduleId)',
          'installedModules[moduleId].exports = exports',
          ...this.getResetShims()
        ]),
        '}'
      ]),
      '}',
      source
    ])
  }
}

To use:

    const modules = [
      'apollo-server-core',
      'bluebird',
      'cassandra-driver',
      'elasticsearch',
      'express',
      'express-graphql',
      'express-queue',
      'fastify',
      'finalhandler',
      'generic-pool',
      'graphql',
      'handlebars',
      'hapi',
      '@hapi/hapi',
      'http',
      'https',
      'http2',
      'ioredis',
      'jade',
      'knex',
      'koa',
      'koa-router',
      '@koa/router',
      'memcached',
      'mimic-response',
      'mongodb-core',
      'mongodb',
      'mysql',
      'mysql2',
      'pg',
      'pug',
      'redis',
      'restify',
      'tedious',
      'ws'
    ]

    this.config.plugins?.push(new WebpackRequireInTheMiddlePlugin(modules))

The list of modules above is taken straight from elastic-apm-node. If elastic-apm-node exported the list, you would probably not even need to pass it!

This is tested against webpack 4, as part of an effort to get elastic-apm-node to work with a fully bundled webpack build.
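
The plugin above keys its maps on package names derived from file paths (its `getModuleName` method). As a rough standalone illustration of that derivation, here is a simplified variant. `packageNameFromPath` is a hypothetical name, it takes the separator as a parameter instead of using `path.sep`, and unlike the method above it returns an empty string for a scoped path with no package segment.

```typescript
// Simplified sketch of deriving a package name from a resolved file path.
// Takes the segment after the LAST node_modules directory, plus one more
// segment when that segment is a scope such as "@koa".
function packageNameFromPath(filename: string, separator: string = '/'): string {
  const segments = filename.split(separator)
  const index = segments.lastIndexOf('node_modules')
  if (index === -1) return ''
  const first = segments[index + 1]
  if (!first) return ''
  if (first.startsWith('@')) {
    const second = segments[index + 2]
    return second ? `${first}/${second}` : ''
  }
  return first
}

const plain = packageNameFromPath('/app/node_modules/express/lib/router/index.js')
const scoped = packageNameFromPath('/app/node_modules/@koa/router/lib/router.js')
const nested = packageNameFromPath('/app/node_modules/a/node_modules/b/index.js')
const local = packageNameFromPath('/app/src/index.js')
```

Here `plain` is `'express'`, `scoped` is `'@koa/router'`, `nested` is `'b'` (the innermost package), and `local` is the empty string because the path is not under node_modules.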

techmunk commented 3 years ago

I should also point out that the entry looks like the following:

    { server: ['source-map-support/register', 'elastic-apm-node/start', './index'] }

We configure elastic APM via environment variables.

techmunk commented 3 years ago

I had not properly tested when using a built bundle in a directory without node_modules. I've updated my comment above with a version that will work when there is no node_modules directory present. Part of the plugin is really only for supporting elastic-apm-nodejs.

knvpk commented 2 years ago

Hi @vitramir, are you still using the code snippet you provided? I'm facing the same issue with opentelemetry (which uses require-in-the-middle), so I added your snippet, and I'm getting an error that compilation.hooks.succeedModule.tap is deprecated.

knvpk commented 2 years ago

Hi @techmunk, I also tried your solution, but I'm getting "modules not defined".

Chiorufarewerin commented 2 years ago

Same issue with opentelemetry and angular

tlhunter commented 1 year ago

I suspect @techmunk's solution in https://github.com/elastic/require-in-the-middle/issues/35#issuecomment-710807619 might have been written for Webpack v4 due to the incompatibility with today's Webpack (or possibly it relied upon non-public variables). I, too, am unable to get it to work anymore.

sibelius commented 8 months ago

Does anybody have a version of this for webpack 5?