apify / got-scraping

HTTP client made for scraping based on got.

Cannot find module 'got-scraping' with Jest #125

Open Tapokj opened 6 months ago

Tapokj commented 6 months ago

Hello! I encountered the following error when using Jest and TypeScript: Cannot find module 'got-scraping' from 'src/shared/extractor/html-loader.ts'

As you can see, the module works well in the application, but inside Jest it causes a lot of pain. My version of got-scraping in package.json: "got-scraping": "^4.0.3"

All ESM modules inside the project work fine, so I doubt the configuration is wrong. I also tried mapping the module for Jest using the moduleNameMapper option, but it wasn't successful.

Jest version as well: "jest": "^29.7.0"

My jest config:

const config: Config = {
    extensionsToTreatAsEsm: ['.ts'],
    moduleNameMapper: {
        '^(\\.{1,2}/.*)\\.js$': '$1',
    },
    transform: {
        // '^.+\\.[tj]sx?$' to process js/ts with `ts-jest`
        // '^.+\\.m?[tj]sx?$' to process js/ts/mjs/mts with `ts-jest`
        '^.+\\.ts?$': [
            'ts-jest',
            {
                useESM: true,
            },
        ],
    },
    // All imported modules in your tests should be mocked automatically
    // automock: false,

    // Stop running tests after `n` failures
    // bail: 0,

    // The directory where Jest should store its cached dependency information
    // cacheDirectory: "/private/var/folders/_1/89rmjljj5_q7myblhj59727r0000gn/T/jest_dx",

    // Automatically clear mock calls, instances, contexts and results before every test
    clearMocks: true,

    // Indicates whether the coverage information should be collected while executing the test
    collectCoverage: true,

    // An array of glob patterns indicating a set of files for which coverage information should be collected
    // collectCoverageFrom: undefined,

    // The directory where Jest should output its coverage files
    coverageDirectory: 'coverage',

    // An array of regexp pattern strings used to skip coverage collection
    // coveragePathIgnorePatterns: [
    //   "/node_modules/"
    // ],

    // Indicates which provider should be used to instrument code for coverage
    coverageProvider: 'v8',

    // A list of reporter names that Jest uses when writing coverage reports
    // coverageReporters: [
    //   "json",
    //   "text",
    //   "lcov",
    //   "clover"
    // ],

    // An object that configures minimum threshold enforcement for coverage results
    // coverageThreshold: undefined,

    // A path to a custom dependency extractor
    // dependencyExtractor: undefined,

    // Make calling deprecated APIs throw helpful error messages
    // errorOnDeprecated: false,

    // The default configuration for fake timers
    // fakeTimers: {
    //   "enableGlobally": false
    // },

    // Force coverage collection from ignored files using an array of glob patterns
    // forceCoverageMatch: [],

    // A path to a module which exports an async function that is triggered once before all test suites
    // globalSetup: undefined,

    // A path to a module which exports an async function that is triggered once after all test suites
    // globalTeardown: undefined,

    // A set of global variables that need to be available in all test environments

    // The maximum amount of workers used to run your tests. Can be specified as % or a number. E.g. maxWorkers: 10% will use 10% of your CPU amount + 1 as the maximum worker number. maxWorkers: 2 will use a maximum of 2 workers.
    // maxWorkers: "50%",

    // An array of directory names to be searched recursively up from the requiring module's location
    // moduleDirectories: [
    //   "node_modules"
    // ],

    // An array of file extensions your modules use
    // moduleFileExtensions: [
    //   "js",
    //   "mjs",
    //   "cjs",
    //   "jsx",
    //   "ts",
    //   "tsx",
    //   "json",
    //   "node"
    // ],

    // A map from regular expressions to module names or to arrays of module names that allow to stub out resources with a single module
    // moduleNameMapper: {},

    // An array of regexp pattern strings, matched against all module paths before considered 'visible' to the module loader
    // modulePathIgnorePatterns: [],

    // Activates notifications for test results
    // notify: false,

    // An enum that specifies notification mode. Requires { notify: true }
    // notifyMode: "failure-change",

    // A preset that is used as a base for Jest's configuration
    preset: 'ts-jest',

    // Run tests from one or more projects
    // projects: undefined,

    // Use this configuration option to add custom reporters to Jest
    // reporters: undefined,

    // Automatically reset mock state before every test
    // resetMocks: false,

    // Reset the module registry before running each individual test
    // resetModules: false,

    // A path to a custom resolver
    // resolver: undefined,

    // Automatically restore mock state and implementation before every test
    // restoreMocks: false,

    // The root directory that Jest should scan for tests and modules within
    // rootDir: undefined,

    // A list of paths to directories that Jest should use to search for files in
    // roots: [
    //   "<rootDir>"
    // ],

    // Allows you to use a custom runner instead of Jest's default test runner
    // runner: "jest-runner",

    // The paths to modules that run some code to configure or set up the testing environment before each test
    setupFiles: ['dotenv/config'],

    // A list of paths to modules that run some code to configure or set up the testing framework before each test
    // setupFilesAfterEnv: [],

    // The number of seconds after which a test is considered as slow and reported as such in the results.
    // slowTestThreshold: 5,

    // A list of paths to snapshot serializer modules Jest should use for snapshot testing
    // snapshotSerializers: [],

    // The test environment that will be used for testing
    testEnvironment: 'node',

    // Options that will be passed to the testEnvironment
    // testEnvironmentOptions: {},

    // Adds a location field to test results
    // testLocationInResults: false,

    // The glob patterns Jest uses to detect test files
    // testMatch: [
    //   "**/__tests__/**/*.[jt]s?(x)",
    //   "**/?(*.)+(spec|test).[tj]s?(x)"
    // ],

    // An array of regexp pattern strings that are matched against all test paths, matched tests are skipped
    // testPathIgnorePatterns: [
    //   "/node_modules/"
    // ],

    // The regexp pattern or array of patterns that Jest uses to detect test files
    // testRegex: [],

    // This option allows the use of a custom results processor
    // testResultsProcessor: undefined,

    // This option allows use of a custom test runner
    // testRunner: "jest-circus/runner",

    // A map from regular expressions to paths to transformers
    // transform: undefined,

    // An array of regexp pattern strings that are matched against all source file paths, matched files will skip transformation
    // transformIgnorePatterns: [
    //   "/node_modules/",
    //   "\\.pnp\\.[^\\/]+$"
    // ],

    // An array of regexp pattern strings that are matched against all modules before the module loader will automatically return a mock for them
    // unmockedModulePathPatterns: undefined,

    // Indicates whether each individual test should be reported during the run
    // verbose: undefined,

    // An array of regexp patterns that are matched against all source file paths before re-running tests in watch mode
    // watchPathIgnorePatterns: [],

    // Whether to use watchman for file crawling
    // watchman: true,
}

Cheers!

FranciscoPinho commented 5 months ago

Would love a fix for this as well

vladfrangu commented 5 months ago

Hey! Out of curiosity, do you have an example project or repository we could clone to take a look at this? 🙏

Tsopic commented 5 months ago

Having a similar issue:

node:internal/modules/cjs/loader:598
      throw e;
      ^

Error [ERR_PACKAGE_PATH_NOT_EXPORTED]: No "exports" main defined in /home/circleci/repo/node_modules/got-scraping/package.json

vladfrangu commented 5 months ago

Hey @Tsopic, I'm not sure that's the same error as the one reported in this thread. The module is now ESM-only (read more about what that implies in Sindre's gist: https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c).

@FranciscoPinho @Tapokj poking again! I would love a reproduction sample so I can investigate and let you know the next steps!

Tsopic commented 5 months ago

I had the same error earlier.

I added these lines to jest.config.ts:

  moduleNameMapper: {
    "got-scraping": "<rootDir>/node_modules/got-scraping/dist/index.js",
  },

But it still complains about the error above, so it's probably the same error.

Tsopic commented 5 months ago

For further context: with the jest.config.ts in place, the error happens after building the project into CommonJS. The build succeeds, but running the code fails to find the module.

The reason seems to be that the import is compiled into the disallowed require style: const gotScraping = require("got-scraping")

I'm using Node v20.

I did not have the issue with the earlier version of Node. I did a bunch of package upgrades and a Node base version upgrade, and now this seems to be one of the few packages with this issue.

vladfrangu commented 5 months ago

got-scraping, like its core dependency got, is now ESM-only. That means you cannot use require to import the module; you need to either use ESM or use await import('got-scraping') in an async context (see https://gist.github.com/sindresorhus/a39789f98801d908bbc7ff3ecc99d99c) 👀

If that's an issue with your current setup, you'll need to stick to versions before 4.x, but those versions are not getting new updates.
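
As a rough sketch of the await import('got-scraping') route, the helper below wraps a dynamic import and caches the resulting promise so the module is loaded only once. The names lazyImport, GotScrapingModule, and fetchBody are hypothetical and introduced only for illustration, and the module type is deliberately narrowed to a single call shape rather than got-scraping's full typings. Keep in mind that TypeScript with module set to commonjs rewrites import() into a require call, so this is only likely to help with a setting such as node16 or nodenext that preserves dynamic imports.

```typescript
// Hypothetical helper: lazily load an ESM-only package and reuse the same promise.
function lazyImport<T>(specifier: string): () => Promise<T> {
    let cached: Promise<T> | undefined;
    return () => {
        if (cached === undefined) {
            // Dynamic import() is the documented way to reach ESM from an async context.
            cached = import(specifier) as Promise<T>;
        }
        return cached;
    };
}

// Assumption: a simplified view of the module, not its real type definitions.
interface GotScrapingModule {
    gotScraping: (options: { url: string }) => Promise<{ body: string }>;
}

const loadGotScraping = lazyImport<GotScrapingModule>('got-scraping');

// Hypothetical usage: the package is only resolved when fetchBody is first called.
async function fetchBody(url: string): Promise<string> {
    const { gotScraping } = await loadGotScraping();
    const response = await gotScraping({ url });
    return response.body;
}
```

Because the import happens inside the returned function, merely defining loadGotScraping does not touch the package; a resolution failure would surface only on the first call.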

dotspencer commented 4 months ago

Here's an example repo you can clone: https://github.com/dotspencer/jest-got-scraping/tree/ts

The main branch is js and works.

But if you jump to the ts branch, it fails in this way.

Test suite failed to run

Cannot find module 'got-scraping' from 'src/test.ts'

> 1 | import { gotScraping } from 'got-scraping';