nodejs / corepack

Zero-runtime-dependency package acting as bridge between Node projects and their package managers
MIT License

Frequent fetch() errors #458

Open xconverge opened 2 months ago

xconverge commented 2 months ago

I am hitting this error a lot, intermittently, during CI for my projects. If I rerun CI, it is all fine. I have looked at the troubleshooting section referenced in the error, but I am not behind a proxy, so I am not sure what else it could be. My suspicion is some sort of load-related timeout or per-IP rate limiting by npm. It could be an intermittent internet connection too, but without seeing the reason for the exception it is hard for me to debug what is actually happening.

Internal Error: Error when performing the request to https://registry.npmjs.org/yarn; for troubleshooting help, see https://github.com/nodejs/corepack#troubleshooting
    at fetch (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:22882:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async fetchAsJson (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:22896:20)
    at async fetchLatestStableVersion (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:22948:20)
    at async fetchLatestStableVersion2 (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:22971:14)
    at async Engine.getDefaultVersion (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:23349:25)
    at async executePackageManagerRequest (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:24207:28)
    at async BinaryCommand.validateAndExecute (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:21173:22)
    at async _Cli.run (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:22148:18)
    at async Object.runMain (/usr/local/lib/node_modules/corepack/dist/lib/corepack.cjs:24279:12)

I see there is some data in the throw here: https://github.com/nodejs/corepack/blob/142319056424b1e0da2bdbe801c52c5910023707/sources/httpUtils.ts#L47-L50

Is there a way for me to see the contents of this easily? I am not sure where it ends up. If not (I doubt this will be the answer), could we perhaps move the err into the message text instead?
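To illustrate the request above, here is a minimal sketch of flattening an error and its nested `cause` values into a single line of text. The helper names `readProp` and `describeErrorChain` are hypothetical and are not part of corepack; this only shows one way the underlying reason could be surfaced in the message itself.

```typescript
// Hypothetical helpers for illustration only; not corepack code.

// Reads a property from an arbitrary value without assuming its type.
function readProp(value: unknown, key: string): unknown {
  return typeof value === "object" && value !== null
    ? (value as Record<string, unknown>)[key]
    : undefined;
}

// Joins the message of an error with the messages of its nested causes.
// maxDepth guards against accidental cycles in the cause chain.
function describeErrorChain(err: unknown, maxDepth = 10): string {
  const parts: string[] = [];
  let current: unknown = err;
  for (let depth = 0; current != null && depth < maxDepth; depth++) {
    if (!(current instanceof Error)) {
      // A non-Error cause (string, number, ...) ends the chain.
      parts.push(String(current));
      break;
    }
    const code = readProp(current, "code");
    const suffix = typeof code === "string" ? ` [${code}]` : "";
    parts.push(`${current.name}: ${current.message}${suffix}`);
    current = readProp(current, "cause");
  }
  return parts.join(" <- caused by: ");
}
```

Calling `describeErrorChain(err)` in a catch block and logging the result would put the innermost reason (for example a timeout code) on the same line as the top-level message.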

xconverge commented 2 months ago

So far I have tried:

  1. Changing from the ISP-provided DNS server to Cloudflare or Google DNS on the server
  2. export NODE_OPTIONS="--dns-result-order=ipv4first --no-network-family-autoselection"

With no change in behavior.

Not sure I have any other knobs to turn on my end...

I will update to the latest https://github.com/nodejs/corepack/releases/tag/v0.27.0 and see if that helps, there are a few optimizations there that might help ( fix: download fewer metadata from npm registry #436 )

xconverge commented 2 months ago

I still experience this issue with the newly released 0.27.0

xconverge commented 2 months ago

I will be reverting to 0.24.0 (pre fetch() being used) for the time being :/

I wish I knew which one of these errors it was (a timeout, probably?). It would be great if there were a way to set some of these timeouts: https://undici.nodejs.org/#/docs/api/Errors?id=errors

merceyz commented 2 months ago

https://github.com/nodejs/corepack/pull/430 should start printing the cause property of that error, so if you can re-test once that is released, it should provide the missing information.

xconverge commented 2 months ago

Give me a few days to get back to you. I have been running 0.28.0 a lot to try to get the reason/cause and haven't seen the issue at all since...

I will give it a few days and then step back to 0.27.0 (it would be wild if this made a difference, looking at just the change related to clipanion...) or 0.26.0, just to confirm that the issue is still reproducible-ish on my setup. There is also a chance it was ISP-related and is no longer a problem for me.

Just wanted to confirm that I am on it and will follow up. It will take some time to reach anything conclusive, but as of now, on 0.28.0, things are looking better, unfortunately? 😆

xconverge commented 2 months ago

I don't really know what to make of it; I haven't had an issue since my previous posts. I switched to 0.27.0 for a bit and still didn't have an issue. This will just have to remain a mystery until someone else encounters the issue, at which point we should see the cause. For now I will stay on 0.28.0 or the latest and report back if anything ever happens.

guyca commented 1 month ago

FWIW I encountered this issue while running GitHub Actions on Node 16. Switching to Node 18 resolved the issue.

xconverge commented 1 month ago

Saw a handful of CI jobs fail today. It is kind of a normal error when the system/server/network is a bit overloaded, but it is still a pretty inconvenient situation: as far as I can tell, the timeouts are not currently overridable with ENV vars (for corepack to use when creating the undici fetch request, agent, and dispatcher) or anything else.

For what it's worth, I experienced this back when I switched from axios -> undici fetch on several of my projects, because axios has no timeouts by default while fetch has sane defaults. https://github.com/nodejs/corepack/pull/365 may have introduced a similar change, going from https -> undici fetch, with timeouts changing significantly.

If I had my choice, I would override the defaults to just double them (since it is just CI for me) and move on for now. I don't think changing the corepack default fetch parameters makes sense for everyone/this repository/corepack itself for this particular situation. Increasing them by default in corepack could be beneficial, though, without many downsides.
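Since the timeouts cannot currently be overridden, one caller-side workaround for transient failures like this is to retry the failing step with a growing delay. The following is only a sketch under stated assumptions: `retryWithBackoff`, its parameters, and its defaults are hypothetical names chosen for illustration, and nothing here is a corepack or undici API.

```typescript
// Hypothetical wrapper for illustration; not a corepack or undici API.
// Runs `operation`, retrying with exponential backoff while `isRetryable`
// returns true for the thrown error, up to `maxAttempts` attempts in total.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      // Give up on the last attempt or on errors that are not transient.
      if (attempt === maxAttempts || !isRetryable(err)) break;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

The operation is passed in as a function so the same wrapper can be used around any async call, and the predicate keeps non-transient errors (for example a 404) from being retried pointlessly.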

/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22534
    throw new Error(
          ^
Error: Error when performing the request to https://registry.npmjs.org/yarn/latest; for troubleshooting help, see https://github.com/nodejs/corepack#troubleshooting
    at fetch (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22534:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async fetchAsJson (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22548:20)
    ... 4 lines matching cause stack trace ...
    at async Object.runMain (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:24007:5) {
  [cause]: TypeError: fetch failed
      at node:internal/deps/undici/undici:12502:13
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      at async fetch (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22528:16)
      at async fetchAsJson (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22548:20)
      at async fetchLatestStableVersion (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22475:20)
      at async fetchLatestStableVersion2 (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:22598:14)
      at async Engine.getDefaultVersion (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:23208:23)
      at async Engine.executePackageManagerRequest (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:23300:47)
      at async Object.runMain (/usr/lib/node_modules/corepack/dist/lib/corepack.cjs:24007:5) {
    [cause]: ConnectTimeoutError: Connect Timeout Error
        at onConnectTimeout (node:internal/deps/undici/undici:6635:28)
        at node:internal/deps/undici/undici:6587:50
        at Immediate._onImmediate (node:internal/deps/undici/undici:6619:13)
        at process.processImmediate (node:internal/timers:478:21) {
      code: 'UND_ERR_CONNECT_TIMEOUT'
    }
  }
}
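The trace above nests the real reason two levels deep: the corepack error wraps `TypeError: fetch failed`, which in turn wraps the error carrying `code: 'UND_ERR_CONNECT_TIMEOUT'`. A minimal sketch of detecting that case programmatically follows; `findErrorCode` and `isConnectTimeout` are hypothetical helper names, not part of corepack or undici, and only the code string itself is taken from the trace.

```typescript
// Hypothetical helpers for illustration; not corepack or undici APIs.

// Walks the `cause` chain and returns the first string `code` it finds.
// maxDepth guards against accidental cycles in the chain.
function findErrorCode(err: unknown, maxDepth = 10): string | undefined {
  let current: unknown = err;
  for (
    let depth = 0;
    depth < maxDepth && typeof current === "object" && current !== null;
    depth++
  ) {
    const record = current as Record<string, unknown>;
    const code = record["code"];
    if (typeof code === "string") return code;
    current = record["cause"];
  }
  return undefined;
}

// True when the chain contains the connect-timeout code seen in the trace.
function isConnectTimeout(err: unknown): boolean {
  return findErrorCode(err) === "UND_ERR_CONNECT_TIMEOUT";
}
```

A check like this could be used to decide whether a failure is a transient network problem worth retrying or something that needs a real fix.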