nodejs / undici

An HTTP/1.1 client, written from scratch for Node.js
https://nodejs.github.io/undici
MIT License

TypeError: fetch failed - on Node v20.11.1 and v21.7.1, but works on v18.19.1 - likely issue with resolving redirect URL or IPv6 #2990

Closed gregonarash closed 8 months ago

gregonarash commented 8 months ago

Bug Description

Making a fetch request fails with a TypeError whose cause is an AggregateError containing an array of connection errors.

This error first showed up when it broke my Next.js project after upgrading to Next v14. Digging through Next.js issues related to TypeError: fetch failed, I found that it traces back to undici.

Reproducible By

Running the command below (on Node v20.11.1 and v21.7.1)

node -e "fetch('https://airtable.com').then(res => console.log(res.status))"

results in TypeError: fetch failed

Expected Behavior

Expected result is 200. For comparison:

1) Running the command below

curl -s -o /dev/null -w "%{http_code}\n" https://airtable.com/

results in 301 (cURL does not follow the redirect unless -L is passed, whereas fetch follows it by default).

2) Running the command below (on Node v18.19.1)

node -e "fetch('https://airtable.com').then(res => console.log(res.status))"

results in 200.

3) Fetching the resolved URL (with www.) on Node v20.11.1 and v21.7.1:

node -e "fetch('https://www.airtable.com').then(res => console.log(res.status))"

results in 200.

Originally I was accessing the Airtable API, but I was able to reproduce the error with just the homepage.

Logs & Screenshots

(main)$ node -e "fetch('https://airtable.com').then(res => console.log(res.status))"
node:internal/deps/undici/undici:12345
    Error.captureStackTrace(err, this);
          ^

TypeError: fetch failed
    at node:internal/deps/undici/undici:12345:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: AggregateError
      at internalConnectMultiple (node:net:1114:18)
      at internalConnectMultiple (node:net:1177:5)
      at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
      at listOnTimeout (node:internal/timers:575:11)
      at process.processTimers (node:internal/timers:514:7) {
    code: 'ETIMEDOUT',
    [errors]: [
      Error: connect ETIMEDOUT 35.170.194.183:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '35.170.194.183',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20a:7850:a38:4412:3c8b:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20a:7850:a38:4412:3c8b',
        port: 443
      },
      Error: connect ETIMEDOUT 35.175.168.177:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '35.175.168.177',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20c:3e2d:d0be:734e:7edd:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20c:3e2d:d0be:734e:7edd',
        port: 443
      },
      Error: connect ETIMEDOUT 3.212.186.72:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '3.212.186.72',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20b:b899:9668:8cb8:8f26:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20b:b899:9668:8cb8:8f26',
        port: 443
      },
      Error: connect ETIMEDOUT 3.230.15.214:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '3.230.15.214',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20d:61ae:6cc5:2977:950d:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20d:61ae:6cc5:2977:950d',
        port: 443
      },
      Error: connect ETIMEDOUT 34.203.185.25:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '34.203.185.25',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20b:c68:6fb2:da10:d418:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20b:c68:6fb2:da10:d418',
        port: 443
      },
      Error: connect ETIMEDOUT 52.5.95.135:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '52.5.95.135',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20c:91a2:5f27:c209:f54:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20c:91a2:5f27:c209:f54',
        port: 443
      },
      Error: connect ETIMEDOUT 107.22.1.125:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '107.22.1.125',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20c:5ea9:a163:5f8:332a:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20c:5ea9:a163:5f8:332a',
        port: 443
      },
      Error: connect ETIMEDOUT 34.231.149.203:443
          at createConnectionError (node:net:1634:14)
          at Timeout.internalConnectMultipleTimeout (node:net:1685:38)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -110,
        code: 'ETIMEDOUT',
        syscall: 'connect',
        address: '34.231.149.203',
        port: 443
      },
      Error: connect ENETUNREACH 2600:1f18:7473:c20a:23bb:7de7:12a2:6bb8:443 - Local (:::0)
          at internalConnectMultiple (node:net:1176:40)
          at Timeout.internalConnectMultipleTimeout (node:net:1687:3)
          at listOnTimeout (node:internal/timers:575:11)
          at process.processTimers (node:internal/timers:514:7) {
        errno: -101,
        code: 'ENETUNREACH',
        syscall: 'connect',
        address: '2600:1f18:7473:c20a:23bb:7de7:12a2:6bb8',
        port: 443
      }
    ]
  }
}

Node.js v20.11.1
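
The nested error above is easier to read once flattened into one row per attempted address. A minimal sketch, assuming the error has the shape shown in the log (a TypeError whose cause is an AggregateError with an errors array); summarizeFetchError is a hypothetical helper name, not part of Node or undici:

```javascript
// Hypothetical helper (not a Node or undici API): flattens the error thrown by
// fetch into one row per connection attempt.
function summarizeFetchError(err) {
  const cause = err && err.cause;
  if (!cause) return [];
  // When several addresses were tried, `cause` is an AggregateError whose
  // `errors` array holds one entry per attempted address.
  const attempts = Array.isArray(cause.errors) ? cause.errors : [cause];
  return attempts.map((e) => ({
    code: e.code,
    address: e.address,
    port: e.port,
  }));
}

// Usage (performs a real network request, so it is left commented out):
// fetch('https://airtable.com')
//   .then((res) => console.log(res.status))
//   .catch((err) => console.table(summarizeFetchError(err)));
```

With the log above, each row would carry either ETIMEDOUT (the IPv4 attempts) or ENETUNREACH (the IPv6 attempts), which makes the per-address pattern visible at a glance.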

At the same time, there is no issue reaching that URL with cURL.

Environment

Additional context

Windows 10 running WSL 2 with Ubuntu 20.04.6 LTS

mcollina commented 8 months ago

I can't reproduce unfortunately :(. Is there a proxy configured?

gregonarash commented 8 months ago

@mcollina thank you for looking into that! There is no proxy configured.

Today I cannot replicate the error either. The only thing that has changed in between (that I can think of) is that I have updated NVM.

I have also tried in a few ways on a different machine, and I am not able to replicate it again. In this case, should I close the issue?

mcollina commented 8 months ago

Yes, thanks for reporting

realyukii commented 7 months ago

May I know which version of Node you are using?

gregonarash commented 7 months ago

@RealYukiSan I tested on v20.11.1 and v21.7.1, where it was failing (on some URLs, not all). At the same time, cURL worked fine on the same pages, and fetch worked when I switched Node to v18.19.1.

However, a week later the error was gone. Either it was related to my updating NVM (I can't say why), or maybe it was related to the DNS setup of the target URL. I can't really tell.

gregonarash commented 1 month ago

@mcollina @RealYukiSan I believe I have found the root cause of most of the ETIMEDOUT errors across most of the related issues.

TL;DR: Node has a default timeout of 250 ms for each connection attempt when auto-selecting between IPv4 and IPv6 addresses. undici uses this default, and none of the timeout settings in undici allows modifying it.

What solved the issue completely for me was increasing that timeout: export NODE_OPTIONS="--network-family-autoselection-attempt-timeout=500"

Likely explanation

This error popped up for me only with the Airtable API and only via Node fetch/undici. There was no problem with other APIs, and no problem with cURL accessing the same API.

It is hard to replicate, because a few days after the original issue started it went away by itself. The same happened with builds on Vercel. Likely, on some days 250 ms is enough, while on other days with more traffic it is not.

Connecting to the Airtable API with NODE_DEBUG=net shows that the DNS lookup resolves to 10 different IPs, half IPv4 and half IPv6, and that the attempt timeout is set to 250 ms. On localhost on my home network this timed out 9 out of 10 times (this week). On localhost on my office network it would connect about 50% of the time.

Some of the solutions suggested in other threads/issues, such as turning off IPv6 or using Google/Cloudflare DNS (8.8.8.8, 1.1.1.1), might have helped by allowing a faster connection that completes in under 250 ms. But on some networks (e.g. ones located physically on the other side of the globe) this timeout is too short.
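
A related knob, offered only as a hedged sketch and not something tested in this thread: Node's net module can switch family autoselection off for the whole process, in which case the per-attempt timeout should no longer come into play. Whether this helps on a given network is an assumption; it also gives up the automatic fallback between IPv4 and IPv6.

```javascript
const net = require('node:net');

// Disable automatic IPv4/IPv6 family selection for sockets opened after this
// call. Intended to mirror the --no-network-family-autoselection CLI flag.
net.setDefaultAutoSelectFamily(false);

console.log('autoSelectFamily default:', net.getDefaultAutoSelectFamily());
```

This trades the short per-address attempts for a single ordinary connection attempt, so it is a workaround rather than a fix for slow links.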

I was not able to find any parameter in undici that controls this timeout, so I changed it directly via Node options with the bash command export NODE_OPTIONS="--network-family-autoselection-attempt-timeout=500"
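
The same setting can also be changed from inside the process. A minimal sketch, assuming a Node version that ships net.setDefaultAutoSelectFamilyAttemptTimeout (recent 18.x, 20.x and later) and assuming the code runs before the first connection is opened; the 500 ms value is simply the one used above:

```javascript
const net = require('node:net');

// Current per-address attempt timeout in milliseconds.
console.log('before:', net.getDefaultAutoSelectFamilyAttemptTimeout());

// Raise the default for sockets opened after this call. Intended to mirror
// --network-family-autoselection-attempt-timeout=500.
net.setDefaultAutoSelectFamilyAttemptTimeout(500);

console.log('after:', net.getDefaultAutoSelectFamilyAttemptTimeout());
```

This can be handy where NODE_OPTIONS cannot easily be set, for example in an environment where only the application code is under your control.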

You might also want to add this directly in package.json.

This results in a longer timeout visible in the debug logs and in successful connections.

Network layers are a bit over my head. This seems like a reasonable explanation of what is going on, but it would be great if someone could verify this reasoning. I also can't really tell what the side effects of a longer timeout could be in production-scale applications.

I hope this is helpful, as it took me a crazy amount of time trying to solve this.

mcollina commented 1 month ago

Can you send a PR to add the above to https://github.com/nodejs/undici/tree/main/docs/docs/best-practices? I think it would be valuable for others.

Note that you should be able to customize this at Agent creation time via connectOptions, by specifying autoSelectFamilyAttemptTimeout.

https://nodejs.org/api/net.html

https://github.com/nodejs/undici/blob/main/docs/docs/api/Client.md#parameter-connectoptions
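
A rough sketch of what that could look like, assuming the undici package is installed from npm and that extra fields in the Agent's connect option are forwarded to the underlying socket, as the linked Client docs suggest. buildAgentOptions is a hypothetical helper name, and 500 ms is just the value used earlier in this thread:

```javascript
// Hypothetical helper: builds Agent options with a longer per-address
// attempt timeout for IPv4/IPv6 autoselection.
function buildAgentOptions(attemptTimeoutMs = 500) {
  return {
    connect: {
      autoSelectFamily: true,
      autoSelectFamilyAttemptTimeout: attemptTimeoutMs,
    },
  };
}

// Assumes `npm install undici`; the block is skipped if the package is absent.
let undici = null;
try {
  undici = require('undici');
} catch {
  // undici is not installed; nothing to configure.
}

if (undici) {
  const agent = new undici.Agent(buildAgentOptions(500));
  // Make this agent the default dispatcher for requests made through undici.
  undici.setGlobalDispatcher(agent);
}
```

Unlike the NODE_OPTIONS approach, this scopes the change to requests made through that dispatcher rather than to every socket in the process.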

gregonarash commented 1 month ago

@mcollina OK! Done https://github.com/nodejs/undici/pull/3738 I hope that looks OK 😥

I added a note on the ability to adjust it in the client directly, for people using the unbundled undici.