vercel / next.js

The React Framework
https://nextjs.org
MIT License

NextJS 14 returns fetch failed with UND_ERR_CONNECT_TIMEOUT error on serverless function #66373

Open andremendonca03 opened 3 months ago

andremendonca03 commented 3 months ago

Link to the code that reproduces this issue

https://codesandbox.io/p/devbox/gifted-shirley-mzlvgy?file=%2Fapp%2Fapi%2Froute.js%3A1%2C1-39%2C1

To Reproduce

  1. From a client-side component, start a fetch POST request to an API endpoint (route handler) on form submission;
  2. In the API serverless function, make another fetch POST request to an external API (in my case, the Slack message API);
  3. In a production environment hosted on Vercel, around 70% of the requests to Slack succeed, while the other 30% fail with a 500 server error and the code "UND_ERR_CONNECT_TIMEOUT".

Systems:

Full error message:

Unhandled Rejection: TypeError: fetch failed
    at node:internal/deps/undici/undici:12345:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: ConnectTimeoutError: Connect Timeout Error
      at onConnectTimeout (node:internal/deps/undici/undici:7492:28)
      at node:internal/deps/undici/undici:7448:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:7480:13)
      at process.processImmediate (node:internal/timers:478:21)
      at process.callbackTrampoline (node:internal/async_hooks:130:17) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}
Node.js process exited with exit status: 128. The logs above can help with debugging the issue.

Current vs. Expected behavior

Currently, some external API calls from a serverless function fail with unhandled fetch errors. The expected behaviour is that the API calls succeed without errors.
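
As a stopgap while the root cause is unknown, the failing call could be wrapped so that a connect timeout is retried instead of surfacing as an unhandled rejection. The following is only a sketch: `fetchWithRetry`, its `retries` option, and the injectable `fetchImpl` are hypothetical names that do not appear in the reproduction, and it assumes the error shape shown in the log above (`err.cause.code`).

```javascript
// Hypothetical helper (not part of the reproduction): retry a request only
// when undici reports a connect timeout; rethrow every other error unchanged.
async function fetchWithRetry(url, init = {}, options = {}) {
  // `fetchImpl` defaults to Node's global fetch; it is injectable for testing.
  const { retries = 2, fetchImpl = fetch } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fetchImpl(url, init);
    } catch (err) {
      // Node's fetch wraps the undici error in `cause`.
      const code = err && err.cause ? err.cause.code : undefined;
      if (code !== 'UND_ERR_CONNECT_TIMEOUT') throw err;
      lastError = err;
    }
  }
  // Every attempt timed out while connecting.
  throw lastError;
}
```

Retrying only on `UND_ERR_CONNECT_TIMEOUT` is a deliberate choice: that code means the connection was never established, so a retried POST should not have been delivered twice. This hides the symptom and does not explain why the connection times out.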

Provide environment information

Operating System:
Vercel Servers

Binaries:
Node: v20x (default vercel v20 setting)
npm: 10.2.3
yarn: 1.22.19
build command: yarn build

Relevant Packages: 
next: 14.1.4
react: 18.2.0
react-dom: 18.2.0

Next.js Config:
/** @type {import('next').NextConfig} */
const nextConfig = {
  reactStrictMode: true,
  trailingSlash: true,
  images: {
    remotePatterns: [
      {
        protocol: 'https',
        hostname: 'site.com',
        port: '',
        pathname: '/wp-content/uploads/**',
      },
    ],
  },
}

module.exports = nextConfig;

Which area(s) are affected? (Select all that apply)

Module Resolution, Pages Router, Runtime

Which stage(s) are affected? (Select all that apply)

Vercel (Deployed)

Additional context

Additional information about the issue and more cases can be found at this discussion: https://github.com/vercel/next.js/discussions/57384

icyJoseph commented 3 months ago

This is most likely an Undici error.

Can you try to collect more data about the endpoints that are failing?

andremendonca03 commented 3 months ago

Hey @icyJoseph, I only have the 2 APIs mentioned to test at the moment. Hasn't Undici been removed from Next 14?

icyJoseph commented 3 months ago

Hi, well, it's not a Next.js thing, rather Node.js' fetch implementation uses undici at its core.

Before Node 18 (17, really), Next.js polyfilled fetch with node-fetch to provide server-side fetch, but since Node.js adopted fetch natively, Next.js no longer has to.

icyJoseph commented 3 months ago

Maybe you can do an experiment. Does it also fail, if you create a node script, or just open the Node repl, and try to make a fetch request from there?
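
A minimal sketch of such an experiment, runnable as a plain Node.js script or pasted into the REPL. The `probe` function, its `timeoutMs` option, and the placeholder URL are assumptions made for illustration; substitute the endpoint that actually fails.

```javascript
// Hypothetical probe: time a single request and report undici's error code
// (if any) instead of letting the rejection go unhandled.
async function probe(url, options = {}) {
  // `fetchImpl` defaults to Node's global fetch; it is injectable for testing.
  const { fetchImpl = fetch, timeoutMs = 15000 } = options;
  const started = Date.now();
  try {
    const res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
    return { ok: true, status: res.status, ms: Date.now() - started };
  } catch (err) {
    // Prefer the underlying undici code; fall back to the error name.
    const code = (err.cause && err.cause.code) || err.name;
    return { ok: false, code, ms: Date.now() - started };
  }
}

// Example run (placeholder URL):
// probe('https://example.com').then(console.log);
```

Running it a few dozen times in a loop, locally and on the deployed server, would show whether the failure rate depends on the environment.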

andremendonca03 commented 3 months ago

I'm pretty sure this issue doesn't happen in local environments, only on live servers, but I'll be testing on pure Node soon.

I also found these 2 related issues: https://github.com/nodejs/undici/issues/2362 https://github.com/nodejs/undici/issues/1531

ornakash commented 3 months ago

We have the exact same issue here

next@14.2.3

Edit by maintainer bot: Comment was automatically minimized because it was considered unhelpful. (If you think this was by mistake, let us know). Please only comment if it adds context to the issue. If you want to express that you have the same problem, use the upvote 👍 on the issue description or subscribe to the issue for updates. Thanks!

hrc7505 commented 3 months ago

@andremendonca03 See https://github.com/vercel/next.js/discussions/57384#discussioncomment-9545693

andremendonca03 commented 3 months ago

Yeah I'm on that discussion as well. For me though it only happens in production, not local. Which API are you trying to fetch? Is it through a client component or a serverless function? @hrc7505

hrc7505 commented 3 months ago

@andremendonca03

  1. We are getting these issues in server-rendered pages. We fetch APIs and create pages during build.
  2. The same APIs work in the browser at runtime.
  3. I get this issue randomly, not every time. Once the issue occurs, it persists for the rest of the day, but the next day it starts working again.

Node: 18.17.1
Nextjs: 14.2.3

osama554 commented 3 months ago

Facing same issue in my project using nextjs 14.

Lersson commented 3 months ago

Facing the same problem with Next.js 14.2.3 and Node 20 on a corporate http_proxy network. At build time, Next.js tries to fetch some Cloudflare addresses:

[cause]: ConnectTimeoutError: Connect Timeout Error (attempted addresses: 104.16.24.34:443, 104.16.28.34:443, 104.16.26.34:443, 104.16.2.35:443, 104.16.0.35:443, 104.16.27.34:443, 104.16.31.34:443, 104.16.1.35:443, 104.16.30.34:443, 104.16.25.34:443, 104.16.29.34:443, 104.16.3.35:443)

vinc01100101 commented 3 months ago

Same problem here. All day yesterday, we were getting UND_ERR_CONNECT_TIMEOUT errors only on Vercel's production build attempts. It was working when built locally. We're still on Next13.

However, at 4:30 am, I tried to build it again in production, and everything worked fine.

This is weird. We may still encounter this issue in future builds. Hoping for someone to find a fix for this. I'll also continue to observe.

Err log:

ERROR FETCH ITEMPAGE GETSTATICPATHS: TypeError: fetch failed

    at Object.fetch (node:internal/deps/undici/undici:11731:11)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async getStaticPaths (/vercel/path0/.next/server/pages/used-cars/[slug]/[itemPage].js:1471:26)
    at async buildStaticPaths (/vercel/path0/node_modules/next/dist/build/utils.js:598:33)
    at async /vercel/path0/node_modules/next/dist/build/utils.js:933:115
    at async Span.traceAsyncFn (/vercel/path0/node_modules/next/dist/trace/trace.js:79:20) {
  cause: ConnectTimeoutError: Connect Timeout Error
      at onConnectTimeout (node:internal/deps/undici/undici:6869:28)
      at node:internal/deps/undici/undici:6825:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:6857:13)
      at process.processImmediate (node:internal/timers:476:21) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}

vinc01100101 commented 3 months ago

Alright, this suggestion https://github.com/nodejs/undici/issues/1531#issuecomment-1416212916 might be the fix for the UND_ERR_CONNECT_TIMEOUT error. After some additional research and debugging, I tried using dns.lookup(), dns.resolve4(), and dns.resolve6() to determine which address family (IPv4 or IPv6) the host I am accessing supports. It turns out that it supports IPv4.

ornakash commented 3 months ago

Thanks. So what can I do in order to eliminate this issue?

vinc01100101 commented 3 months ago

@ornakash, I just included NODE_OPTIONS=--dns-result-order=ipv4first in Vercel's environment variables for our project, and everything works fine.

Note that setting --dns-result-order=ipv4first prioritizes IPv4 addresses over IPv6 addresses but does not disregard IPv6 addresses entirely. Both IPv4 and IPv6 addresses are still resolved and included in the results, but IPv4 addresses appear first in the list. This configuration can be useful when you prefer IPv4 connectivity but still want to support IPv6.
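
Where setting NODE_OPTIONS is inconvenient, Node.js also exposes the same preference programmatically through `dns.setDefaultResultOrder()`. The sketch below assumes it runs once at server startup, before the first outgoing request; where exactly that is in a given Next.js project is an assumption left to the reader.

```javascript
const dns = require('node:dns');

// Programmatic counterpart of NODE_OPTIONS=--dns-result-order=ipv4first.
// It changes only the ORDER of dns.lookup() results: IPv4 addresses come
// first, and IPv6 addresses are still resolved and returned after them.
dns.setDefaultResultOrder('ipv4first');
```

According to the Node.js documentation, this call takes priority over the `--dns-result-order` flag, and calling it on the main thread does not change the default order inside worker threads.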

ornakash commented 3 months ago

Thanks! It looks like it drastically reduced how often this happens. We had about 200 errors a day, and now only 6 with UND_ERR_CONNECT_TIMEOUT.

I hope they'll understand why it happens so we won't get even 6 errors

vinc01100101 commented 3 months ago

@ornakash

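To see which address families the hostname resolves to for your request, try using:

const dns = require('node:dns');
// List every address (IPv4 and IPv6) that the hostname resolves to.
dns.lookup('example.com', { all: true }, (err, addresses) => {
    console.log({ addresses, err });
});

(Replace example.com with the hostname for which you received the 6 errors.)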

The logs should show something like:

{
  addresses: [
    { address: '93.184.215.14', family: 4 },
    { address: '2606:2800:21f:cb07:6820:80da:af6b:8b2c', family: 6 }
  ],
  err: null
}
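
To read that output at a glance, the addresses can be grouped by family. This is a small illustrative sketch; `groupByFamily` is a hypothetical helper name, and it expects the array shape that `dns.lookup()` returns with `{ all: true }`.

```javascript
// Hypothetical helper: split dns.lookup(..., { all: true }) results by family.
function groupByFamily(addresses) {
  const groups = { ipv4: [], ipv6: [] };
  for (const { address, family } of addresses) {
    // dns.lookup reports family as 4 or 6.
    (family === 6 ? groups.ipv6 : groups.ipv4).push(address);
  }
  return groups;
}

// Usage inside the lookup callback:
// console.log(groupByFamily(addresses));
```

An empty `ipv6` list would mean the host resolves to IPv4 only, while a non-empty one means a connection attempt over IPv6 is possible.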

kpratik2015 commented 3 months ago

Linking my resolution here if it helps someone else https://github.com/vercel/vercel/issues/11692#issuecomment-2152859828

rafalzawadzki commented 3 months ago

Same started happening in our project; 5% of requests on Vercel fail due to timeout. Locally they error out with UND_ERR_HEADERS_TIMEOUT after a few minutes. Affects two POST API route handlers that call external services.

Tried setting NODE_OPTIONS=--dns-result-order=ipv4first but didn't seem to help.

Running Next 14.0.4 and Node 18.x

ngroenewold95 commented 3 months ago

I am also getting a bunch of undici errors as well recently. These are the 3 main ones

"next": "^14.2.3", node v20.9.0

I also tried setting vercel env variable: NODE_OPTIONS=--dns-result-order=ipv4first but it has not solved the issue

`TypeError: fetch failed
    at node:internal/deps/undici/undici:12618:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: Error: connect ETIMEDOUT 76.76.21.241:443
      at TCPConnectWrap.afterConnect [as oncomplete] (node:net:1555:16)
      at TCPConnectWrap.callbackTrampoline (node:internal/async_hooks:128:17) {
    errno: -110,
    code: 'ETIMEDOUT',
    syscall: 'connect',
    address: '76.76.21.241',
    port: 443
  }`

  `TypeError: fetch failed
    at node:internal/deps/undici/undici:12618:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: ConnectTimeoutError: Connect Timeout Error
      at onConnectTimeout (node:internal/deps/undici/undici:7760:28)
      at node:internal/deps/undici/undici:7716:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:7748:13)
      at process.processImmediate (node:internal/timers:476:21)
      at process.callbackTrampoline (node:internal/async_hooks:128:17) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}`

`TypeError: fetch failed
    at node:internal/deps/undici/undici:12618:11
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  cause: [Error: C0AFB780CE7F0000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:ssl/record/ssl3_record.c:355:
  ] {
    library: 'SSL routines',
    reason: 'wrong version number',
    code: 'ERR_SSL_WRONG_VERSION_NUMBER'
  }
}`

bacqueyrisses commented 3 months ago

I successfully resolved the issue by configuring the undici global dispatcher in the root layout. (Screenshot: CleanShot 2024-06-17 at 20 03 50@2x)
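
The contents of the screenshot are not preserved here, so the following is not the commenter's code. It is a sketch of one common way to configure a global dispatcher with undici's `Agent` and `setGlobalDispatcher`, raising the connect timeout. `applyConnectTimeout` is a hypothetical helper, the 60-second value is an assumption, and the `undici` module is passed in as an argument so the sketch itself has no hard dependency.

```javascript
// Hypothetical helper: install a global undici dispatcher with a longer
// connect timeout. `undici` is the module object from require('undici').
function applyConnectTimeout(undici, timeoutMs) {
  // `connect.timeout` is the connection timeout in milliseconds.
  const agent = new undici.Agent({ connect: { timeout: timeoutMs } });
  undici.setGlobalDispatcher(agent);
  return agent;
}

// Usage, assuming the `undici` package is installed and this runs on the
// server only (60 seconds is an illustrative value):
// applyConnectTimeout(require('undici'), 60_000);
```

Whether a dispatcher set through the npm package also affects Node's built-in fetch depends on the undici versions involved, so it is worth verifying in the target environment.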

ornakash commented 2 months ago

Do you have a solution for pages router as well?

ljj0915 commented 2 months ago

Hi, after I add this configuration, the build reports an error (next: 13.2.3, node: 20.11.1):

./node_modules/undici/lib/web/fetch/util.js Module parse failed: Unexpected token (682:63) File was processed with these loaders:

Import trace for requested module: ./node_modules/undici/lib/web/fetch/util.js ./node_modules/undici/lib/web/fetch/formdata.js ./node_modules/undici/index.js ./app/layout.tsx

Build failed because of webpack errors

starlight-akouri commented 2 months ago

I am also facing this issue when calling API functions from Zapier:

Unhandled Rejection: TypeError: fetch failed
at node:internal/deps/undici/undici:12502:13
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async p (/var/task/.next/server/app/api/calls/route.js:1:634) {
[cause]: ConnectTimeoutError: Connect Timeout Error (attempted addresses: 54.203.40.250:443)
at onConnectTimeout (node:internal/deps/undici/undici:6635:28)
at node:internal/deps/undici/undici:6587:50
at Immediate._onImmediate (node:internal/deps/undici/undici:6619:13)
at process.processImmediate (node:internal/timers:478:21)
at process.callbackTrampoline (node:internal/async_hooks:130:17) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
Node.js process exited with exit status: 128. The logs above can help with debugging the issue.

abhishekchoure commented 2 months ago

I am also facing the same issue when trying to execute SQL using libsql client (Turso):

There has been an error while retrieving the database type. Debug information:

gnomefin commented 2 months ago

Adding my data point as well, though this isn't serverless, just a regular Next.js workload.

It works well when I was using these versions:

"next": "14.1.3",
 "@types/node": "^20.8.4",

But now I am using next 14.2.4 with node ^20.8.4. This Next.js version doesn't seem to be backward compatible, or doesn't match some deps like undici.

Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:  ⨯ TypeError: fetch failed
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at Object.fetch (node:internal/deps/undici/undici:11576:11)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at async fetchExternalImage (/opt/commeasy/oc-web-ui/deploy/1.2.10/node_modules/next/dist/server/image-optimizer.js:565:17)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at async NextNodeServer.imageOptimizer (/opt/commeasy/oc-web-ui/deploy/1.2.10/node_modules/next/dist/server/next-server.js:650:48)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at async cacheEntry.imageResponseCache.get.incrementalCache (/opt/commeasy/oc-web-ui/deploy/1.2.10/node_modules/next/dist/server/next-server.js:182:65)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at async /opt/commeasy/oc-web-ui/deploy/1.2.10/node_modules/next/dist/server/response-cache/index.js:90:36
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     at async /opt/commeasy/oc-web-ui/deploy/1.2.10/node_modules/next/dist/lib/batcher.js:45:32 {
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:   cause: ConnectTimeoutError: Connect Timeout Error
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:       at onConnectTimeout (node:internal/deps/undici/undici:8522:28)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:       at node:internal/deps/undici/undici:8480:50
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:       at Immediate._onImmediate (node:internal/deps/undici/undici:8511:13)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:       at process.processImmediate (node:internal/timers:476:21)
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:       at process.callbackTrampoline (node:internal/async_hooks:130:17) {
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:     code: 'UND_ERR_CONNECT_TIMEOUT'
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]:   }
Jul 10 05:07:56 ip-30-0-103-10 sh[784691]: }