nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org
Other
107.68k stars 29.64k forks source link

HTTP Server close is taking up to 5 seconds in node 18 #50188

Open DanielRamosAcosta opened 1 year ago

DanielRamosAcosta commented 1 year ago

Version

v18.18.2

Platform

Darwin MacBook-Pro-de-Daniel.local 23.0.0 Darwin Kernel Version 23.0.0: Fri Sep 15 14:43:05 PDT 2023; root:xnu-10002.1.13~1/RELEASE_ARM64_T6020 arm64

Subsystem

http

What steps will reproduce the bug?

  1. Create an http server
  2. Start listening
  3. Take at least one request
  4. Try to close the server
  5. It takes exactly 5 seconds

How often does it reproduce? Is there a required condition?

I have only reproduced the bug in node 18, but is deterministic, it happens every time

What is the expected behavior? Why is that the expected behavior?

In node 19 and 20 it's only taking 0.5 seconds. Thats a normal time that I would expect.

What do you see instead?

I see 5 seconds of delay in order to close

Additional information

I've reproduced the error in this repository with a bare minimum with Github Actions: DanielRamosAcosta/nodejs-close-server-bug

Here is the CI report:

This is affecting to a library I'm maintaining similar to supertest. If you see the CI there, Node 18 tests are taking very long due to the servers being closed.

H4ad commented 1 year ago

~Started on 18.18.1, the 18.18.0 does not have this weird delay.~

EDIT: it has the delay also on 18.18.0

himself65 commented 1 year ago
image

Started on 18.18.1, the 18.18.0 does not have this weird delay.

No, I think all 18 version has such issue

himself65 commented 1 year ago

Related PR: https://github.com/nodejs/node/pull/48383

himself65 commented 1 year ago

The hot fix:

import * as http from "node:http";
import { promisify } from "node:util";

const server = http.createServer(async (req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.write("Hello world");
    res.end();
});
const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(server.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
+ server.closeAllConnections();
await closePromisied();
console.timeEnd("server close");
H4ad commented 1 year ago

Should we bisect or closeAllConnections is expected to be called?

himself65 commented 1 year ago

The latest behavior is to close all sockets, I think there are some backport issues on 18. Let me see if I can do the backport

H4ad commented 1 year ago

The 19.0.0 looks like the first release without this delay.

I will try bisect to see if I can find the commit that fixes the delay and then we can backport.

The #48383 was released on 20.4.0, and this issue started on 18.0.0 (or even earlier, I didn't try rewrite the fetch call)

H4ad commented 1 year ago

@himself65 bisect will not help in this case:

The merge base 19064bec341185a8c15fc438cfcf0df8633a179e is bad.
This means the bug has been fixed between 19064bec341185a8c15fc438cfcf0df8633a179e and [cc993fb2760d01457955f5b9ff787d559ed1c34e].

From what I understand, 18.x and 19.x have different git histories, so there's no way to find the bad commit using bisect. At least, I don't know a way to do it, I'm open to suggestions.

himself65 commented 1 year ago

I tried nodejs 16.x. now I think there might have a regression on nodejs 16 -> 18

import * as http from "node:http";
import { promisify } from "node:util";
import fetch from 'node-fetch'

const server = http.createServer(async (req, res) => {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.write("Hello world");
    res.end();
});
const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(server.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
await closePromisied();
console.timeEnd("server close");
➜  nodejs git:(main) ✗ node index.mjs
server close: 0.158ms
➜  nodejs git:(main) ✗ node -v
v16.20.2
H4ad commented 1 year ago

It's a regression from 17.x -> 18.x

h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs                                                                                                                                                                                                                    
server close: 0.108ms
h4ad:node-copy-4/ (main✗) $ node -v
v17.9.1
H4ad commented 1 year ago

Based on https://github.com/nodejs/node/pull/49091/files#diff-d692ac4524379ec6a1201165e8ff8d3267c8130e07014e8221ebf7e6f80c6641R1560-R1567, and this little change:

const server = http.createServer({ keepAliveTimeout: 200 }, async (req, res) => {
h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs
server close: 0.255ms
h4ad:node-copy-4/ (main✗) $ node -v
v18.18.1

@himself65 I think you are right, the #48383 should fix the issue, I thought it landed on 18.18.0 but it did not, sorry for the confusion.

Let me know if you will do the backport, if not, I can do that.

H4ad commented 1 year ago

I built 18.18.0 with fix #48383, it solves the delay.

himself65 commented 1 year ago

Based on https://github.com/nodejs/node/pull/49091/files#diff-d692ac4524379ec6a1201165e8ff8d3267c8130e07014e8221ebf7e6f80c6641R1560-R1567, and this little change:

const server = http.createServer({ keepAliveTimeout: 200 }, async (req, res) => {
h4ad:node-copy-4/ (main✗) $ node test-close-2.mjs
server close: 0.255ms
h4ad:node-copy-4/ (main✗) $ node -v
v18.18.1

@himself65 I think you are right, the #48383 should fix the issue, I thought it landed on 18.18.0 but it did not, sorry for the confusion.

Let me know if you will do the backport, if not, I can do that.

I might don't have time on backport you can try that.

I read some CIGTM code, i have no idea on this https://github.com/nodejs/node/pull/48383#issuecomment-1716191991

DanielRamosAcosta commented 1 year ago

I think I can bring more insights for the problem. I've tried dierent ways of fetching to the server in different node versions. Basically I'm trying with the native fetch (native), with node-fetch (library) and with an old plain request from the http package (request). Here are the results (you can also find them in Github Actions):

As you can see, there seems to be a problem with fetch in node 18.x, but solved since 19

himself65 commented 1 year ago

I manually implement a function that won't close the connections. Now It's back to 5s in node.js 19/20

import * as http from "node:http";
import { promisify } from "node:util";
import * as net from "node:net";

const server = http.createServer(async (req, res) => {
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.write("Hello world");
  res.end();
});

const listenPromisied = promisify(server.listen.bind(server));
const closePromisied = promisify(net.Server.prototype.close.bind(server));

await listenPromisied(0, "127.0.0.1");
const address = server.address();
await fetch(`http://${address.address}:${address.port}`);
console.time("server close");
await closePromisied();
console.timeEnd("server close");
himself65 commented 1 year ago

Now I believe this is a kind of bug in undici

himself65 commented 1 year ago

I think this is not a regression. From undici 4.4.1 (first version that has fetch), the delay is always there

himself65 commented 1 year ago

Hotfix

res.writeHead(200, { 'Content-Type': 'text/plain', 'Connection': 'close' })
himself65 commented 1 year ago

Thanks to https://github.com/nodejs/undici/issues/2348#issuecomment-1763951248

The real reason is keep-alive is by default in somewhere

himself65 commented 1 year ago

https://github.com/nodejs/node/blob/709e368708235f89b21372a9c6bd49df68e5bd41/lib/_http_outgoing.js#L125

https://github.com/nodejs/node/blob/356b4a268a6f20c11dc79ed841c53020c00b8872/lib/_http_agent.js#L557

benjamingr commented 1 year ago

The code fetches and then does not read or dispose of the response. This is expected behavior I think? You only waited for headers, you might still read the body of the response.