denoland / deno

A modern runtime for JavaScript and TypeScript.
https://deno.com
MIT License

Deno runs out of file descriptors when put under heavy load #15281

Closed umgefahren closed 1 month ago

umgefahren commented 2 years ago

This issue is a follow-up to #15124.

We are currently developing a server benchmark for various languages and runtimes. While developing our stress tester, we discovered a bug, or at least a possible improvement, in Deno 1.24.0.

When putting heavy load on Deno (steps to reproduce follow) it throws:

error: Uncaught (in promise) Error: Too many open files (os error 24)
  for await (const conn of listener) {
                   ^
    at async Listener.next (deno:ext/net/01_net.js:214:16)
    at async start (file:///Users/user/drafts/server-language-benchmark/Deno/src/server.ts:37:20)
    at async file:///Users/user/drafts/server-language-benchmark/Deno/mod.ts:3:1

Analysis

I'm not very familiar with the Deno codebase, but I tried to pinpoint the error. When I printed the resource IDs (which probably map, at least roughly, to file descriptor numbers on Unix-like systems), I observed that they only increase during the test run. This suggests that resource IDs are not reused, which, if they do map directly to file descriptor numbers, means that dead connections might not be closed fast enough. I'm fairly sure these connections have been shut down, since the client maintains at most 64 concurrent connections on my system and drops old ones.

The issue might be hidden somewhere in the TCP stack, although I'm far from certain. Note that we did some unholy things with TCP in order to put such a high load on the server (https://github.com/umgefahren/server-language-client/blob/bc5eaf085f399cb76a4aa1e796696c5fd274c360/src/worker.rs#L101). Specifically, we flush, set the linger time to 1 ms, and then immediately shut down the connection.

Instructions to reproduce

The Deno server code we ran can be found here: https://github.com/umgefahren/server-language-benchmark/tree/main/Deno

The code of the client can be found here: https://github.com/umgefahren/server-language-client

Steps to reproduce the issue with the client:

1. Build the client (in release mode): cargo build --release
2. Generate data: ./target/release/server-language-client generate 10000000
3. Run the benchmark (after starting the Deno server): ./target/release/server-language-client benchmark 1m

kitsonk commented 2 years ago

Why are you expecting it to not run out of file descriptors? They are a limited resource, constrained by the host operating system.

0f-0b commented 2 years ago

If I understand correctly, the client will send a RST flag when connection termination times out, which is very likely to happen since the timeout is set to 1 ms. In that case, conn.read on the server throws a ConnectionReset error and conn is left open, which results in a resource leak.
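A minimal sketch of how a handler could avoid that kind of leak, assuming the server reads from each connection in a loop. The ConnLike interface and the handleConn function are hypothetical names for illustration and are not taken from the benchmark's server code; Deno.Conn provides read and close methods with compatible signatures.

```typescript
// Hypothetical minimal connection shape (an assumption for this sketch).
// Deno.Conn has read() and close() with compatible signatures.
interface ConnLike {
  read(buf: Uint8Array): Promise<number | null>;
  close(): void;
}

interface HandleResult {
  bytes: number;   // total bytes read before EOF or error
  error?: unknown; // set if read() threw, e.g. on a connection reset
}

// Reads until EOF or an error, and closes the connection in both cases.
async function handleConn(conn: ConnLike): Promise<HandleResult> {
  const buf = new Uint8Array(1024);
  const result: HandleResult = { bytes: 0 };
  try {
    while (true) {
      const n = await conn.read(buf);
      if (n === null) break; // peer closed the connection cleanly
      result.bytes += n;
    }
  } catch (err) {
    // A reset by the peer surfaces here as a thrown error.
    result.error = err;
  } finally {
    // Runs on both paths, so the underlying resource is always released.
    conn.close();
  }
  return result;
}
```

In a Deno accept loop this would be called once per connection, for example handleConn(conn) inside for await (const conn of listener). Whether the benchmark server was actually missing such a close is not confirmed in this thread.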

piscisaureus commented 2 years ago

Tangential: resource IDs do not map 1:1 to file descriptors and are expected to only go up.
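To illustrate why ever-increasing resource IDs do not by themselves indicate a leak, here is a toy model. ToyResourceTable is a hypothetical class written for this sketch and is not Deno's actual implementation; it only shows that IDs can grow monotonically while the number of open resources stays bounded.

```typescript
// Toy model (an assumption, not Deno's real resource table): IDs come from a
// counter that never goes backwards, independent of how many entries are open.
class ToyResourceTable<T> {
  private nextRid = 0;
  private open = new Map<number, T>();

  // Registers a resource and returns a fresh, never-reused ID.
  add(resource: T): number {
    const rid = this.nextRid++;
    this.open.set(rid, resource);
    return rid;
  }

  // Releases a resource; returns false if the ID was not open.
  close(rid: number): boolean {
    return this.open.delete(rid);
  }

  // Number of resources currently held open.
  get openCount(): number {
    return this.open.size;
  }
}

const table = new ToyResourceTable<string>();
const first = table.add("conn A");
table.close(first);
const second = table.add("conn B");
// second is larger than first even though only one resource is open.
```

Under this model, a growing ID says nothing about descriptor usage; the count of open entries is the number that matters.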

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

lucacasonato commented 1 month ago

We have not received other reports of this issue in 2.5 years, so I am going to close this.