nodejs / node-v0.x-archive

Moved to https://github.com/nodejs/node

Connections closed by node stay permanently in FIN_WAIT2 #3613

Closed: ghost closed this issue 1 year ago

ghost commented 12 years ago

Hi

I love using node :)

Normally when node closes a connection (using socket.end()), the following packets are exchanged:

node > client [AF]
client > node [A]
client > node [AF]
node > client [A]

However, sometimes the client does not respond well, and I get only these packets:

node > client [AF]
client > node [A]

With node, this results in connections permanently (not temporarily) in FIN_WAIT2 (netstat -nop):

Proto Recv-Q Send-Q Local Address        Foreign Address        State       PID/Program name  Timer
tcp        0      0 XX.XX.XX.XX:80       XX.XX.XX.XX:17863      FIN_WAIT2   26375/node        off (0.00/0/0)
tcp        0      0 XX.XX.XX.XX:80       XX.XX.XX.XX:17832      FIN_WAIT2   26375/node        off (0.00/0/0)
[very long list]
tcp        0      0 XX.XX.XX.XX:80       XX.XX.XX.XX:17992      FIN_WAIT2   26375/node        off (0.00/0/0)
tcp        0      0 XX.XX.XX.XX:80       XX.XX.XX.XX:17851      FIN_WAIT2   26375/node        off (0.00/0/0)

The problem here is that the timer is set to off, so the socket stays indefinitely in this state, eventually exhausting the server.

I wanted to know why I see this problem only with node (and not with nginx or perl POE). I read the following interesting information (source: http://alldunne.org/2010/07/tcp-connection-is-permanently-in-fin_wait2-state/): "When programming sockets in C, it's useful to know an important difference between the shutdown (either READ or WRITE) and close of a socket. Shutting down the socket will cause the connection to go into the FIN_WAIT2 state while the TCP/IP stack waits for the connection to be closed at the other end of the connection. But if the other side doesn't shutdown/close the connection, then the connection will remain in the FIN_WAIT2 state indefinitely (assuming of course that your process hasn't been terminated). If your process were to close the connection, rather than call shutdown on it, then the connection will fall into FIN_WAIT2 as before; however, the TCP/IP stack will time the connection out after a certain period."

Having read this, maybe node is shutting down the socket instead of closing it, which causes the socket to linger forever when the client side does not close its end properly.
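
For reference, a minimal sketch of the two ways an application can tear down a connection (this shows only the user-visible net API, not node's internals; the port number is arbitrary):

    var net = require('net');

    var server = net.createServer(function (socket) {
      socket.write('bye\n');

      // end() flushes pending writes and then half-closes the socket
      // (shutdown of the write side): a FIN is sent, but the socket keeps
      // waiting for the peer's FIN, which is where FIN_WAIT2 can linger.
      socket.end();

      // destroy() would close the descriptor outright, so the kernel's
      // tcp_fin_timeout can reclaim the connection even if the peer stays
      // silent:
      // socket.destroy();
    });

    server.listen(8000);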

Best regards

bnoordhuis commented 12 years ago

I've never seen that behavior but it sounds plausible. We sometimes shut down the socket before closing it. Maybe there's a code path somewhere that shuts down but fails to close the file descriptor.

What do you see when you run strace -re shutdown,close -p $(pidof node)?

ghost commented 12 years ago

Thank you for your feedback. I will provide you this info next week, because my co-programmer went on holiday today.

ghost commented 12 years ago

If I read the strace below correctly, node only does a shutdown when socket.end() is called. If the assumption is true that "the connection will remain in the FIN_WAIT2 state indefinitely" when doing a shutdown instead of a close, then a simple DoS attack can exhaust a server: call the node server many times, let node disconnect, and use iptables on the client to block the outgoing FIN.

strace -e shutdown,close -p 19751

Process 19751 attached - interrupt to quit
close(23)                   = 0
shutdown(77, 1 /* send */)  = 0
--- SIGPIPE (Broken pipe) @ 0 (0) ---
close(77)                   = 0
shutdown(77, 1 /* send */)  = 0
shutdown(90, 1 /* send */)  = 0
shutdown(111, 1 /* send */) = 0
shutdown(127, 1 /* send */) = 0
shutdown(155, 1 /* send */) = 0
shutdown(142, 1 /* send */) = 0
shutdown(156, 1 /* send */) = 0
^CProcess 19751 detached

A temporary workaround is to enable keep-alive when the connection is opened, because then FIN_WAIT2 will time out after 2 hours. Normally it should be 1 minute, but this is better than infinite.
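
A minimal sketch of that workaround (the 60-second initial probe delay is an arbitrary choice, and whether the per-socket delay is honored once the socket sits in FIN_WAIT2 depends on the kernel):

    var net = require('net');

    var server = net.createServer(function (socket) {
      // Turn on TCP keep-alive as soon as the connection is accepted, so a
      // peer that never answers our FIN is eventually probed and torn down
      // by the kernel instead of sitting in FIN_WAIT2 forever.
      socket.setKeepAlive(true, 60000); // first probe after 60 s (arbitrary)
    });

    server.listen(8000); // port is arbitrary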

bnoordhuis commented 12 years ago

Do you see corresponding close() calls, i.e. a close(42) some time after shutdown(42)? I guess the answer must be yes or you'd run out of file descriptors eventually.

ghost commented 12 years ago

Not always. Some shutdown() calls have no matching close(), and vice versa:

Process 30244 attached - interrupt to quit
16:09:34.489649 close(176)              = 0
16:10:23.627035 close(172)              = 0
16:12:14.821779 shutdown(54, 1 /* send */) = 0
16:12:16.361595 close(54)               = 0
16:12:19.475044 close(190)              = 0
16:12:44.836160 shutdown(109, 1 /* send */) = 0
16:12:46.961546 close(109)              = 0
16:13:20.848573 shutdown(173, 1 /* send */) = 0
16:14:20.869405 shutdown(184, 1 /* send */) = 0
16:15:24.906139 shutdown(172, 1 /* send */) = 0
16:18:39.997952 shutdown(195, 1 /* send */) = 0
16:18:41.538199 close(195)              = 0
16:19:18.377569 close(203)              = 0
16:21:38.181774 close(28)               = 0
16:22:09.202133 close(214)              = 0
16:24:45.386578 close(168)              = 0
16:24:57.199691 close(225)              = 0
16:25:42.173282 shutdown(216, 1 /* send */) = 0
16:25:55.816329 close(81)               = 0
16:25:58.082459 close(224)              = 0
16:26:44.191287 shutdown(28, 1 /* send */) = 0
16:27:14.094205 close(193)              = 0
16:27:43.219924 shutdown(223, 1 /* send */) = 0
16:29:14.252537 shutdown(226, 1 /* send */) = 0
^C

The result in netstat is:
#netstat -otn | grep -v ESTA
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       Timer
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:5896     FIN_WAIT1   on (25.26/5/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:4965      FIN_WAIT1   on (3.56/11/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:4927      FIN_WAIT1   on (83.44/6/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:8133      FIN_WAIT1   on (7.18/2/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:7563      FIN_WAIT1   on (84.27/5/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:6799     FIN_WAIT1   on (63.97/12/0)
tcp        0      1 XX.XX.XX.XX:2001     XX.XX.XX.XX:7900     FIN_WAIT1   on (67.56/11/0)

bnoordhuis commented 12 years ago

I see at least a few matching shutdown/close syscalls and, realistically speaking, it must happen or you'd run out of file descriptors before too long (unless you set ulimit -n extremely high).

Do you perhaps have a test case I can try?

ghost commented 12 years ago

I have indeed set the open file descriptor limit very high (I am handling close to 100,000 M2M devices). This is the perfect test case for node, because with mobile devices everything can go wrong: devices disappear without closing the connection (resulting in FIN_WAIT1), or fail to reply to a FIN from the server (resulting in FIN_WAIT2).

The last listing I sent is the strace from the first 15 minutes after starting the node application; the netstat output was taken at the end of that 15-minute sample.

Is there a way to check the number of opened file descriptors in node, so I can check there is a "leak"?

I guess three test cases would be (a minimal server sketch for driving these follows below):

1) The client connects to the node server, the client is physically disconnected from the network, and the node server tries to send something.
2) The client connects to the node server, the client is physically disconnected from the network, and the node server tries to disconnect with socket.end(). This results in FIN_WAIT1.
3) The client connects to the node server, the client has iptables configured so that a FIN packet is never sent out, and the node server tries to disconnect. This results in FIN_WAIT2.
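
A minimal server sketch for driving these cases (port and timings are arbitrary; the disruption itself happens outside node, by unplugging the client or with iptables):

    var net = require('net');

    var server = net.createServer(function (socket) {
      console.log('client connected:', socket.remoteAddress + ':' + socket.remotePort);

      // Give the tester a few seconds to apply the disruption, then close
      // from the server side the same way the application does: socket.end().
      setTimeout(function () {
        socket.write('closing now\n'); // case 1: write to a vanished client
        socket.end();                  // cases 2 and 3: server-initiated FIN
      }, 5000);
    });

    server.listen(2001);

    // Watch the socket states from another shell with: netstat -otn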

bnoordhuis commented 12 years ago

Is there a way to check the number of opened file descriptors in node

Yes, ls /proc/$(pidof node)/fd/
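
(A minimal in-process equivalent, assuming a Linux /proc layout, would be something along these lines:)

    // Count this process's open file descriptors (Linux only: relies on /proc).
    // The count briefly includes the descriptor readdirSync itself opens.
    var fs = require('fs');
    console.log('open fds:', fs.readdirSync('/proc/self/fd').length);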

Can you check with sysctl net.ipv4.tcp_fin_timeout what the FIN timeout is set to? If it's not 60 seconds or less, please lower it and see if the problem goes away.

ghost commented 12 years ago

Today node crashed because it ran out of file descriptors. I checked and there is no leak; the number of fds is as expected.

I noticed that if keep-alive is set, the FIN_WAIT2 state uses tcp_keepalive_time instead of tcp_fin_timeout, so by default it is 7200 seconds (2 hours) even if tcp_fin_timeout is set to the default 60 seconds. If keep-alive is NOT set, however, the socket stays in FIN_WAIT2 indefinitely when the client side does not close properly. This happens because node does a shutdown first, then a close. In my opinion this is a bug in node which can easily be exploited by a DoS attack.

Today node crashed (100% CPU) when it ran out of fds. The reason is that many of the M2M devices I manage don't return a FIN when node closes the connection. I tuned some parameters in /etc/sysctl and turned on socket keep-alive. Now I have about 10,000 sockets in the FIN_WAIT2 state, waiting to time out. So this is a workable workaround, but it is still vulnerable to DoS attacks.

bnoordhuis commented 12 years ago

Today node crashed because it ran out of file descriptors. I checked and there is no leak; the number of fds is as expected.

Okay, thanks. That means it's not specifically a node bug (by which I mean that any TCP server is vulnerable to this particular weakness of the protocol, not just node.)

I'm not sure how or even if we should fix this. The shutdown(SHUT_WR) that node does (half-close, stop sending but don't stop receiving just yet) is there to give the other end a chance to properly shut down the connection. The problem with doing a full close is that lots of client applications don't handle that well at all if it's unexpected.
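
For illustration, a minimal sketch of what that grace period buys a well-behaved client (host and port are placeholders; allowHalfOpen: true keeps the client's write side open after the server's FIN arrives):

    var net = require('net');

    // Connect with allowHalfOpen so node does not auto-close our write side
    // when the server's FIN ('end') arrives.
    var client = net.connect({ host: '127.0.0.1', port: 8000, allowHalfOpen: true });

    client.on('end', function () {
      // The server has shut down its write side; ours is still open, so we
      // can flush our remaining data and only then send our own FIN.
      client.end('last words from the client\n');
    });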

"Well, that's a bug in the client application," you say (and you'd be right). Yes, but you don't have to sift through the myriads of bug reports that changes like these generate. :-)

I half-kid, obviously, but you see the dilemma - it's going to break stuff that worked before (even if we make it opt-in - because there's always someone.)

ghost commented 12 years ago

Thanks for your feedback.

I am not an expert in how to exactly 'end' a connection, but when I opened this issue I referred to one of the sources I found, saying you should always close a socket, not shutdown a socket.

As far as I know, the other side does not know whether you did a shutdown or a close, because in both situations a FIN is sent. And a FIN is expected back (and if not, no problem, because it will time out in FIN_WAIT2).

Since allowHalfOpen: false is the default setting, maybe you could close the socket in that case, and only shut down when half-open connections are allowed?

I love node.js and I just want to contribute a potential problem which can be exploited :)

bnoordhuis commented 12 years ago

I discussed it with other committers. The consensus is that removing the shutdown() if allowHalfOpen is not set is the best way to go forward. The problem is that it breaks a fair number of tests - no doubt it'll affect real world code as well.

Ergo, don't expect it to land in a stable branch anytime soon - v0.10 at the soonest.

indutny commented 12 years ago

About close() vs. shutdown(): after some investigation of the Linux kernel code, I can say that they are not nearly the same thing. close() will send an RST frame in many cases.

  When an application closes a connection in such a way that it can
  no longer read any received data, the TCP SHOULD, per section
  4.2.2.13 of RFC 1122, send a RST if there is any unread received
  data, or if any new data is received. A TCP that fails to do so
  exhibits "Failure to RST on close with data pending".

indutny commented 12 years ago

So I propose the following solution to this problem: https://github.com/joyent/libuv/pull/498. And we may add a .forceClose() method to node.js which will use uv_tcp_reset_timed(...).

trevnorris commented 11 years ago

Is this a duplicate or similar problem to #664?

soplwang commented 11 years ago

Hi @bnoordhuis, we are facing the same problem... I've studied the new streams2 branch and master; the comment on onSocketFinish() says it will destroy the socket when allowHalfOpen is false, but the current implementation of onSocketFinish() does not. Maybe this issue was forgotten?

This issue is very serious for us, because lots of third-party node.js modules use end() to free the TCP connection instead of destroySoon()...

Please give it some more attention, thanks a lot!

soplwang commented 11 years ago

After a lot of study of the libuv and Linux kernel source, I think shutdown() with SHUT_RDWR instead of SHUT_WR in the no-allow-half-open case may be more comfortable. With SHUT_RDWR, shutdown() can produce an 'EOF' on the readable stream, which generates the end event and then gracefully fully shuts down the stream.

With this change, I've passed nearly all tests except 3 http module tests on my modified 0.8.16 release.

bnoordhuis commented 11 years ago

After a lot of study of the libuv and Linux kernel source, I think shutdown() with SHUT_RDWR instead of SHUT_WR in the no-allow-half-open case may be more comfortable. With SHUT_RDWR, shutdown() can produce an 'EOF' on the readable stream, which generates the end event and then gracefully fully shuts down the stream.

I don't quite understand what you mean by that. What difference does it make? If you don't allow half-open connections, you might as well close the connection outright.

(Note: I'm quite familiar with libuv and the kernel source; you can just point me to the relevant bits if you want.)

soplwang commented 11 years ago

@bnoordhuis thanks for the quick reply.

If you don't allow half-open connections, you might as well close the connection outright.

Errh, as I mentioned before, we use lots of third-party libraries, such as socket.io and ws. These libraries all use socket.end() to terminate a normal connection. When our node.js server runs for a long time serving our mobile app, it leaks lots of FIN_WAIT2 connections and eats up memory.

If I force-close these libraries' sockets myself, I need to enumerate all defunct connections and destroy() them, which is an ugly workaround.

So I want to fix this in node.js core.

The real problem is that socket.end() only does shutdown(..., SHUT_WR), in _libuv/src/unix/stream.c::uv__drain()_. This only shuts down the write side and leaves the read side open until a FIN is received later. On a terrible network link, that FIN may be lost, leaving the connection alive but stuck in the FIN_WAIT2 state: a leak...

I tried placing a destroy() call in node/lib/net.js::afterShutdown() yesterday (when the allowHalfOpen option is false), but that breaks lots of tests. After a bit of study of the test cases, I found they depend on end() and, later, the end event...

So placing destroy() in afterShutdown() is not acceptable; destroy() will not emit the end event...

While studying shutdown(), I found that its SHUT_RDWR option shuts down both the read and the write side of the connection. With SHUT_RDWR instead of SHUT_WR, a later epoll() or read() gets an EOF immediately. That gives the read side a chance to emit the end event and makes the tests pass.

With SHUT_RDWR, the behavior is: after end(), flush out all queued data, shut down the write side of the connection, and immediately shut down the read side to emulate an EOF (or FIN) without waiting for the other side any longer.

This new behavior should only apply to TCP connections when the allowHalfOpen option is false.

(Note: lots of libraries use end() instead of destroySoon() to terminate a connection. Maybe it's bad practice, but they are out there...)

(PS: According to http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L486, when the RCV_SHUTDOWN flag is on, poll will get a POLLRDHUP.)

(PS2: I tested the SHUT_RDWR option standalone; it works as I expect.)

Thanks! And please take a look at my changeset: https://github.com/HoopCHINA/node/commit/2a70ccbbeb80475272d56d415edc9167cab42f4e .

soplwang commented 11 years ago

@bnoordhuis After more discussion with my colleagues, we found a more compatible way to fix this problem. As @githob suggests, the TCP keep-alive mechanism is better suited to this problem. The real problem is a terrible network link, not node/application code, so the keep-alive/timeout solution is simpler and more suitable.

Please take a look at the https://github.com/HoopCHINA/node/commit/01b982d4e7c9fae98d1b59be46a81e4b8a699e7e changeset. The only change on top of the node v0.8.16 release is in lib/net.js::socket.afterShutdown(): enable TCP keep-alive after shutdown when allowHalfOpen is false. No tests break...

Thanks a lot!

ghost commented 11 years ago

Thanks for the discussion. I see my suggestion of using the TCP keep-alive timeout as a workaround, although I tend to agree that technically @bnoordhuis is right that it is not a node issue.

But part of the beauty of node is, for example, that you can write to a socket and, when the buffers are full (not a node issue), it automagically buffers in user space (of course you need to be careful not to overflow that).

I run node with 100,000+ mobile connections and FIN_WAIT2 is a real issue (even if not a node issue). The same beauty could maybe apply here as well, so that node automagically destroys connections when appropriate.

I wrote "could" because the discussion between @soplwang and @bnoordhuis is too technical for me, so I don't know the implications.

Thanks

soplwang commented 11 years ago

Hi @githob. Actually, I consider it a node issue. Other node streams terminate fine with stream.end(), but TCP-based streams do not. At the very least, node's graceful shutdown model makes the problem worse. And even node's stream.pipe() uses end()...

Keep-alive is the simple solution for a bad network link. We can even use socket.setKeepAlive() with a short period to make cleanup faster.

@bnoordhuis We think that after socket.end(), keep-alive should be force-enabled (whether half-open is allowed or not), because after shutting down the write side there is no other reliable way to check whether the other side is alive...

On streams2, the direction for socket.end() of "if allowHalfOpen is false, ...then destroy instead of shutdown" may break node/libuv's graceful shutdown behavior, so I would not suggest it.

PS: Without a fix in node core, we currently use a monkey-patch to force-enable keep-alive after shutdown to solve this problem.
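
(Purely illustrative, not the actual HoopCHINA changeset: a sketch of such a monkey-patch, wrapping net.Socket#end so keep-alive is switched on before the shutdown.)

    var net = require('net');

    var originalEnd = net.Socket.prototype.end;
    net.Socket.prototype.end = function () {
      try {
        // Force keep-alive so a peer that never sends its FIN is eventually
        // detected and the socket reclaimed; the 60 s delay is arbitrary.
        this.setKeepAlive(true, 60000);
      } catch (e) {
        // The socket may already be destroyed; ignore and fall through.
      }
      return originalEnd.apply(this, arguments);
    };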

PS2: Please take a look at the https://github.com/HoopCHINA/node/commit/4df0c77c35e741ef0f3b06dd277d37491aad6315 and https://github.com/HoopCHINA/node/commit/c3e3a6707e9cb7c30ea144cae2bbfbabc2e5ab46 changesets.

Thanks.

jameshartig commented 11 years ago

@soplwang you can try something like: https://github.com/fastest963/libuv/commit/4f1098fe4dd92dc4e7b50caf1ce00019cf7e1c55 and https://github.com/fastest963/node/commit/34562e722029d1d1016c8ff0f5ef372ca99e1dc4

JiDW commented 10 years ago

Hello,

We have the same issue here. I wanted to patch our node.js server with the HoopCHINA patch, but it has been deleted.

Can someone tell me how to fix this issue until the official code is updated, please?

At the moment I have to reboot our server every week to clean up the FIN_WAIT2 connections, and I can only hope that nobody figures out how to DDoS us with a single computer... I don't understand why this critical issue isn't fixed yet, to be honest; I'm having a hard time convincing my boss not to switch to another tech like Java. Why isn't this issue affecting more people? Do people just set a stupidly high number of file descriptors? I don't want to do that; there is a limit for a reason.

Thanks for your help !

soplwang commented 10 years ago

Try the 'node-ka-patch' package from the HoopCHINA folks.

soplwang commented 10 years ago

This issue only shows up on bad networks like mobile. But the real reason is that many packages use socket#end() instead of socket#destroySoon() to close the socket stream, without any timeout.

So strictly speaking it is not node's issue.

darinspivey commented 10 years ago

@JiDW I'm having this issue too! So frustrating. I have a beautiful framework built that cannot be moved forward until this is fixed. My symptom is 100% CPU while it tries to talk to the unclosed FDs. Is there any update on this? I suppose I'll look at node-ka-patch.

Hopefully someone will reply to my detailed ticket in https://github.com/sockjs/sockjs-node/issues/161 ?

update: the patch did not work for me. Perhaps it's not the exact same problem, but it sounds quite similar!

FIXED: I humbly admit that I did not upgrade node.js sooner. I was naive to think v0.10.5 was not that far out of date. Wrong. By over a year. An upgrade to v0.10.29 fixed my issue of unclosed FDs.

yorickvP commented 10 years ago

I can confirm that this problem still occurs for the rest of us in v0.10.29.

It seems likely that, if these devices can accidentally leave file descriptors open by dropping off the net at the wrong moment, an attacker could do it on purpose in order to DoS any node application that uses socket.end(). That seems like a very high-priority problem to have been left open for more than two years now.

indutny commented 10 years ago

@yorickvP From what I remember, the only way to make the sockets really die is to set linger=0 on them. But this means a really non-graceful connection teardown.

Here is the code from nginx that does it:

            linger.l_onoff = 1;
            linger.l_linger = 0;

            if (setsockopt(r->connection->fd, SOL_SOCKET, SO_LINGER,
                           (const void *) &linger, sizeof(struct linger)) == -1)
            {
                ngx_log_error(NGX_LOG_ALERT, log, ngx_socket_errno,
                              "setsockopt(SO_LINGER) failed");
            }

I think we could possibly call this on connections that have emitted ETIMEDOUT, or even do it in libuv? cc @saghul

dougwilson commented 10 years ago

I think we could possibly call this on connections that have emitted ETIMEDOUT

This seems like a good compromise.

saghul commented 10 years ago

I think we could possibly call this on connections that have emitted ETIMEDOUT, or even do it in libuv? cc @saghul

I'll take a look tomorrow.

saghul commented 10 years ago

I took a quick look at this and, well, it looks like setting SO_LINGER to zero would actually solve this particular problem. Funnily enough, Windows has SO_DONTLINGER, which does exactly what you think it does.

I would only expose setting SO_LINGER to zero, however, combined with some application-level timer, because apparently a non-blocking socket with SO_LINGER set to some value would block on close(). Maybe something like uv_tcp_no_linger, uv_tcp_linger or uv_tcp_dont_linger.

Now, I guess that the following could be done for node's socket.end() (take it with a grain of salt, my node-fu is not so strong).

Of course, I might be missing something super-obvious here :-/

saghul commented 10 years ago

Here is some interesting read on the matter by Bert Hubert, author of PowerDNS: http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable

soplwang commented 10 years ago

@saghul I would not suggest lingering; rather, leave it to the kernel's TCP retries and timeouts. My experience with this issue is that node does a TCP shutdown() but the remote side does not reliably shut down in return (mobile networks, etc.); node then keeps the half-open state and the underlying socket stays in FIN_WAIT2.

If we use Stream#end() to close a TCP stream for consistency (Stream#pipe() does this) instead of TCPStream#destroySoon(), we need to turn on TCP keep-alive after the shutdown, or add our own timeout logic through TCPStream#setTimeout() to detect the dead peer.

Ref1: FIN_WAIT2 [https://kb.iu.edu/d/ajmi]
Ref2: tcp_fin_timeout [https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/tcpvariables.html#AEN370]
Ref3: tcp_retries2 [https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/tcpvariables.html#AEN444]
Ref4: tcp_max_orphans [https://www.frozentux.net/ipsysctl-tutorial/chunkyhtml/tcpvariables.html#AEN388]

saghul commented 10 years ago

@soplwang What I suggested is to use a proper shutdown() together with an application-level timeout, which would then result in close() being called with no lingering, thus (hopefully) guaranteeing that the socket is closed immediately and won't remain in the FIN_WAIT2 state.
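
(At the JS level the shape of that idea is roughly the sketch below, using a hypothetical helper endWithDeadline; the forced no-linger close itself would still need libuv/node support, so destroy() stands in for it here.)

    // Half-close with end(); if the peer's FIN has not arrived within the
    // grace period, tear the connection down with destroy().
    function endWithDeadline(socket, ms) {
      socket.end();
      var timer = setTimeout(function () { socket.destroy(); }, ms);
      socket.once('close', function () { clearTimeout(timer); });
    }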

soplwang commented 10 years ago

@saghul Yeah... actually I mentioned an application-level timeout as the sub-optimal solution in that comment...

This issue is about a link-level problem (e.g., mobile networks...), so I suggest solving it with a link-level solution, i.e., TCP keep-alive.

Maybe we don't need to fix this in node at all, but instead mark TCP streams as unreliable and leave it to module/app developers. For old problematic modules/apps, a keep-alive monkey-patch may be sufficient.

Ref: Quick keepalive patch [https://github.com/soplwang/node-ka-patch].

jasnell commented 9 years ago

@trevnorris @saghul ... do either of you happen to recall the status of this one?

luckydrq commented 7 years ago

Thanks, guys. I am currently writing a TCP client, and as I dove into the TCP details I found this issue. I wonder whether this is fixed in modern node.js versions, so I have moved this issue to https://github.com/nodejs/node/issues/11572 for further discussion. I'd be very happy to hear from you.