nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Prevent socket destruction when keepAliveTimeout fires #17749

Closed andrsnn closed 6 years ago

andrsnn commented 6 years ago

I'm currently seeing an issue where a Node service under heavy load behind a load balancer intermittently sends a RST, ACK during a `Connection: keep-alive` request, resulting in a 502 on the client (the load balancer passes back the 502).

Upon further inspection there is a 5-second window in which no requests are received from the load balancer. It appears that the default 5-second `keepAliveTimeout` is hit, and Node then closes the connection and destroys the socket. In most cases (99% of requests that time out) Node successfully tears down the connection with a proper TCP teardown handshake between the load balancer and Node. However, there appears to be an edge case where the load balancer sends a request just before the TCP teardown occurs, which results in a RST being sent back to the load balancer (and hence the 502). This appears to happen because Node issues a `close` syscall, destroying the socket, directly before initiating the teardown handshake.

Normal teardown wireshark capture:

236056  2017-12-07 19:51:31.961253  10.23.34.195    10.23.67.39 HTTP    1067    POST /api HTTP/1.1  (application/json)
236057  2017-12-07 19:51:31.961267  10.23.67.39 10.23.34.195    TCP 66  80→52024 [ACK] Seq=12801 Ack=33052 Win=95232 Len=0 TSval=242257 TSecr=1567689
236680  2017-12-07 19:51:32.061090  10.23.67.39 10.23.34.195    HTTP    466 HTTP/1.1 200 OK  (application/json)
236681  2017-12-07 19:51:32.061368  10.23.34.195    10.23.67.39 TCP 66  52024→80 [ACK] Seq=33052 Ack=13201 Win=62464 Len=0 TSval=1567714 TSecr=242282
248801  2017-12-07 19:51:37.067175  10.23.67.39 10.23.34.195    TCP 66  80→52024 [FIN, ACK] Seq=13201 Ack=33052 Win=95232 Len=0 TSval=243533 TSecr=1567714
248807  2017-12-07 19:51:37.067480  10.23.34.195    10.23.67.39 TCP 66  52024→80 [FIN, ACK] Seq=33052 Ack=13202 Win=62464 Len=0 TSval=1568966 TSecr=243533
248808  2017-12-07 19:51:37.067504  10.23.67.39 10.23.34.195    TCP 66  80→52024 [ACK] Seq=13202 Ack=33053 Win=95232 Len=0 TSval=243533 TSecr=1568966

RST Wireshark capture:

221188  2017-12-07 19:51:26.881806  10.23.34.195    10.23.67.39 HTTP    1068    POST /api HTTP/1.1  (application/json)
221189  2017-12-07 19:51:26.881838  10.23.67.39 10.23.34.195    TCP 66  80→52026 [ACK] Seq=7201 Ack=19033 Win=67072 Len=0 TSval=240987 TSecr=1566420
221814  2017-12-07 19:51:27.018326  10.23.67.39 10.23.34.195    HTTP    466 HTTP/1.1 200 OK  (application/json)
221815  2017-12-07 19:51:27.018601  10.23.34.195    10.23.67.39 TCP 66  52026→80 [ACK] Seq=19033 Ack=7601 Win=47360 Len=0 TSval=1566454 TSecr=241021
236528  2017-12-07 19:51:32.042525  10.23.34.195    10.23.67.39 HTTP    1066    POST /api HTTP/1.1  (application/json)
236529  2017-12-07 19:51:32.042538  10.23.67.39 10.23.34.195    TCP 66  80→52026 [ACK] Seq=7601 Ack=20033 Win=69120 Len=0 TSval=242277 TSecr=1567710
236697  2017-12-07 19:51:32.066487  10.23.67.39 10.23.34.195    TCP 66  80→52026 [RST, ACK] Seq=7601 Ack=20033 Win=69120 Len=0 TSval=242283 TSecr=1567710

Sysdig capture taken at the same time as the RST capture. The application is dockerized and the container exposes port 3000, although the results are reproducible outside of Docker.

832662 19:51:27.018305861 0 node (2989) > writev fd=120(<4t>10.23.34.195:52026->172.17.0.2:3000) size=400
832663 19:51:27.018341806 0 node (2989) < writev res=400 data=HTTP/1.1 200 OK..Content-Type: application/json; charset=
1004716 19:51:32.066469320 0 node (2989) > close fd=120(<4t>10.23.34.195:52026->172.17.0.2:3000)
1004717 19:51:32.066470002 0 node (2989) < close res=0

Ordered by sequence of events:

19:51:26.881806  10.23.34.195    10.23.67.39 HTTP    1068    POST /api HTTP/1.1  (application/json)
19:51:26.881838  10.23.67.39 10.23.34.195    TCP 66  80→52026 [ACK] Seq=7201 Ack=19033 Win=67072 Len=0 TSval=240987 TSecr=1566420
19:51:27.018305861 0 node (2989) > writev fd=120(<4t>10.23.34.195:52026->172.17.0.2:3000) size=400
19:51:27.018326  10.23.67.39 10.23.34.195    HTTP    466 HTTP/1.1 200 OK  (application/json)
19:51:27.018341806 0 node (2989) < writev res=400 data=HTTP/1.1 200 OK..Content-Type: application/json; charset=
19:51:27.018601  10.23.34.195    10.23.67.39 TCP 66  52026→80 [ACK] Seq=19033 Ack=7601 Win=47360 Len=0 TSval=1566454 TSecr=241021
19:51:32.042525  10.23.34.195    10.23.67.39 HTTP    1066    POST /api HTTP/1.1  (application/json)
19:51:32.042538  10.23.67.39 10.23.34.195    TCP 66  80→52026 [ACK] Seq=7601 Ack=20033 Win=69120 Len=0 TSval=242277 TSecr=1567710
19:51:32.066469320 0 node (2989) > close fd=120(<4t>10.23.34.195:52026->172.17.0.2:3000)
19:51:32.066470002 0 node (2989) < close res=0
19:51:32.066487  10.23.67.39 10.23.34.195    TCP 66  80→52026 [RST, ACK] Seq=7601 Ack=20033 Win=69120 Len=0 TSval=242283 TSecr=1567710

As you can see above, the timeout fires, a request is received, Node closes the socket, and the pending request is then answered with a RST.

It seems this behavior is probably expected, as it is simply a race condition. Are there any lifecycle hooks or lower-level controls exposed that would allow waiting for a FIN, ACK from the load balancer?

andrsnn commented 6 years ago

Closing, as it seems the race condition would exist even with more granular control. The only real solution seems to be to ensure the requesting client is aware of the timeout, so that idle-connection teardown starts on the client side.