selfboot / AnnotatedShadowSocks

Annotated shadowsocks(python version)
Other
3 stars 1 forks source link

Listen backlog queue size #18

Open selfboot opened 7 years ago

selfboot commented 7 years ago

In #15 and #16 , there are calls to the server.listen() method. The parameter backlog for this method is usually described as the limit for the queue of incoming connections. When multiple clients would like to connect to the server, then server holds the incoming requests in a queue. The clients are arranged in a queue, and the server processes their requests one by one as and when queue-member proceeds. This nature of connection is called queued connection.

Because of the 3-way handshake used by TCP, an incoming connection goes through an intermediate state SYN RECEIVED before it reaches the ESTABLISHED state and can be returned by the accept syscall to the application. Each time the program calls accept() on the server socket, one of the connections is removed from the queue and that slot can be used for another incoming connection.

backlog queue

This means that a TCP/IP stack has two options to implement the backlog queue for a socket in LISTEN state:

  1. The implementation uses a single queue, the size of which is determined by the backlog argument of the listen syscall. When a SYN packet is received, it sends back a SYN/ACK packet and adds the connection to the queue. When the corresponding ACK is received, the connection changes its state to ESTABLISHED and becomes eligible for handover to the application. This means that the queue can contain connections in two different state: SYN RECEIVED and ESTABLISHED. Only connections in the latter state can be returned to the application by the accept syscall.
  2. The implementation uses two queues, a SYN queue (or incomplete connection queue) and an accept queue (or complete connection queue). Connections in state SYN RECEIVED are added to the SYN queue and later moved to the accept queue when their state changes to ESTABLISHED, i.e. when the ACK packet in the 3-way handshake is received. As the name implies, the accept call is then implemented simply to consume connections from the accept queue. In this case, the backlog argument of the listen syscall determines the size of the accept queue.

Historically, BSD(Such as MAC OS) derived TCP implementations use the first approach. That choice implies that when the maximum backlog is reached, the system will no longer send back SYN/ACK packets in response to SYN packets. Usually the TCP implementation will simply drop the SYN packet (instead of responding with a RST packet) so that the client will retry.

On Linux, things are different, as mentioned in the man page of the listen syscall:

The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sockets waiting to be accepted, instead of the number of incomplete connection requests. The maximum length of the queue for incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog.

If the TCP implementation in Linux receives the ACK packet of the 3-way handshake and the accept queue is full, it will basically ignore that packet. At first, this sounds strange, but remember that there is a timer associated with the SYN RECEIVED state: if the ACK packet is not received (or if it is ignored, as in the case considered here), then the TCP implementation will resend the SYN/ACK packet.

Since the TCP implementation on the client side gets multiple SYN/ACK packets, it will assume that the ACK packet was lost and resend it (see the lines with TCP Dup ACK in the above trace). If the application on the server side reduces the backlog (i.e. consumes an entry from the accept queue) before the maximum number of SYN/ACK retries has been reached, then the TCP implementation will eventually process one of the duplicate ACKs, transition the state of the connection from SYN RECEIVED to ESTABLISHED and add it to the accept queue. Otherwise, the client will eventually get a RST packet (as in the sample shown above).

Note

The backlog argument to listen() is only advisory. POSIX says:

The backlog argument provides a hint to the implementation which the implementation shall use to limit the number of outstanding connections in the socket's listen queue.

Ref
What is “backlog” in TCP connections?
How TCP backlog works in Linux
TCP backlog works not as expected in Linux
https://stackoverflow.com/questions/5111040/listen-ignores-the-backlog-argument