Open hbhasker opened 5 years ago
Another thing Linux allows is disabling delayed ACKs for a short while by setting TCP_QUICKACK, but it's not permanent: Linux can re-enable delayed ACKs despite TCP_QUICKACK being set. Linux TCP uses heuristics to decide when to enter or exit delayed-ACK mode, for example when the sender might be in slow start, and enables or disables delayed ACKs accordingly.
That said, this paper https://arxiv.org/pdf/1901.01863.pdf has an interesting section on how delayed ACKs are not as useful today, because most stacks adjust their congestion windows correctly with stretch ACKs.
Also, I believe just adding a fixed timer for the delayed ACK may do more harm than good. We would need something similar to Linux's logic to make effective use of it.
Netstack's ACK logic does not really match the TCP standards, Linux, or BSD.
Netstack today sends an ACK every time it processes a batch of packets. The batch can be 1 packet or up to maxSegmentsPerWake, which is defined to be 100. There is an upside and a downside to this.
a) When packets are small and come 1 at a time, we end up generating 1 ACK per packet, and we never piggyback the ACK on the response to a small request.
b) When packets come in bursts, we generate fewer ACKs, as we send 1 ACK per N packets rather than the recommended one ACK for every 2 * MSS bytes of data received.
b) ACKing every 2 * MSS bytes also ensures that the ACK clock keeps moving, and prevents issues where the peer may not handle ACKs that acknowledge a large amount of data. Some TCP stacks do not handle these stretch ACKs properly and can be slow to grow their congestion window.
The downside of b) is that for unidirectional transfers (like pure HTTP, not HTTPS, which constantly sends some data back in the form of TLS alerts etc.), netstack can end up generating more ACKs.
Further, without delayed ACKs we can't really implement b), as we may never ACK a segment if the amount of data received is less than 2 * MSS.
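To make the trade-off concrete, here is a rough sketch that counts ACKs under the two policies: one ACK per processed batch (what netstack does today) versus one ACK per 2 * MSS bytes. It assumes every segment is exactly one MSS and that a trailing odd segment is eventually ACKed by some delayed-ACK timer. `acksPerBatch` and `acksPerTwoMSS` are made-up names for illustration, not netstack functions.

```go
package main

import "fmt"

// acksPerBatch models the current behaviour: one ACK after each
// non-empty batch, no matter how many segments the batch contained.
func acksPerBatch(batches []int) int {
	acks := 0
	for _, n := range batches {
		if n > 0 {
			acks++
		}
	}
	return acks
}

// acksPerTwoMSS models "one ACK per 2*MSS bytes" for full-sized segments:
// every second segment triggers an ACK. A leftover odd segment is assumed
// to be ACKed later by a delayed-ACK timer, so it counts as one more ACK.
func acksPerTwoMSS(batches []int) int {
	total := 0
	for _, n := range batches {
		total += n
	}
	return (total + 1) / 2
}

func main() {
	// Case a): 100 segments arriving one at a time.
	oneAtATime := make([]int, 100)
	for i := range oneAtATime {
		oneAtATime[i] = 1
	}
	// Case b): 100 segments arriving as a single burst.
	burst := []int{100}

	fmt.Println(acksPerBatch(oneAtATime), acksPerTwoMSS(oneAtATime)) // 100 50
	fmt.Println(acksPerBatch(burst), acksPerTwoMSS(burst))           // 1 50
}
```

Under these assumptions, per-batch ACKing sends twice as many ACKs as the 2 * MSS rule when segments trickle in, and far fewer (a single stretch ACK) when they arrive in a burst.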
Delayed ACKs should help reduce the number of pure ACKs generated when the traffic is mostly request/response based, but can cause some additional latency in other cases.
Linux defaults to a small delayed ACK timeout of 40ms and tunes it further down based on RTT. FreeBSD defaults to, I believe, 100ms but does not really tune it down dynamically.
I believe that, to start, we should implement a 40ms delayed ACK timer and provide a sysctl to turn it on/off dynamically as required.