cisco / exanic-software

ExaNIC drivers, utilities and development libraries

Possible race condition with sending packets and ACKs #16

Open abbradar opened 5 years ago

abbradar commented 5 years ago

We have discovered a possible race condition between exasock_tcp_send_advance and ACKs to remote hosts' packets. Consider this case:

In this case the remote host receives an "impossible ACK": under normal circumstances the SEQ in packet B can never be less than the SEQ in packet A, yet because the kernel module and the userspace application sending packets run in different threads, this can theoretically happen. We have observed this in a real setting, because random delays can occur between sending packets via libexanic and calling exasock_tcp_send_advance.
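To make "SEQ in packet B is less than SEQ in packet A" concrete, here is a minimal sketch of the wraparound-safe comparison a receiver or capture tool could use to flag such an ACK. The helper names (`seq_before`, `seq_before_eq`, `is_impossible_ack`) are hypothetical and are not taken from the exasock sources; the check itself is an assumption about how one might detect the condition, not a description of any particular TCP stack.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a precedes b in 32-bit modular sequence space.
 * The signed cast makes the comparison survive wraparound past 2^32. */
static bool seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* True if a precedes or equals b. */
static bool seq_before_eq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/* Hypothetical check: an empty ACK is "impossible" if its SEQ lies before
 * the highest sequence number already seen from the same sender. */
static bool is_impossible_ack(uint32_t ack_pkt_seq, uint32_t highest_seq_seen)
{
    return seq_before(ack_pkt_seq, highest_seq_seen);
}
```

With illustrative numbers: if data up to sequence number 1100 has already arrived and an empty ACK then shows up carrying SEQ 1000 (a stale value, read before the sender's sequence counter was advanced), `is_impossible_ack(1000, 1100)` is true.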

A different vendor, Solarflare, handles this by deliberately setting the SEQ value in empty ACKs to a value from the future, namely send_seq + min(rwnd_len, cwnd_len, mss) (it is a bit more complicated than that, but you get the picture). This way those ACKs are technically always correct and merely appear severely out of order. An immediate downside of this solution is that traffic sent this way appears severely broken to analysis tools like Wireshark, and for good reason.
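The simplified formula above can be sketched as follows. This only illustrates the arithmetic as stated in this comment; the function name `future_ack_seq` and its parameters are hypothetical, and the real Solarflare logic is, as noted, more involved.

```c
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* SEQ to place in an empty ACK under the simplified scheme:
 * send_seq + min(rwnd_len, cwnd_len, mss).
 * Unsigned addition wraps modulo 2^32, matching TCP sequence space. */
static uint32_t future_ack_seq(uint32_t send_seq, uint32_t rwnd_len,
                               uint32_t cwnd_len, uint32_t mss)
{
    uint32_t headroom = min_u32(min_u32(rwnd_len, cwnd_len), mss);
    return send_seq + headroom;
}
```

For example, with send_seq = 1000, rwnd_len = 65535, cwnd_len = 14600 and mss = 1460, the ACK would carry SEQ 2460. When either window is zero the headroom is zero and the result falls back to the plain send_seq.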

Is this race condition dangerous in the wild? Do you have any data on how various TCP stacks handle "impossible ACKs"? Do you see any other solutions to this problem besides the one proposed? We have a patch that implements it in case you wish to experiment, but because of the downsides above it is obviously not fit for mainline as is.

QiweiWen commented 5 years ago

Hi Nikolay,

Thanks so much for getting to the bottom of the issue and submitting the PR. There are indeed places in our TCP stack where the assumption of before_eq(send_ack, send_seq) is broken by the extension API.

We've looked at your PR and like it very much, and we are in the process of implementing a fix for the problem in our internal software repo.

Again, thanks for choosing our product and we greatly appreciate your ongoing contributions.

Best Regards, Dave