google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0
15.33k stars 1.27k forks source link

TCP conntrack does not honor window sizes correctly. #6734

Open hbhasker opened 2 years ago

hbhasker commented 2 years ago

Description

tcpconntrack package uses the segment's window size specified in the TCP segment header to determine if an incoming segment is acceptable. But it doesn't take into account any window scaling that might be in use as it completely ignores TCP options in the SYN(see: https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L70).

Which means the check here https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L227 will incorrectly reject packets as unacceptable for anything that might lie beyond the scaled down value of the actual window.

It also means this check https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L227 will incorrectly determine the end of the acceptable window resulting in the FIN sequence being incorrect here

https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L266

Which also means that the s.closed() will incorrectly return true once the sequence number that we incorrectly assumed to be the fin sequence number is ACKed.

TLDR; we need to parse options in the SYN/SYN-ACK to determine if the window scaling is in effect and apply appropriate window scaling in each direction as its not required to be symmetric when calculating the actual window size.

Steps to reproduce

No response

runsc version

No response

docker version (if using docker)

No response

uname

No response

kubectl (if using Kubernetes)

No response

repo state (if built from source)

No response

runsc debug logs (if available)

No response

hbhasker commented 2 years ago

Correction the FIN sequence number is right above , only the acceptable check needs to be fixed. The end used to initialize fin sequence is overwritten with the logicalLen of the segment which is correct.

https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L259

kevinGC commented 2 years ago

Looks like we actually don't reject packets for being larger than the unscaled window. Because tcpconntrack was written pre-PacketBuffer, it assumes that passed-in header.TCP buffers also contain the payload. You can see this assumption when calculating the payload size:

https://github.com/google/gvisor/blob/81f284f9d4af4170c86ad182e2f6f2cc9b3c565a/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go#L329-L331

But we always pass only the TCP header without payload. The real bug is that we're always checking the window size against an apparently payload-less packet, so we never reject packets based on rwnd.

hbhasker commented 2 years ago

That is a separate bug, but we do need to use the scaled window because otherwise we could reject valid packets. Also I am wondering about a few more issues with our conntrack implementation.

IT doesn't deal well with any reordering. eg. if a FIN is received out of order i.e. if the peer does the following.

<--- data <--- data <--- FIN

but we see it as <--- FIN <--- data <--- data

and lets say WINDOW scaling is in effect and its possible s.end used here https://source.corp.google.com/piper///depot/google3/third_party/gvisor/pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go;rcl=321003353;l=227 could be a really small value resulting in us just ignoring the FIN and returning Alive till the peer retransmits the FIN etc.

kevinGC commented 2 years ago

Fixed by 5e984d5aa2c3f98934040e199d460c74744e5799.

github-actions[bot] commented 2 years ago

There are TODOs still referencing this issue:

  1. pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go:83: Cache TCP options instead of re-parsing them.
  2. pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go:186: Cache TCP options instead of re-parsing them.

Search TODO

github-actions[bot] commented 10 months ago

A friendly reminder that this issue had no activity for 120 days.

github-actions[bot] commented 7 months ago

This issue has been closed due to lack of activity.

github-actions[bot] commented 6 months ago

There are TODOs still referencing this issue:

  1. pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go:83: Cache TCP options instead of re-parsing them.
  2. pkg/tcpip/transport/tcpconntrack/tcp_conntrack.go:186: Cache TCP options instead of re-parsing them.

Search TODO

github-actions[bot] commented 2 months ago

A friendly reminder that this issue had no activity for 120 days.