google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

netstack: GetRemoteAddr on closed TCP endpoint is nil after endpoint is disconnected #9905

vejipe closed 9 months ago

vejipe commented 10 months ago

Description

In tcpip/adapters/gonet, gVisor doesn't store addresses in the *gonet.TCPConn object. Instead, it calls the GetRemoteAddr() method on the endpoint managed by tcpip/transport/tcp... and that method has a guard on whether the socket is connected.

As a result, calling GetRemoteAddr() on a socket that is already disconnected returns nil.

This might be a bug in tcpip/transport/tcp: that method is used to implement the getpeername() syscall, so that syscall has the same problem. It is not specified whether getpeername() should return the former remote address of TCP connections that the kernel knows are closed (i.e. because they have been closed by the peer), but a quick test does show that Linux continues to return the former remote address in this case. Test case attached: test.go.txt

Related test:

Is this feature related to a specific bug?

Potentially related to https://github.com/google/gvisor/issues/3780

Do you have a specific solution in mind?

No response

vejipe commented 9 months ago

Upon deeper analysis, it turns out that I was wrong. The Linux kernel behaves the same way as gVisor: if the socket is no longer connected, getpeername() returns ENOTCONN. It is tricky to test, as you need to wait for the socket to get out of the TIME_WAIT state.

konstantin-s-bogom commented 9 months ago

Thanks for investigating this further! It does look like gVisor respects the default TIME_WAIT correctly. Did you see a discrepancy between gVisor and Linux with your original test, though? Considering that the gVisor netstack should have kept the socket in TIME_WAIT for 60s, I don't think there should have been a difference.