google / gvisor

Application Kernel for Containers
https://gvisor.dev
Apache License 2.0

Document TCP endpoint mutexes #357

Open iangudger opened 5 years ago

iangudger commented 5 years ago

There are a few mutexes in the TCP endpoint. Their relative lock ordering, and even exactly what each one protects, is not well documented.

hbhasker commented 4 years ago

In general the two rules we need to follow are

The other issue is that some of the state transitions are done in the wrong places. E.g., the transition to the shutdown states (FIN-WAIT-1, etc.) should really be done in syscall context when Shutdown() is being handled, not when the FIN is actually sent. That would prevent the edge case where sending the FIN requires a state transition in handleWrite.

Long term, I am of the opinion that we should just get rid of the workMutex altogether and use only the endpoint mutex for everything. This lock ordering is complicated, and it's not clear to me that it buys us anything.

One solution would be to change e.mu to be similar to e.workMu, such that it supports a TryLock() method.
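As a rough illustration of what a mutex with a TryLock() method could look like, here is a minimal sketch built on a one-slot channel. The `tryMutex` type and `newTryMutex` constructor are hypothetical names made up for this example; this is not the actual implementation behind e.workMu. (Since Go 1.18, `sync.Mutex` also provides a `TryLock` method.)

```go
package main

import "fmt"

// tryMutex is a hypothetical mutex for illustration only. A token in the
// one-slot channel means the mutex is currently held.
type tryMutex struct {
	ch chan struct{}
}

func newTryMutex() *tryMutex {
	return &tryMutex{ch: make(chan struct{}, 1)}
}

// Lock blocks until the mutex is acquired.
func (m *tryMutex) Lock() { m.ch <- struct{}{} }

// Unlock releases the mutex. It must only be called by the holder.
func (m *tryMutex) Unlock() { <-m.ch }

// TryLock acquires the mutex only if that can be done without blocking,
// and reports whether it succeeded.
func (m *tryMutex) TryLock() bool {
	select {
	case m.ch <- struct{}{}:
		return true
	default:
		return false
	}
}

func main() {
	m := newTryMutex()
	fmt.Println(m.TryLock()) // true: the mutex was free
	fmt.Println(m.TryLock()) // false: already held, and we did not block
	m.Unlock()
	fmt.Println(m.TryLock()) // true: free again after Unlock
}
```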

What Linux does is: when a segment comes in, it tries to acquire the lock, and if the lock is held by the upper half, it just queues the segment to the backlog and goes away.

Then in _sock_unlock() it processes the backlog before releasing the endpoint lock. We could do something similar: in the protocol goroutine we try to acquire the lock and process the packet if we get it; if not, we queue the packet, and when the syscall releases the endpoint lock it processes the backlog first and then returns. This also ensures that some of the inbound processing happens inline in the syscall goroutine.
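The backlog idea could be sketched roughly as follows. This is a toy model under stated assumptions, not gVisor or Linux code: `endpoint`, `segment`, `lockUser`, `unlockUser`, and `deliver` are all hypothetical names. An `owned` flag stands in for the endpoint lock being held by a syscall, and a short-held internal mutex protects both that flag and the backlog, so a segment cannot be queued after the final drain and then left stranded.

```go
package main

import (
	"fmt"
	"sync"
)

// segment is a stand-in for an inbound TCP segment.
type segment struct{ seq int }

// endpoint is a hypothetical model of the scheme, for illustration only.
type endpoint struct {
	mu        sync.Mutex // short-held; protects owned and backlog
	cond      *sync.Cond // wakes syscalls waiting for ownership
	owned     bool       // true while a syscall holds the endpoint
	backlog   []segment  // segments that arrived while owned
	processed []int      // record of processing order, for the demo
}

func newEndpoint() *endpoint {
	e := &endpoint{}
	e.cond = sync.NewCond(&e.mu)
	return e
}

// process always runs with exclusive access to the endpoint.
func (e *endpoint) process(s segment) { e.processed = append(e.processed, s.seq) }

// lockUser is the syscall-side lock.
func (e *endpoint) lockUser() {
	e.mu.Lock()
	for e.owned {
		e.cond.Wait()
	}
	e.owned = true
	e.mu.Unlock()
}

// unlockUser drains the backlog inline before giving up ownership.
func (e *endpoint) unlockUser() {
	e.mu.Lock()
	for len(e.backlog) > 0 {
		batch := e.backlog
		e.backlog = nil
		e.mu.Unlock() // still owned, so new arrivals keep queueing
		for _, s := range batch {
			e.process(s)
		}
		e.mu.Lock()
	}
	e.owned = false
	e.mu.Unlock()
	e.cond.Signal()
}

// deliver is the protocol-goroutine side: process now if the endpoint is
// free, otherwise queue to the backlog and return immediately.
func (e *endpoint) deliver(s segment) {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.owned {
		e.backlog = append(e.backlog, s)
		return
	}
	e.process(s)
}

func main() {
	e := newEndpoint()
	e.deliver(segment{seq: 1}) // endpoint free: processed inline
	e.lockUser()
	e.deliver(segment{seq: 2}) // endpoint owned: queued
	e.deliver(segment{seq: 3})
	fmt.Println(e.processed) // [1]
	e.unlockUser()           // backlog drained in the syscall goroutine
	fmt.Println(e.processed) // [1 2 3]
}
```

The loop in `unlockUser` re-checks the backlog after each batch because segments may arrive while the batch is being processed; ownership is only dropped once the backlog is observed empty under the internal mutex.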