Document TCP endpoint mutexes

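In general, the two rules we need to follow are:

1. e.mu can be acquired in syscall context or in worker context.
2. The protocol goroutine can acquire e.mu while holding e.workMu.

Together these imply a lock order of e.workMu before e.mu: e.workMu must never be acquired while e.mu is held.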
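
As a rough illustration of that ordering, here is a minimal sketch. The endpoint type, its state field, and the syscallSetState and protocolWork methods are hypothetical names made up for this example, and it assumes both locks behave like plain sync.Mutex values, which may not match the real implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in for the TCP endpoint. Only the two
// mutexes and one piece of protected state are modelled.
type endpoint struct {
	workMu sync.Mutex // serializes protocol work; always taken first
	mu     sync.Mutex // protects endpoint state; always taken last
	state  string
}

// syscallSetState models a syscall-context path. It takes only e.mu and
// never touches e.workMu while holding it.
func (e *endpoint) syscallSetState(s string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.state = s
}

// protocolWork models the protocol goroutine. It takes e.workMu first and
// only then e.mu, which is the order the two rules allow.
func (e *endpoint) protocolWork() string {
	e.workMu.Lock()
	defer e.workMu.Unlock()

	e.mu.Lock()
	defer e.mu.Unlock()
	return e.state
}

func main() {
	e := &endpoint{state: "initial"}

	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); e.syscallSetState("connected") }()
	go func() { defer wg.Done(); _ = e.protocolWork() }()
	wg.Wait()

	fmt.Println(e.protocolWork())
}
```

Because no path acquires e.workMu while holding e.mu, the two goroutines cannot deadlock on each other regardless of how they interleave.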

The only place where we used to violate this was tcp/endpoint.Write(), which acquired e.workMu while holding e.mu and then called handleWrite. This happened to work because handleWrite does not acquire e.mu unless a FIN is being sent, which can't happen while a Write is still permitted. After a recent change, Write() releases e.mu before it acquires e.workMu. That fixes the ordering violation but introduces a separate race: Close() can run after e.mu is released and close the socket before handleWrite runs, and handleWrite then panics on a nil route reference.

The other issue is that some of the state transitions are done in the wrong places. For example, the transition to the shutdown states (FIN-WAIT-1, etc.) should really be done in syscall context, when Shutdown() is being handled, and not when the FIN is actually sent. That would remove the edge case where sending the FIN requires a state transition in handleWrite.
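
A minimal sketch of what moving the transition into syscall context could look like follows. The tcpState constants, the endpoint type, and shutdownWrite are hypothetical names for illustration, not the existing code. The two transitions shown are the standard TCP ones for a local close (ESTABLISHED to FIN-WAIT-1, CLOSE-WAIT to LAST-ACK).

```go
package main

import (
	"fmt"
	"sync"
)

// tcpState is a hypothetical, trimmed-down connection state.
type tcpState int

const (
	stateEstablished tcpState = iota
	stateCloseWait
	stateFinWait1
	stateLastAck
)

// endpoint is a hypothetical stand-in with only the fields this sketch needs.
type endpoint struct {
	mu    sync.Mutex
	state tcpState
}

// shutdownWrite models handling Shutdown() for the write side in syscall
// context. The state changes here, under e.mu, so the code that later
// sends the FIN does not need to take e.mu for a transition. It reports
// whether a FIN should be queued.
func (e *endpoint) shutdownWrite() bool {
	e.mu.Lock()
	defer e.mu.Unlock()

	switch e.state {
	case stateEstablished:
		e.state = stateFinWait1
	case stateCloseWait:
		e.state = stateLastAck
	default:
		// Already shut down for writing, or not connected.
		return false
	}
	return true
}

func main() {
	e := &endpoint{state: stateEstablished}
	fmt.Println(e.shutdownWrite(), e.state == stateFinWait1)
	fmt.Println(e.shutdownWrite())
}
```

With this shape the send path only has to emit the FIN, so the case where handleWrite needs e.mu goes away.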

Long term, I am of the opinion that we should just get rid of e.workMu altogether and use only the endpoint mutex for everything. This lock ordering is complicated, and it's not clear to me that it buys us anything.

One solution would be to change e.mu to be similar to e.workMu, so that it supports a TryLock() method.
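
For reference, a mutex with a non-blocking TryLock() can be sketched on top of a buffered channel. The tryMutex type below is a hypothetical illustration, not the type e.workMu actually uses. On Go 1.18 and later, sync.Mutex also provides a TryLock method directly.

```go
package main

import "fmt"

// tryMutex is a hypothetical mutex with a non-blocking TryLock, built on a
// buffered channel of capacity one. A value in the channel means "held".
type tryMutex struct {
	ch chan struct{}
}

func newTryMutex() *tryMutex {
	return &tryMutex{ch: make(chan struct{}, 1)}
}

// Lock blocks until the mutex is acquired.
func (m *tryMutex) Lock() { m.ch <- struct{}{} }

// TryLock acquires the mutex if it is free and reports whether it did.
// It never blocks.
func (m *tryMutex) TryLock() bool {
	select {
	case m.ch <- struct{}{}:
		return true
	default:
		return false
	}
}

// Unlock releases the mutex. It must only be called by the current holder.
func (m *tryMutex) Unlock() { <-m.ch }

func main() {
	mu := newTryMutex()

	mu.Lock()
	fmt.Println(mu.TryLock()) // false: already held
	mu.Unlock()

	fmt.Println(mu.TryLock()) // true: free again
	mu.Unlock()
}
```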

What Linux does is this: when a segment comes in, it tries to acquire the socket lock, and if the lock is held by the upper half, it just queues the segment to the backlog and returns.

Then, in release_sock(), it processes the backlog before releasing the socket lock. We could do something similar: the protocol goroutine tries to acquire the endpoint lock and processes the packet if it succeeds. If it fails, the packet goes on a backlog, and the syscall processes that backlog before it releases the endpoint lock and returns. This also means that some of the inbound processing happens inline in the syscall goroutine.
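
A minimal sketch of that backlog scheme, under stated assumptions: the endpoint type, its owned flag, and the lock, unlock, deliver, and handle methods are all hypothetical names, and segments are plain strings. A small inner mutex (slock here) guards the owned flag and the backlog, so a segment cannot be queued after the owner has already drained the backlog and left.

```go
package main

import (
	"fmt"
	"sync"
)

// endpoint is a hypothetical stand-in for the TCP endpoint.
type endpoint struct {
	slock   sync.Mutex // guards owned and backlog; held only briefly
	cond    *sync.Cond // wakes syscalls waiting for ownership
	owned   bool       // true while some goroutine owns the endpoint
	backlog []string   // segments that arrived while the endpoint was owned

	processed []string // only touched by the current owner
}

func newEndpoint() *endpoint {
	e := &endpoint{}
	e.cond = sync.NewCond(&e.slock)
	return e
}

// lock is the syscall-side acquire: it blocks until ownership is free.
func (e *endpoint) lock() {
	e.slock.Lock()
	for e.owned {
		e.cond.Wait()
	}
	e.owned = true
	e.slock.Unlock()
}

// unlock drains the backlog before giving up ownership, so queued
// segments are processed inline by whoever held the endpoint.
func (e *endpoint) unlock() {
	e.slock.Lock()
	for len(e.backlog) > 0 {
		pending := e.backlog
		e.backlog = nil
		e.slock.Unlock()
		for _, seg := range pending {
			e.handle(seg)
		}
		e.slock.Lock()
	}
	e.owned = false
	e.cond.Signal()
	e.slock.Unlock()
}

// handle stands in for real segment processing.
func (e *endpoint) handle(seg string) {
	e.processed = append(e.processed, seg)
}

// deliver is the protocol-goroutine path: process the segment now if the
// endpoint is free, otherwise queue it and return without blocking.
func (e *endpoint) deliver(seg string) {
	e.slock.Lock()
	if e.owned {
		e.backlog = append(e.backlog, seg)
		e.slock.Unlock()
		return
	}
	e.owned = true
	e.slock.Unlock()

	e.handle(seg)
	e.unlock()
}

func main() {
	e := newEndpoint()

	e.lock()          // a syscall owns the endpoint
	e.deliver("seg1") // queued, not processed yet
	e.deliver("seg2")
	fmt.Println(len(e.processed)) // 0
	e.unlock()                    // backlog is drained here

	e.deliver("seg3") // endpoint is free, processed inline
	fmt.Println(e.processed)
}
```

The inbound path never blocks on a syscall in this sketch. The cost is that a syscall returning from unlock may spend time processing segments that arrived while it held the endpoint.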

google/gvisor#357