A problem in crossing connection handling.

yihuang commented 6 years ago

Let me describe the whole process first:

Assuming A's address < B's address

* A.1 A send connect request to B, create RemoteEndPointB object
* B.1 B send connect request to A, create RemoteEndPointA object

* B.2 B get connect request from A, but find RemoteEndPointA object already exists,
      check local_addr(B) > remote_addr(A), should accept the request,
      wait for crossed MVar which would be fired in step B.3,
      reply with Accepted,
      re-use the RemoteEndPoint object, and change state to Valid
      start processing messages.

* A.2 A get connect request from B, find out RemoteEndPointB object exists,
      and local_addr(A) < remote_addr(B), should reject the request,
      just reply with Crossed message.

* B.3 B's connect request get Crossed reply,
      (should we remove the RemoteEndPoint object, or just leave in Init state?),
      fire the crossed MVar,
      close the socket,
      end with failure.
      (better if B can wait for step B.2 finish, and reuse the connection)

* A.3 A's connect request get Accepted reply,
      start processing messages.

The problem is in step B.2, after B waits for the crossed Event, B re-uses the RemoteEndPoint value(code), but in step B.3 B will close the RemoteEndPoint object and remove it(code). I think it's problematic when step B.2 and B.3 are executing concurrently. Maybe we should leave RemoteEndPoint in init state in step B.3, then step B.2 will continue to initialize it further.

Another problem is the wait for crossed MVar in step B.2 is not with a timeout, if B's connect request failed to get Crossed reply for network issues, B.2 would wait indefinitely.

facundominguez commented 6 years ago

Thanks for bringing this up.

Maybe we should leave RemoteEndPoint in init state in step B.3, then step B.2 will continue to initialize it further.

That could work provided that B.2 doesn't die first due to a connection timeout expiring, as the later problem seems call for.

Another problem is the wait for crossed MVar in step B.2 is not with a timeout, if B's connect request failed to get Crossed reply for network issues, B.2 would wait indefinitely.

Looks like a problem indeed. Possibly solvable with a transport-wide timeout parameter.

facundominguez commented 6 years ago

Maybe we should leave RemoteEndPoint in init state in step B.3, then step B.2 will continue to initialize it further.

I have a vague recollection that the remote endpoints could pile up and leak if we don't remove them here, but I'd need to check the git history to be sure.

yihuang commented 6 years ago

I think B.3 should wait for resolve MVar with timeout after firing crossed MVar, if B.2 get A's connect request successfully, then B.3 got notified, and re-use the RemoteEndPoint happily. If B.2 fails to get A's connect request, B.3 can still timeout.

facundominguez commented 6 years ago

Here's a better fix, I believe.

Have B.2 grab the remote endpoint state preventing any thread from modifying it. Decide if it is a crossed request. If it is, restore the old remote endpoint state. If it is not a crossed request, put the remote endpoint in Valid state.

This will fix the first issue because B.3 only removes the remote endpoint if it is in Init state. And it will fix the second issue, because B.2 does no longer need to wait on the crossed MVar.

yihuang commented 6 years ago

I agree, it seems there is no race condition if we have this lock.

andriytk commented 5 years ago

A sequence diagram to facilitate the problem understanding:

andriytk commented 5 years ago

The problem is in step B.2, after B waits for the crossed Event, B re-uses the RemoteEndPoint value(code), but in step B.3 B will close the RemoteEndPoint object and remove it(code). I think it's problematic when step B.2 and B.3 are executing concurrently.

Actually, B.3 do wait for the "resolved" MVar (at findRemoteEndPoint) before checking for the possible removal of the RemoteEndPoint at createConnectionTo. But the "resolved" will be put and visible to B.3 only after B.2 gets the "crossed" MVar from B.3, sends ConnectionRequestAccepted and set RemoteEndPointValid at resolveInit. And at the time of checking for the removal at createConnectionTo, the endpoint is no longer in the Init state anymore (modifyMVar for the remoteState makes sure that transition from the Init state is completed). So it looks like we are safe here.

Another problem is the wait for crossed MVar in step B.2 is not with a timeout...

As for this problem, we can tryPut the "crossed" MVar at resolveInit. If there will be no ConnectionRequestCrossed reply for B.1 request, it will timeout and call the resolveInit anyway at setupRemoteEndPoint.

andriytk commented 5 years ago

Another problem is the wait for crossed MVar in step B.2 is not with a timeout...

As for this problem, we can tryPut the "crossed" MVar at resolveInit. If there will be no ConnectionRequestCrossed reply for B.1 request, it will timeout and call the resolveInit anyway at setupRemoteEndPoint.

See https://github.com/haskell-distributed/network-transport-tcp/pull/81.

facundominguez commented 5 years ago

I agree with Andriy. Fixed via #81.

haskell-distributed / network-transport-tcp

A problem in crossing connection handling. #74