stellar-deprecated / stellard

INACTIVE. Server in the Stellar network. Maintains the distributed ledger. Introduces and validates transactions. This repo is not in active development, it is being replaced by stellar-core.
Other
270 stars 60 forks source link

Fix race between queueing redirect and closing connection, close #46. #98

Closed graydon closed 10 years ago

graydon commented 10 years ago

What's happening here (as far as I can tell) is that the mtENDPOINTS packet generated during a redirect is being dropped. The sequence of events is:

The fix is as follows:

  1. detach() is split in two: detach() itself just posts a call to drainAndClose() to the strand. m_detaching is set in drainAndClose(), which is made idempotent.
  2. sendPacket() still checks m_detaching and drops a packet if it's true; but otherwise it always posts a call to the strand. sendPacketForce renamed queueOrWritePacket

These two points mean that a synchronous call sequence like

{ send(); detach(); }

in the caller definitely results in two events posted to the strand, in order. The send will make it through the test of m_detaching and get posted, because m_detaching won't have been set yet. Then further:

  1. queueOrWritePacket() is modified to queue packets to mSendQ if there's a write in progress (this formerly raced on the socket and the queue, as far as I can tell).
  2. drainAndClose() considers queue-not-empty or write-in-progress as "work pending" and returns early, doesn't fully close the socket.
  3. handleWrite re-posts queueOrSendPacket() or drainAndClose() based on the presence of work in mSendQ and the setting of m_detaching, respectively.

These three points mean that the strand gets drained into the socket if possible, the queue if necessary, with handleWrite reposting work that got queued every time a write finishes.

graydon commented 10 years ago

Updated with method names fixed to camelCase