GarageGames / Torque3D

MIT Licensed Open Source version of Torque 3D from GarageGames
http://torque3d.org

Critical issue in network layer (NetEvent) #1962

Open just-bank opened 7 years ago

just-bank commented 7 years ago

We have found a very nasty bug in the TNL (Torque Network Library) code, which has been present since the very beginning (TGE 1.3+ or earlier), so I will call this a "15++ y.o. bug" in Torque.

Issue:

GuaranteedOrdered events stop being processed, which is caused by packet loss.

How to replicate:

Start two instances of the Torque engine (any engine version: T2, TGE 1.3+, or the latest T3D). Connect one to the other via GameConnection. Call %conn.setSimulatedNetParams(0.8, 0); on both sides, so we simulate 80% packet loss. Our test functions:

function clientCmdTest()
{
   echo("clientCmdTest");
   commandToServer('test');
}
function serverCmdTest(%c)
{
   echo("serverCmdTest" SPC %c);
   commandToClient(%c, 'test');
}

Now, on either side, send the 'test' cmd to the other side so it starts looping (an infinite ping-pong).

It can take some time until it stops working (it took 17 minutes for me on the first run to catch the bug, 35 minutes on the second run). In production we hit it quite often: with 500+ NetConnections online, the bug makes itself felt every few hours.

What happens:

As an example, take "10" as the current event sequence count. The server sends the 'test' event[1] to the client:

if(!bstream->writeFlag(ev->mSeqCount == prevSeq + 1)) // the prev seq was 9, so not packing 7 bits as 10 == 9+1
    bstream->writeInt(ev->mSeqCount & 0x7F, 7); // not using this

... but the packet gets lost! The server re-sends the same event[2] to the client, writing mSeqCount to the BitStream:

if(!bstream->writeFlag(ev->mSeqCount == prevSeq + 1)) // 10 != 10+1
    bstream->writeInt(ev->mSeqCount & 0x7F, 7); // writing 10 here

as the event is not the next one (we are re-sending).

The client receives event[2] and unpacks the seq via:

    if(bstream->readFlag())
        seq = (prevSeq + 1) & 0x7f; // not using this
    else
        seq = bstream->readInt(7); // <-- here we are reading 10

Further down in the code we call evt->unpack() and do some checks:

    if(seq < mNextRecvEventSeq) // comparing 10 with 10, all is fine, nothing to do
        seq += 128; // skipped

and process it inside the

while (mWaitSeqEvents && mWaitSeqEvents->mSeqCount == mNextRecvEventSeq) // 10 == 10
{
    mNextRecvEventSeq++; // Will increment to 11

loop.

The client sends an ACK for the packet we have just processed, and all is fine until that ACK packet is dropped for some reason (by simulated packet drop or a real network issue). The server will then resend the same event AGAIN. The server re-sends the same event[3] to the client, writing mSeqCount to the BitStream:

if(!bstream->writeFlag(ev->mSeqCount == prevSeq + 1)) // 10 != 10+1
    bstream->writeInt(ev->mSeqCount & 0x7F, 7); // writing 10

as the event is not the next one (we are re-sending).

The client receives event[3] and unpacks the seq via:

    if(bstream->readFlag())
        seq = (prevSeq + 1) & 0x7f; // not using this
    else
        seq = bstream->readInt(7); // <-- here we are reading 10

Now comes the "magic" part of the code:

    if(seq < mNextRecvEventSeq) // comparing 10 with 11, received seq is less
        seq += 128; // seq becomes 138!!!

Now our while loop:

while (mWaitSeqEvents && mWaitSeqEvents->mSeqCount == mNextRecvEventSeq) // 138 != 11
{
    mNextRecvEventSeq++; // will never get in here

We are now hopelessly stuck. Any further event we receive from the server will be unpacked but never processed. Events from the client to the server are still sent and processed fine (unless that direction locks up the same way).

Solution:

The simplest solution is to use writeCussedU32(). Replace this code:

if(!bstream->writeFlag(ev->mSeqCount == prevSeq + 1))
    bstream->writeInt(ev->mSeqCount & 0x7F, 7);

with:

bstream->writeCussedU32(static_cast<U32>(ev->mSeqCount));

Replace this code:

if(bstream->readFlag())
    seq = (prevSeq + 1) & 0x7f;
else
    seq = bstream->readInt(7);

with:

seq = static_cast<S32>(bstream->readCussedU32());

Remove the following:

   seq |= (mNextRecvEventSeq & ~0x7F);
   if(seq < mNextRecvEventSeq)
      seq += 128;

as we don't need it anymore.

Replace the whole while loop:

   while(mWaitSeqEvents && mWaitSeqEvents->mSeqCount == mNextRecvEventSeq)
   {
      mNextRecvEventSeq++;
      NetEventNote *temp = mWaitSeqEvents;
      mWaitSeqEvents = temp->mNextEvent;

      //Con::printf("EVT  %d: PROCESS - %d", getId(), temp->mSeqCount);
      temp->mEvent->process(this);
      temp->mEvent->decRef();
      mEventNoteChunker.free(temp);
      if(mErrorBuffer.isNotEmpty())
         return;
   }

with:

   while(mWaitSeqEvents && mWaitSeqEvents->mSeqCount <= mNextRecvEventSeq)
   {
      NetEventNote *temp = mWaitSeqEvents;
      mWaitSeqEvents = temp->mNextEvent;

      //Con::printf("EVT  %d: PROCESS - %d", getId(), temp->mSeqCount);
      if(temp->mSeqCount == mNextRecvEventSeq)
      {
         // We process only expected, any "already-processed" events will be skipped.
         temp->mEvent->process(this);
         // Advance counter to the next one
         mNextRecvEventSeq++;
      }
      temp->mEvent->decRef();
      mEventNoteChunker.free(temp);
      if(mErrorBuffer.isNotEmpty())
         return;
   }

Side note:

Even with this change, we can't prevent events with mGuaranteeType == NetEvent::Guaranteed from being processed more than once. Those are sent together with NetEvent::GuaranteedOrdered events in the same packet, so this code path runs:

      if(unguaranteedPhase)
      {
         evt->process(this);
         evt = NULL;
         if(mErrorBuffer.isNotEmpty())
            return;
         continue;
      }

When the ACK packet we sent to the server is dropped or lost, the server will resend the packet, and the same Guaranteed event will be processed again! This could lead to undefined behaviour in game-play, depending on how this type of event is used in the engine by game developers.

For us, we have decided to remove NetEvent::Guaranteed from our engine and always use NetEvent::GuaranteedOrdered. From what I see, there is no use of it in T3D MIT at the moment, so it can be safely (?) removed.

NetEvent::Unguaranteed events work the same way as before: they are never sent more than once.


Thinking aloud

  1. The downside of this: we can't use the Guaranteed type of events.
  2. More traffic in long game-play sessions (once the NetEvent counter grows large, writeCussedU32() is at a disadvantage because it needs more bits to store large numbers).

The End.

Before I submit PR, can someone comment on this?

I think this should be applied to any/all Torque engine(s) (T2D, OpenTNL, etc.), so after we finalize what we do for T3D we can spread this info to the other projects.

Areloch commented 7 years ago

Very interesting find!

What would be the use-case for NetEvent::Guaranteed over GuaranteedOrdered?

just-bank commented 7 years ago

@Areloch NetEvent::Guaranteed is assumed to be "top priority" and is always sent before any GuaranteedOrdered.

Say you have lots of "regular" GuaranteedOrdered events in the queue scheduled to be sent and you post a Guaranteed event: it is added on top of the GuaranteedOrdered queue, so it is delivered first (together with unguaranteed events, which are packed/sent only once).

But in reality I've never seen any usage of this; every project I've seen simply does:

class MyNetEvent : public NetEvent
{ ... }

and never re-declares mGuaranteeType, which by default is set to:

NetEvent() { mGuaranteeType = GuaranteedOrdered; }

Stock T3D doesn't use Guaranteed, and I've never used it myself in AW and LiF.

dottools commented 7 years ago

Using read/writeCussedU32 doesn't completely solve the problem, as it just makes the packet sequence number range larger; it'll still break eventually if you exceed 2^31 sequenced events over a GameConnection.

NetEvent::Guaranteed was intentionally designed not to care about duplicate events; NetEvent::GuaranteedOrdered does care, which is why it exists. What to do about duplicates is up to higher-level user code to decide. So Guaranteed should be left alone in case anybody wants to use it, or already is.

The Problem:

I believe the overall problem is that the sequence number range stored in the NetEvent packets, 7 bits, is just too small for massive packet loss and high packet throughput. Here's why. Torque doesn't actually use just 7 bits for sequence numbers; it uses the full 32-bit (S32) value (which really should be U32 for sanity's sake), so effectively both sides keep track of the sequence number plus a wraparound generation count (mSeqCount >> 7). Because of so much packet loss, one side ends up with a different generation count than its peer, most likely the server side, because it sends the most data compared to the client. So eventually the client gets desynced and waits forever for the right sequence number, which it will never receive.

The 7-bit sequence number made sense back in the 1990s, when pretty much everybody was on dial-up, ISDN, or at best up to 1.5Mbit ADSL. Also, the Tribes games didn't really have much going on in terms of network traffic (~15 kbytes/sec at most per client downstream), and their developers were more aware of the limitations than we are, such as the maximum number of guaranteed packets with no ACK before the engine loses track (7 bits is 128 unique packets). So with 80% packet loss and NetEvents being exchanged as fast as possible, it'll definitely cause wraparound generation sequence number desyncs eventually.

Case in point, this code checks for sequence number wrap around and accounts for it: https://github.com/GarageGames/Torque3D/blob/development/Engine/source/sim/netEvent.cpp#L352-L354 The last while loop is what does the guaranteed ordered events: https://github.com/GarageGames/Torque3D/blob/development/Engine/source/sim/netEvent.cpp#L369-L381

Notice that it's checking the received packets in the waiting queue and only processing them in sequence order, and not just by the 7-bit sequence number: it's using the full wraparound generation + 7-bit sequence number. So it makes sense that reworking the code to use full 32-bit sequence numbers across packets would make it easily recoverable.

What to do about it:

Going ahead with 32-bit sequence numbers in packets is the quickest fix, but not necessarily the complete solution, because you still have to account for wraparound. Networking technologies are becoming faster not just in bytes per second but in packets per second. Not only that: if you're using Torque for long-running game servers, like an MMO, then it has a higher chance of happening than you'd generally expect.

What needs to be done is that NetEvent has to hold off sending new ordered NetEvents when the sequence number difference between the oldest and newest pending, not-yet-ACK'd ordered packets (NewestSentSeqNum - OldestSentSeqNum) is more than (SeqNumBitSize >> 2), at most, to avoid wraparound generation desync.

Other notes:

      if(!bstream->writeFlag(ev->mSeqCount == prevSeq + 1))
         bstream->writeInt(ev->mSeqCount & 0x7F, 7);

is a neat space-saving trick: it doesn't pack the sequence number of a NetEvent that is literally next in sequence and part of the same outgoing network packet. So you might want to keep a similar procedure in there.