xtaci / kcp-go

A Crypto-Secure Reliable-UDP Library for golang with FEC
MIT License
4.12k stars 737 forks source link

UDPSession no longer flushes output queue on Close #273

Open cohosh opened 3 months ago

cohosh commented 3 months ago

Ever since https://github.com/xtaci/kcp-go/commit/651dc4a6a4ff3226775d4df48632498f2125d31f, a UDPSession will not flush queued outgoing packets when the connection is closed. Instead, there is a race condition in the select call of postProcess when the s.die channel is closed in Close:

        select {
        case buf := <-s.chPostProcessing: // dequeue from post processing

[snip]
            // notify chCork only when chPostProcessing is empty
            if len(s.chPostProcessing) == 0 {
                select {
                case chCork <- struct{}{}:
                default:
                }
            }
        case <-chCork: // emulate a corked socket
            if len(txqueue) > 0 {
                s.tx(txqueue)
                // recycle
                for k := range txqueue {
                    xmitBuf.Put(txqueue[k].Buffers[0])
                    txqueue[k].Buffers = nil
                }
                txqueue = txqueue[:0]
            }

        case <-s.die:
            return
        }

if Close is called immediately after a Write, the queued packets will not be sent. This wasn't an issue before because the uncork function was called from within Close.

I noticed this because one of our unit tests started failing when we tried to update the version of this library.

xtaci commented 3 months ago

https://github.com/xtaci/kcp-go/compare/master...hotfix273 HI @cohosh, can you try this commit?

cohosh commented 3 months ago

Thanks, that got rid of some of the race conditions but it looks like there are some remaining and our tests are still failing.

There's one in the output callback:

// delivery to post processing
select {
case sess.chPostProcessing <- bts:
case <-sess.die:
return
}

We could probably just remove the select statement here and always send the bytes to sess.chPostProcessing.

I just tried this locally but even with this fix there's still another problem where by the time the output callback sends to sess.chPostProcessing, the select statement in postProcess has already died because at the time, there wasn't anything in that channel.

xtaci commented 3 months ago

Thanks, that got rid of some of the race conditions but it looks like there are some remaining and our tests are still failing.

There's one in the output callback:

// delivery to post processing
select {
case sess.chPostProcessing <- bts:
case <-sess.die:
return
}

We could probably just remove the select statement here and always send the bytes to sess.chPostProcessing.

No,we can't do that, if chPostProcessing is full(because postProcess exits). It blocks forever on waiting for sending.

I just tried this locally but even with this fix there's still another problem where by the time the output callback sends to sess.chPostProcessing, the select statement in postProcess has already died because at the time, there wasn't anything in that channel.

Yup, The thing is a session cannot wait forever. So If 'Close()' is called, the session makes the 'best-effort delivery'.