Closed maxsharabayko closed 3 years ago
Hi again Max!
We ran into the following panic related to CRcvBuffer:
/usr/bin/nimble(_Z14signal_handleri+0x71) [0x5684b1]
/lib64/libpthread.so.0(+0xf630) [0x7fa2e47d5630]
/lib64/libc.so.6(gsignal+0x37) [0x7fa2e3074387]
/lib64/libc.so.6(abort+0x148) [0x7fa2e3075a78]
/lib64/libc.so.6(+0x78ed7) [0x7fa2e30b6ed7]
/lib64/libc.so.6(+0x81299) [0x7fa2e30bf299]
/lib64/libsrt-nimble.so.1(_ZN10CRcvBufferD1Ev+0x46) [0x7fa2da3d0106]
/lib64/libsrt-nimble.so.1(_ZN4CUDTD1Ev+0xa0) [0x7fa2da3e06a0]
/lib64/libsrt-nimble.so.1(_ZN10CUDTSocketD1Ev+0x1c) [0x7fa2da3c623c]
/lib64/libsrt-nimble.so.1(_ZN10CUDTUnited12removeSocketEi+0x2d9) [0x7fa2da3c8739]
/lib64/libsrt-nimble.so.1(_ZN10CUDTUnited18checkBrokenSocketsEv+0x4fa) [0x7fa2da3c924a]
/lib64/libsrt-nimble.so.1(_ZN10CUDTUnited14garbageCollectEPv+0x50) [0x7fa2da3c9370]
/lib64/libpthread.so.0(+0x7ea5) [0x7fa2e47cdea5]
/lib64/libc.so.6(clone+0x6d) [0x7fa2e313c96d]
It's very similar to https://github.com/Haivision/srt/issues/1604, as this panic
is in the destructor of CRcvBuffer, inside delete[] m_pUnit;
So my guess is that some code in CRcvBuffer is writing into m_pUnit[-1].
There is some playing with "-1" indexes in
I've made a stupid fix: I add one additional element to m_pUnit and move the pointer to the next element, so that a write to m_pUnit[-1] will not lead to a panic in the destructor. I'll reply if this fix helps. But I think we have an m_pUnit[-1] access problem in CRcvBuffer. Unfortunately, I cannot reproduce this problem myself.
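For illustration, the workaround described above could look roughly like the sketch below. It is not the actual patch: `Unit`, `GuardedUnitArray`, and `guardTouched()` are hypothetical names, and the sketch assumes the buffer keeps an array of unit pointers. The idea is only that one extra slot is allocated in front of the visible array, so a stray write to index -1 stays inside the allocation, and the destructor frees the original pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the unit type stored in the receiver buffer.
struct Unit { int flag = 0; };

// Sketch of the "guard slot" workaround: allocate size + 1 elements and
// expose a pointer to the second one, so that index -1 is still inside
// the allocation instead of hitting allocator bookkeeping.
class GuardedUnitArray
{
public:
    explicit GuardedUnitArray(std::size_t size)
        : m_pAlloc(new Unit*[size + 1]()) // value-initialized: all nullptr
        , m_pUnit(m_pAlloc + 1)           // visible array starts after the guard
        , m_size(size)
    {}

    // Free the pointer returned by new[], not the shifted one.
    ~GuardedUnitArray() { delete[] m_pAlloc; }

    GuardedUnitArray(const GuardedUnitArray&) = delete;
    GuardedUnitArray& operator=(const GuardedUnitArray&) = delete;

    // Signed index on purpose, so that -1 can be expressed.
    Unit*& operator[](std::ptrdiff_t i) { return m_pUnit[i]; }

    // True if something has written to the guard slot (index -1).
    bool guardTouched() const { return m_pAlloc[0] != nullptr; }

    std::size_t size() const { return m_size; }

private:
    Unit**      m_pAlloc; // what new[] returned
    Unit**      m_pUnit;  // m_pAlloc + 1
    std::size_t m_size;
};
```

A guard slot like this hides the symptom rather than fixing the out-of-range write, but checking `guardTouched()` at destruction time could help confirm whether index -1 is really being written.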
Hi @alexpokotilo Could you please submit a dedicated GitHub issue for this crash? And the proposed fix. Even if it is not very accurate, it may still be a good start.
Hi @maxsharabayko, done. I wish I had more info right now. I'll post new findings there.
Resolved by PR #1964.
Motivation
The current implementation of the receiver buffer has several disadvantages.
Unclear logic of some functions like scanmsg().
Const-correctness violations, e.g. isRcvReady() function changes object state.
Entangled memory management and interaction with memory buffer CUnitQueue.
Spurious notification in message mode. When one message spans over several packets the epoll is notified about the ability to read data BEFORE the end of the message is actually received and is ready to be read. PR #563 addresses this problem but potentially increases the performance burden. It was therefore postponed in favor of rewriting the receiver buffer.
Spurious read-readiness when drift value changes after read-readiness is signaled (issue #1255).
The receiver buffer waits for packets to be acknowledged before allowing them to be read. As of now, the receiver buffer is only acknowledged on sending a Full ACK, which happens every 10 ms. A Lite ACK can be sent earlier at higher bitrates, but "to save time on buffer processing" the receiver buffer is not acknowledged in that case. Although 10 ms is a pretty short interval, it can still limit some ultra-low-latency use cases without a good reason. See FR #1429.
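Two of the points above (a readiness check that does not modify state, and no read-readiness before the whole message has arrived) can be sketched as follows. This is a simplified illustration under stated assumptions, not the SRT implementation: `Slot` and `hasCompleteMessage()` are hypothetical, and the buffer is modeled as a plain vector whose position 0 is the next packet to read.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified packet slot; not the SRT data structures.
struct Slot
{
    bool received; // packet is present in the buffer
    bool first;    // packet carries the "first in message" boundary
    bool last;     // packet carries the "last in message" boundary
};

// Read-only check: returns true only if a complete message starts at
// position 0. Takes the buffer by const reference, so it cannot change
// the buffer state while answering the question.
bool hasCompleteMessage(const std::vector<Slot>& buf)
{
    if (buf.empty() || !buf[0].received || !buf[0].first)
        return false;

    for (const Slot& s : buf)
    {
        if (!s.received)
            return false; // gap before the end of the message
        if (s.last)
            return true;  // every packet up to the boundary is present
    }
    return false;         // end of message has not arrived yet
}
```

With a check of this shape, read-readiness would only be signaled once the packet carrying the end-of-message boundary is in the buffer and no packet before it is missing.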
TODO
Preparational Changes
CSync::wait_until is now mapped to Condition::wait_until #1680
setupCC to access CUnitQueue via CRcvQueue instead of CRcvBuffer #1681
Receiver Buffer Functionality
readBufferToFile
UMSG_DROPREQ. Does not seem to work in the existing implementation: can't drop missing packets, the gap will remain. Also can't tell msgno of missing packets. Should an issue be submitted?
Protecting Concurrent Access
CRcvBuffer and checking its state using isRcvDataReady should happen under the m_RcvBufferLock. Especially from the code of groups. See #2094.
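The locking rule in the last item can be illustrated with a minimal sketch. It assumes a standard `std::mutex` in place of the SRT synchronization primitives, and `LockedRcvBuffer`, `push()`, `isDataReady()`, and `tryRead()` are hypothetical names chosen for the example. The point is that both the state check and the read take the same lock, and that a check followed by a read is done under a single lock acquisition.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical buffer wrapper; the packet payload is reduced to an int.
class LockedRcvBuffer
{
public:
    void push(int pkt)
    {
        std::lock_guard<std::mutex> lk(m_lock);
        m_data.push_back(pkt);
    }

    // State check under the lock; const because it does not modify the buffer.
    bool isDataReady() const
    {
        std::lock_guard<std::mutex> lk(m_lock);
        return !m_data.empty();
    }

    // Check and read in one critical section, so another thread cannot
    // drain the buffer between the readiness check and the read.
    bool tryRead(int& out)
    {
        std::lock_guard<std::mutex> lk(m_lock);
        if (m_data.empty())
            return false;
        out = m_data.front();
        m_data.pop_front();
        return true;
    }

private:
    mutable std::mutex m_lock; // mutable so const methods can lock it
    std::deque<int>    m_data;
};
```

Calling `isDataReady()` and then reading in a separate step would still leave a window between the two calls, which is why the sketch offers `tryRead()` as a combined operation.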