Haivision / srt

Secure, Reliable, Transport
https://www.srtalliance.org
Mozilla Public License 2.0
3.06k stars 839 forks source link

RCV Loss List and concurrency in groups #2881

Open maxsharabayko opened 6 months ago

maxsharabayko commented 6 months ago

This issue is intended to consolidate and track the situation around the RCV loss list and concurrency.

The following issue was submitted for SRT v1.4.4 (https://github.com/Haivision/srt/issues/2192#issuecomment-983376741):

When processData() received 899264404, it was added into the recv buffer, after recvbuf_acklock was unlocked, the tsbpd thread played 899264404 and dropped 899264403, then processData() immediately insert 899264403 back into loss list, after this, getFirstLostSeq() keep return 899264403 To fix the followin issue

PR #2195 introduced the m_iLargestSeq to resolve the bug.

It was later reported in #2873, that if a consecutive loss occurs after more than m_iSeqNoTH to the previous loss record (already removed from the loss list), the loss record is not recorded in the loss list. An ACK packet with a wrong seqno is sent as a result.

This issue was addressed in #2877 by resetting m_iLargestSeq to SRT_SEQNO_NONE if the list becomes empty. @gou4shi1 later shared his concerns this may reintroduce the initial concurrency issue.

gou4shi1 commented 6 months ago

I am afraid I forgot how does the m_iLargestSeq fixes the concurrency issue.

m_iLargestSeq is updated is both insert() and remove(), so for the case in

When processData() received 899264404, it was added into the recv buffer, after recvbuf_acklock was unlocked, the tsbpd thread played 899264404 and dropped 899264403, then processData() immediately insert 899264403 back into loss list, after this, getFirstLostSeq() keep return 899264403

after 899264403 is removed from loss list by tsbpd(), m_iLargestSeq is updated to 899264403, then processData() can not add back 899264403 into loss list later.

gou4shi1 commented 6 months ago

With PR #2877, after 899264403 is removed from loss list by tsbpd(), the loss list is empty, m_iLargestSeq is updated to -1, then processData() can add back 899264403 into loss list later. Thus I think PR #2877 is not a correct fix.

gou4shi1 commented 6 months ago

We can update m_iLargestSeq in some places (e.g. ack) to fix the m_iSeqNoTH issue in #2873. @kura979 What do you think?

gou4shi1 commented 6 months ago

A similar issue is #1895, caused by seqno wrap around.

kura979 commented 6 months ago

@gou4shi1

We can update m_iLargestSeq in some places (e.g. ack) to fix the m_iSeqNoTH issue in https://github.com/Haivision/srt/issues/2873.

It's OK if it means to keep updating m_iLargestSeq so that the distance is less than m_iSeqNoTH.

maxsharabayko commented 1 month ago

PR #2931 temporary resolves this until #2527 is merged.