maxsharabayko / srt-xtransmit

SRT transmission application used for testing
36 stars 17 forks source link

Memory Usage Growth up to a Limit on Reconnection #96

Open maxsharabayko opened 3 days ago

maxsharabayko commented 3 days ago

With every new connection, the CUnitQueue capacity keeps increasing.

Steps to Reproduce

srt-xtransmit receive srt://:4200?groupconnect=1 --reconnect -v

srt-xtransmit generate srt://ip:4200 srt://ip:4200 --sendrate 5Mbps --duration 5s --reconnect -v

Reason

Memory grows upon reconnection because sender breaks the connection and establishes a new one almost immediately due to low RTT. Hence it starts sending packets over those member links. Meanwhile, the receiver is still reading packets from a group. A new group connection is in the backlog and is not yet accepted by the application. So all those packets are stored in the receiver buffer for quite some time.

(normal flow, about 60 packets in the RCV buffer)
18:06:48.811000/T27156!W:SRT.br: @452069845 RCV buffer readMessage m_iMaxPosOff 59

(T0 - meanwhile reconnection happens in the background)
18:06:48.829000/T22900*E:SRT.gm: addGroup: @1525811665
18:06:48.830000/T22900.N:SRT.gm: group/add: adding member @452069842 into group $1525811665
18:06:48.830000/T22900.N:SRT.cn: @452069842 connection on listener @452069846 (127.0.0.1:4200) from peer @1044177265 (127.0.0.1:57913)
18:06:48.831000/T22900.N:SRT.qr: @452069842: SOCKET pending for connection - ADDING TO RCV QUEUE/MAP
18:06:48.831000/T22900.N:SRT.gm: group/add: adding member @452069840 into group $1525811665
18:06:48.832000/T22900.N:SRT.cn: @452069840 connection on listener @452069846 (127.0.0.1:4200) from peer @1044177264 (127.0.0.1:57914)
18:06:48.833000/T22900.N:SRT.qr: @452069840: SOCKET pending for connection - ADDING TO RCV QUEUE/MAP
18:06:48.833000/T22900.N:SRT.qr: CUnitQueue::state: Capacity 256 units, 108 in use.

(T+10ms - previous group members are broken)
18:06:48.843000/T22900.N:SRT.qr: @452069845: SOCKET broken, REMOVING FROM RCV QUEUE/MAP.
18:06:48.843000/T22900.N:SRT.qr: @452069843: SOCKET broken, REMOVING FROM RCV QUEUE/MAP.
18:06:48.869000/T27156!W:SRT.br: @452069845 RCV buffer readMessage m_iMaxPosOff 38

(T+60ms members *40 and *42 are receiving packets, the group is not accepted by the app yet)
18:06:48.888000/T22900!W:SRT.br: @452069840 RCV buffer insert m_iMaxPosOff 19
18:06:48.889000/T22900!W:SRT.br: @452069842 RCV buffer insert m_iMaxPosOff 19
18:06:48.931000/T27156!W:SRT.br: @452069845 RCV buffer readMessage m_iMaxPosOff 9

(T+245ms  - still receiving, but not reading, capacity of the memory pool has to be increased)
18:06:49.074000/T22900!W:SRT.br: @452069842 RCV buffer insert m_iMaxPosOff 107
18:06:49.074000/T22900!W:SRT.br: @452069840 RCV buffer insert m_iMaxPosOff 107

(T+1010ms - now removing members)
18:06:49.808000/T22900.N:SRT.qr: CUnitQueue::increase: Capacity 1024 + 128 new units, 922 in use.
18:06:49.824000/T22900!W:SRT.br: @452069840 RCV buffer insert m_iMaxPosOff 463
18:06:49.824000/T22900!W:SRT.br: @452069842 RCV buffer insert m_iMaxPosOff 463
18:06:49.839000/T28584.N:SRT.gm: group/remove: removing member @452069843 from group $1525811668
18:06:49.839000/T28584.N:SRT.gm: group/remove: removing member @452069845 from group $1525811668
18:06:49.839000/T27156*E:SRT.gr: grp/recv: $1525811668: ABANDONING: opened=false connected=false
18:06:49.840194 [W] RECEIVE read::recv: Connection does not exist
18:06:49.885000/T22900!W:SRT.br: @452069840 RCV buffer insert m_iMaxPosOff 493
18:06:49.886000/T22900!W:SRT.br: @452069842 RCV buffer insert m_iMaxPosOff 493

(T+1788ms - finally application accepts new group connection and starts reading)
18:06:50.617041 [I] SOCKET::SRT @452069846 (srt://:4200) Accepted connection @1525811665. TSBPD Latency RCV 120ms, peer 0ms. KM state INVALID (RCV UNSECURED, SND UNSECURED). PB key length: -1. Cryptomode INVALID. Stream ID: not set.
18:06:50.617934 [I] RECEIVE Latency, us: avg n/a, min n/a, max n/a. Jitter: 0us. Delay Factor: 1us. Pkts: rcvd 0, reordered 0 (dist 0), lost 0, MD5 err 0, bad len 0.
18:06:50.620000/T27156!W:SRT.br: @452069842 CRcvBuffer.readMessage() : m_iMaxPosOff 800
18:06:50.624000/T27156!W:SRT.br: @452069842 CRcvBuffer.readMessage() : m_iMaxPosOff 700

(initial end-to-end delay on reconnection is 1.776 seconds instead of 120 ms)
18:06:51.626057 [I] RECEIVE Latency, us: avg 125384, min 124575, max 1776108. Jitter: 96us. Delay Factor: 1651344us. Pkts: rcvd 1259, reordered 0 (dist 0), lost 0, MD5 err 0, bad len 0.

Possible Solution

If further connections are possible move the reading from a socket to a separate thread instantiated upon every connection. This way the second connection would trigger a new receiving thread, the receiving would happen earlier and in parallel with the previous receiving.

maxsharabayko commented 3 days ago

image