webrtc-rs / webrtc

A pure Rust implementation of WebRTC
https://webrtc.rs
Apache License 2.0

UDP socket leaks #608

Open tubzby opened 2 weeks ago

tubzby commented 2 weeks ago

I have a program that works as a proxy, converting RTSP streams to WebRTC streams so I can view my cameras in the browser. Normally a connection is closed when I close the tab, and I monitor the ICE state to close the PeerConnection. Every time I call PeerConnection.close().await, it complains:

Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  INFO avlib::pc_wrapper: close
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc::peer_connection::peer_connection_internal: Failed to accept RTCP SessionSRTP has been closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc::peer_connection::peer_connection_internal: Failed to accept RTP SessionSRTP has been closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc_ice::agent::agent_internal: [controlled]: Failed to close candidate tcp4 host 192.168.0.28:9: the agent is closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc_ice::agent::agent_internal: [controlled]: Failed to close candidate udp4 host 192.168.0.28:63422: the agent is closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc_ice::agent::agent_internal: [controlled]: Failed to close candidate udp4 srflx 58.250.221.39:63422 related 192.168.0.28:63422: the agent is closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  WARN webrtc_ice::agent::agent_internal: [controlled]: Failed to close candidate udp4 relay 112.74.41.31:64537 related 58.250.221.39:63422: the agent is closed
Aug 28 14:46:56 debian10.toybrick rtsp2webrtc[23297]:  INFO webrtc_ice::agent::agent_internal: [controlled]: Setting new connection state: Closed

and none of the UDP sockets are ever closed. I confirmed that with lsof -n -p $pid.

tubzby commented 2 weeks ago

[screenshot: tokio-console output]

From the output of tokio-console, there are 2 tasks in a warning state: "2 tasks have lost their waker". This might be the cause of the leaks.

---EDIT 1-------

After adding some debug logging, it seems there is an orphan task that never exits:

https://github.com/webrtc-rs/webrtc/blob/master/dtls/src/conn/mod.rs#L322

tubzby commented 2 weeks ago

I'm stuck, but with some findings:

  1. turn/src/client/mod.rs: the Client created in gather_candidates_relay was never dropped.
  2. PeerConnection, Agent, and AgentInternal were dropped.
  3. Candidate.close() was called, but for a UdpSocket it does nothing; the socket has to be dropped.
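
Finding 3 can be reproduced in isolation. The following is only a minimal sketch: it uses std::net::UdpSocket instead of the async socket type the project uses (an assumption made to keep it dependency-free), and `rebind_results` is a hypothetical helper. It shows that a socket shared through an Arc keeps its port until the last strong reference is dropped.

```rust
use std::io;
use std::net::UdpSocket;
use std::sync::Arc;

/// Hypothetical helper. Returns two flags:
/// (could the port be rebound while a stray clone was alive,
///  could it be rebound after every clone was dropped).
fn rebind_results() -> io::Result<(bool, bool)> {
    // Bind to an ephemeral loopback port and remember which one we got.
    let socket = Arc::new(UdpSocket::bind("127.0.0.1:0")?);
    let addr = socket.local_addr()?;

    // A clone held somewhere else, e.g. captured by a task nobody stops.
    let stray = Arc::clone(&socket);

    // The "owner" lets go, but the file descriptor is still open
    // because `stray` keeps the strong count above zero.
    drop(socket);
    let while_held = UdpSocket::bind(addr).is_ok();

    // Only dropping the last strong reference closes the socket.
    drop(stray);
    let after_drop = UdpSocket::bind(addr).is_ok();

    Ok((while_held, after_drop))
}
```

Calling `rebind_results()` is expected to report that the first rebind fails and the second succeeds, which matches what lsof shows when a clone outlives close().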

It might be related to the screenshot above: for some reason, the task created at agent_gather.rs:773 lost its waker, so the UdpSocket was never released.

tubzby commented 2 weeks ago

Update:

I have been using Arc::strong_count to monitor Arc<CandidateBase> and found that the count is 6.

One of those references belongs to the tokio task I created to spy on it, so the actual reference count is 5, and those remaining references are what cause this problem.
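
The off-by-one from the observer can be sketched as follows. This is an illustration only: `CandidateBase` here is an empty stand-in, `observed_and_actual` is a hypothetical helper, and a std thread replaces the tokio task to avoid external dependencies.

```rust
use std::sync::Arc;
use std::thread;

/// Empty stand-in for the real CandidateBase (assumption).
struct CandidateBase;

/// Hypothetical helper: creates `owners` strong references, then lets an
/// observer thread report the count it sees. Returns (observed, actual).
fn observed_and_actual(owners: usize) -> (usize, usize) {
    let candidate = Arc::new(CandidateBase);

    // Simulate the other places that hold the candidate.
    let others: Vec<Arc<CandidateBase>> =
        (1..owners).map(|_| Arc::clone(&candidate)).collect();

    // The observer needs its own clone, which is itself a strong reference.
    let spy = Arc::clone(&candidate);
    let observed = thread::spawn(move || Arc::strong_count(&spy))
        .join()
        .unwrap();

    // The observer's clone is gone once the thread has finished.
    let actual = Arc::strong_count(&candidate);

    drop(others);
    (observed, actual)
}
```

With five real owners, `observed_and_actual(5)` returns `(6, 5)`: the observer always sees one more than the number of references that actually need tracking down.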

tubzby commented 2 weeks ago

Finally, it comes down to one likely line: https://github.com/webrtc-rs/webrtc/blob/23bbc1fb7eb9962e19e03cd4b8645e7aee8926c4/ice/src/agent/agent_transport.rs#L244

AgentConn::checklist should be cleared on close.

But there is still 1 leaking UdpSocket.

rainliu commented 2 weeks ago

Great catch, could you submit a PR to fix it? Thanks

tubzby commented 2 weeks ago

No problem, will submit a PR if there are no more leaks.

tubzby commented 2 weeks ago

By the way, I have been using Arc::strong_count to track this down. Is there a better way to do that? [image]
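
One option, sketched here under assumptions, is to watch through a std::sync::Weak instead of a cloned Arc. A Weak does not add to the strong count, and `upgrade()` returning `None` shows that the value has really been dropped. `Probe`, `probe_demo`, and the empty `CandidateBase` are hypothetical names for illustration, not part of the crate.

```rust
use std::sync::{Arc, Weak};

/// Empty stand-in for the real CandidateBase (assumption).
struct CandidateBase;

/// Hypothetical leak probe that observes without owning.
struct Probe(Weak<CandidateBase>);

impl Probe {
    /// Number of strong references, not counting the probe itself.
    fn strong(&self) -> usize {
        self.0.strong_count()
    }

    /// True once every strong reference has been dropped.
    fn is_released(&self) -> bool {
        self.0.upgrade().is_none()
    }
}

/// Returns (strong count seen by the probe, released before drop, released after drop).
fn probe_demo() -> (usize, bool, bool) {
    let candidate = Arc::new(CandidateBase);
    let probe = Probe(Arc::downgrade(&candidate));

    let count = probe.strong();
    let before = probe.is_released();

    drop(candidate);
    let after = probe.is_released();

    (count, before, after)
}
```

Here `probe_demo()` returns `(1, false, true)`. A probe like this can be moved into a monitoring task without changing the count being measured.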

tubzby commented 2 weeks ago

In addition to: https://github.com/webrtc-rs/webrtc/issues/608#issuecomment-2322699742

AgentConn.selected_pair should also be cleared.
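
The shape of the proposed fix can be sketched as follows. The types are simplified, hypothetical stand-ins (the real AgentConn and its field types may differ); the point is only that a long-lived connection object that keeps Arc'd candidate pairs in `checklist` and `selected_pair` pins the candidates until both are emptied.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Stand-in for a candidate that owns a UdpSocket (assumption).
struct Candidate;

/// Stand-in for a candidate pair holding a strong reference to a candidate.
struct CandidatePair {
    #[allow(dead_code)] // held only to keep the candidate alive
    local: Arc<Candidate>,
}

/// Simplified, hypothetical connection object.
#[derive(Default)]
struct Conn {
    checklist: Mutex<Vec<Arc<CandidatePair>>>,
    selected_pair: Mutex<Option<Arc<CandidatePair>>>,
}

impl Conn {
    /// Release every pair so the candidates (and their sockets) can drop.
    fn close(&self) {
        self.checklist.lock().unwrap().clear();
        self.selected_pair.lock().unwrap().take();
    }
}

/// Returns true if the candidate was freed while `Conn` itself is still alive.
fn candidate_freed_after_close(clear_on_close: bool) -> bool {
    let candidate = Arc::new(Candidate);
    let probe: Weak<Candidate> = Arc::downgrade(&candidate);

    let pair = Arc::new(CandidatePair { local: candidate });
    let conn = Arc::new(Conn::default());
    conn.checklist.lock().unwrap().push(Arc::clone(&pair));
    *conn.selected_pair.lock().unwrap() = Some(pair);

    // Something else still holds the connection, e.g. a read loop.
    let _long_lived = Arc::clone(&conn);

    if clear_on_close {
        conn.close();
    }
    probe.upgrade().is_none()
}
```

In this sketch the candidate is freed only when close() empties both fields, even though the connection object itself stays alive.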

tubzby commented 2 weeks ago

https://github.com/webrtc-rs/webrtc/blob/23bbc1fb7eb9962e19e03cd4b8645e7aee8926c4/ice/src/agent/mod.rs#L498

mdns.query never exits.

@rainliu There's a TODO comment on this line. What's your suggestion?
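
One common pattern for a retry loop like this is to make every wait also listen for a close signal. The sketch below is an analogy only: it uses a std thread and an mpsc channel rather than the async code in the crate, and `run_query` and `query_stops_on_close` are hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical query loop: retries until it is told to close.
/// Returns how many attempts were made.
fn run_query(close_rx: mpsc::Receiver<()>, retry_every: Duration) -> usize {
    let mut attempts = 0;
    loop {
        attempts += 1; // placeholder for sending one query

        // Wait for the retry interval, but wake up early on close.
        match close_rx.recv_timeout(retry_every) {
            Err(RecvTimeoutError::Timeout) => continue,
            // An explicit signal or a dropped sender both mean "stop".
            Ok(()) | Err(RecvTimeoutError::Disconnected) => return attempts,
        }
    }
}

/// Spawns the loop, closes it by dropping the sender, and joins the thread.
fn query_stops_on_close() -> bool {
    let (close_tx, close_rx) = mpsc::channel();
    let worker = thread::spawn(move || run_query(close_rx, Duration::from_millis(10)));

    thread::sleep(Duration::from_millis(35));
    drop(close_tx); // the owner going away is the close signal

    worker.join().unwrap() >= 1
}
```

Tying the loop's lifetime to the sender means the worker cannot outlive its owner, so nothing it has captured is kept alive indefinitely.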

tubzby commented 2 weeks ago

[image]

The DTLSConn task is stuck: https://github.com/webrtc-rs/webrtc/blob/23bbc1fb7eb9962e19e03cd4b8645e7aee8926c4/dtls/src/conn/mod.rs#L322

tubzby commented 3 days ago

I'm still confused about why we have to release these structures manually. Is there an Arc reference cycle?
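
For reference, this is what an Arc reference cycle looks like in isolation. Whether the crate actually contains one is not established in this thread; `Agent`, `Conn`, and `BackRef` below are hypothetical types used only to show the mechanism and how a Weak back-reference avoids it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex, Weak};

/// Hypothetical parent that owns its connection.
struct Agent {
    conn: Mutex<Option<Arc<Conn>>>,
    drops: Arc<AtomicUsize>,
}

impl Drop for Agent {
    fn drop(&mut self) {
        // Record that the destructor really ran.
        self.drops.fetch_add(1, Ordering::SeqCst);
    }
}

/// Back-reference from child to parent, either owning or non-owning.
#[allow(dead_code)] // the payloads exist only to hold the reference
enum BackRef {
    Strong(Arc<Agent>),
    Weak(Weak<Agent>),
}

/// Hypothetical child that points back at its parent.
struct Conn {
    #[allow(dead_code)]
    agent: BackRef,
}

/// Returns true if dropping the last external handle frees the Agent.
fn agent_dropped(use_weak: bool) -> bool {
    let drops = Arc::new(AtomicUsize::new(0));
    let agent = Arc::new(Agent {
        conn: Mutex::new(None),
        drops: Arc::clone(&drops),
    });

    let back = if use_weak {
        BackRef::Weak(Arc::downgrade(&agent))
    } else {
        BackRef::Strong(Arc::clone(&agent)) // Agent -> Conn -> Agent
    };
    *agent.conn.lock().unwrap() = Some(Arc::new(Conn { agent: back }));

    drop(agent);
    drops.load(Ordering::SeqCst) == 1
}
```

With the strong back-reference the Agent is never dropped unless something breaks the cycle by hand, which is the same effect as clearing fields in close(). With the Weak back-reference it is dropped automatically.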