algesten / str0m

A Sans I/O WebRTC implementation in Rust.
MIT License
332 stars 50 forks source link

`str0m` rejects newly formed candidate pairs for `srflx` candidate based on existing pair with `host` candidate #483

Open thomaseizinger opened 8 months ago

thomaseizinger commented 8 months ago

The situation we are testing is:

  1. Exchange candidates and have str0m nominate a pair
  2. Everything works as expected
  3. Add a new network interface and its respective candidates (host + srflx)

Expectation

New pairs are formed for all candidates and str0m nominates a new pair if it is better

Reality

str0m forms new pairs for the host candidate but rejects all pairs for the srflx one based on redundancy

Logs

2024-03-19T03:18:51.957408Z  INFO str0m::ice_::agent: Add local candidate: Candidate(host=192.168.193.95:54644/udp prio=2130705663)
2024-03-19T03:18:51.957425Z TRACE str0m::ice_::agent: Form pair local: Candidate(host=192.168.193.95:54644/udp prio=2130705663) remote: Candidate(host=172.17.0.2:58749/udp prio=2130706175)
2024-03-19T03:18:51.957437Z DEBUG str0m::ice_::agent: Add new pair CandidatePair(10-1 prio=9151311144248409598 state=Waiting attempts=0 unanswered=0 remote=0 last=None nom=None)
2024-03-19T03:18:51.987367Z  INFO str0m::ice_::agent: Add local candidate: Candidate(srflx=31.133.129.248:54644/udp base=192.168.193.95:54644 prio=1686109951)
2024-03-19T03:18:51.987386Z TRACE str0m::ice_::agent: Form pair local: Candidate(srflx=31.133.129.248:54644/udp base=192.168.193.95:54644 prio=1686109951) remote: Candidate(host=172.17.0.2:58749/udp prio=2130706175)
2024-03-19T03:18:51.987402Z DEBUG str0m::ice_::agent: Reject redundant pair, current: CandidatePair(10-1 prio=9151311144248409598 state=InProgress attempts=1 unanswered=1 remote=0 last=None nom=None) rejected: CandidatePair(11-1 prio=7241787101266574846 state=Waiting attempts=0 unanswered=0 remote=0 last=None nom=None)

Is this expected? Is a srflx candidate with the same base as a host candidate always redundant? Unless the nodes are in the same network, the remote can never reply to the host candidate so this particular candidate pair can't succeed and the one they could reply to was eliminated for redundancy.

algesten commented 8 months ago

It's not expected.

        if let Some((idx, other)) =
            self.local_candidates.iter_mut().enumerate().find(|(_, v)| {
                v.addr() == c.addr() && v.base() == c.base() && v.proto() == c.proto()
            })
        {
            if c.prio() < other.prio() {
                // The new candidate is not better than what we already got.
                debug!(
                    "Reject redundant candidate, current: {:?} rejected: {:?}",
                    other, c
                );
                return false;
            }

If addr, base and proto are the same, we might replace or reject, however for a discovered srflx I expect addr to differ.

Candidate(host=192.168.193.95:54644/udp prio=2130705663)
Candidate(srflx=31.133.129.248:54644/udp base=192.168.193.95:54644 prio=1686109951)

I.e. for host, addr should be 192.168.193.95:54644 and for the srflx it should be 31.133.129.248:54644 which means the above code shouldn't kick in. Seems like a bug.

thomaseizinger commented 8 months ago

which means the above code shouldn't kick in. Seems like a bug.

The above code is for candidates. The candidate is accepted correctly, it is the candidate pair that is rejected as redundant and thus never tested.

algesten commented 8 months ago

Ah, got ya.

This behavior changed with 51acd55f46a48a8a416d1343c96fdad2bfd0a02e

algesten commented 8 months ago

https://datatracker.ietf.org/doc/html/rfc8445#section-5.1.3

Frequently, a server-reflexive candidate and a host candidate will be
   redundant when the agent is not behind a NAT.  A candidate is
   redundant if and only if its transport address and base equal those
   of another candidate.  The agent SHOULD eliminate the redundant
   candidate with the lower priority.

Is this what we see?

thomaseizinger commented 8 months ago

Ah, got ya.

This behavior changed with 51acd55

Yeah I had a hunch that we may have broken that in that commit.

algesten commented 8 months ago

Question is whether we broke it in a more spec compliant way? :)

thomaseizinger commented 8 months ago

https://datatracker.ietf.org/doc/html/rfc8445#section-5.1.3

Frequently, a server-reflexive candidate and a host candidate will be
   redundant when the agent is not behind a NAT.  A candidate is
   redundant if and only if its transport address and base equal those
   of another candidate.  The agent SHOULD eliminate the redundant
   candidate with the lower priority.

Is this what we see?

This is still about candidates though, right? If I read the logs correctly, the candidate isn't redundant but the generated pair is. I'll take a look at the code later.

We may have broken the redundancy check of pairs with the change of what base means for srflx candidates.