Fix panic on STUN binding failure while multiplexing

OxleyS commented 6 months ago

An unrelated bug I had in my application was causing STUN to fail, and a Class::Failure message to come back. Because like in the chat example, I'm still using a single multiplexed UDP socket, this message first went to the Rtc::accepts() of a separate client that had not negotiated, which caused it to panic with "Remote ICE Credentials".

This is the same underlying panic as in https://github.com/algesten/str0m/pull/461, but this time the logic was allowed to fall through to the stun_credentials() call because we were only doing the preceding candidate-pair test for Class::Success responses, not Class::Failure.

I wanted to add a test for this, but with the current field visibility on StunMessage, I couldn't create a failure message in a test. I wasn't sure if adding a helper to impl StunMessage was appropriate.

algesten commented 6 months ago

Which panic are you hitting?

OxleyS commented 6 months ago

This one, because it is a STUN response we hit the else branch here, which expects remote credentials.

algesten commented 6 months ago

I think the correct fix would be:

--- a/src/ice/agent.rs
+++ b/src/ice/agent.rs
@@ -799,6 +799,9 @@ impl IceAgent {
                 trace!("Message rejected, transaction ID does not belong to any of our candidate pairs");
                 return false;
             }
+        } else {
+            trace!("Message reject, it is not a succesful binding response");
+            return false;
         }

         let (_, password) = self.stun_credentials(!message.is_response());

OxleyS commented 6 months ago

That patch would end up rejecting everything that's not a successful binding response, since all other message types, including things like binding requests, would fall through to that else case as well.

I believe the effect you intended was to just reject all non-success binding responses? That would also fix the panic, although arguably IceAgent should be accepting all traffic intended for it, even if it doesn't react to failed responses in any special way.

(Aside: Class::Indication and Class::Unknown wouldn't trigger panics with either of our solutions).

algesten commented 6 months ago

Taking a step back.

accept should, as you say, accept all traffic targeted at this Rtc instance.

OxleyS commented 6 months ago

I think we have our bases covered here then - STUN requests get handled as before, and now both STUN success and failure gets accepted if it corresponds to a STUN request from us (or rejected before the credentials check if not).

algesten commented 6 months ago

I still don't like this change.

What kind of responses are we talking about that we suddenly accept? What about middleboxes? We should be very specific with what we accept. Enumerate the cases.

OxleyS commented 6 months ago

I believe this enumeration is exhaustive:

Method::Binding, Class::Request: Unchanged, rejects on a local/remote ufrag mismatch and then checks integrity. Cannot panic because it does not require remote credentials to integrity-check.
Method::Binding, Class::Success: Unchanged, rejects if the transaction ID matches no binding request from us, then checks integrity. If the transaction ID matches, the integrity-check would panic without remote credentials, but we do not generate binding requests without remote credentials, so panic is effectively impossible.
Any method, Class::Failure: Changed, previous behavior would only check integrity, panicking if remote credentials have not been set. New behavior matches the Class::Success case.
Method::Unknown, Class::Success: Changed, both previous and new behavior match the Class::Failure case.
Method::Unknown, Class::Request: Unchanged, only checks integrity. Cannot panic because it does not require remote credentials to integrity-check.
Any method, Class::Indication or Class::Unknown: Unchanged, only checks integrity. Cannot panic because it does not require remote credentials to integrity-check.

This PR was aimed at fixing the (Method::Binding, Class::Failure) case specifically, but it seems both (Method::Unknown, Class::Success) and (Method::Unknown, Class::Failure) had the same panic bug. This PR ~unintentionally~ incidentally prevents these two from panicking as well, by subjecting them to the same transaction-ID-match test as a binding success would be. It's hard to have a good answer for what to do with Method::Unknown, but this seems like a reasonable-enough way to handle them.

Considering the confusion that seems to exist around the exact behavior of accepts() (and I made several mistakes in the process of putting the above enumeration together!), perhaps we should consider rewriting this method to use an exhaustive match, even if it means we can't use is_response() and similar utilities.

algesten commented 6 months ago

Considering the confusion that seems to exist around the exact behavior of accepts() (and I made several mistakes in the process of putting the above enumeration together!), perhaps we should consider rewriting this method to use an exhaustive match, even if it means we can't use is_response() and similar utilities.

Agree. I also think this function should default to false at the end and explicitly enumerate the cases where we do accept the message. So it's the opposite of today really.

OxleyS commented 6 months ago

Okay, I'll take a stab at doing that tomorrow.

algesten / str0m

Fix panic on STUN binding failure while multiplexing #488