Closed PaulOlteanu closed 9 months ago
Also, unrelated but the docs for mediasoup 0.13 failed to build. I can file that as a separate issue if you'd like
Yes, please. With STR.
CC @nazar-pc
Yes, please. With STR.
Not sure what STR is, but I filed an issue for that.
Thanks guys :+1:
"Steps To Reproduce" ☺️
The provided error line is saying that the string is not a valid UTF8.
error: Utf8Error { valid_up_to: 66119, error_len: Some(1) } }
.
Make sure you are providing a valid UTF8. At the end of the privided string there is an invalid character in the indicated possition (after the last bear):
🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸🧸�{"id":"657eb592-6fd1-4877-afb4-908c7c9072e8","chat":"qegrwewegw","media":[]}
Reopening as we should not panic in this case.
Yeah, sorry for not really explaining. I am sending user-generated text over a data channel. I have filtering to not allow for 40,000 bear emojis now but in general I would expect some kind of error rather than a panic.
Hm, Rust API should either work with Strings (which are guaranteed to be valid UTF-8 unless we incorrectly use unchecked unsafe methods) or raw bytes that can be anything. Do we mix them somewhere?
@PaulOlteanu are you sending strings or bytes via available Rust API?
My flow is I receive data on the server via a direct data consumer's on_message
, validate it, then forward it via direct data producer. This would've been created via WebRtcMessage::String(message)
on my end.
I believe I saw this panic happen once where I would not have forwarded it via data producer, which leads me to believe this happens when receiving data on the server from the client.
So if it was a string when you sent it and you didn't use any unsafe to create a string from raw bytes, it must be a memory corruption or mishandling somewhere on mediasoup's end. That is a useful information, thanks!
If you can manage to create a reproducible app (maybe by modifying one of the examples) that'd help a lot with debugging and fixing.
I tried modifying the rust echo example to reproduce this but I can't get it to happen just from the echo example client. I'm trying to similarly send 40,000 teddy bear emojis, but it seems to handle it fine. I tried with both mediasoup 0.12 which I was using at the time, and the master branch. I also tried with both a direct consumer on the server and a consumer over a webrtc transport
The client that caused this panic originally just sends the data with dataProducer.send(JSON.stringify(message))
, (which makes sense because the server is trying to interpret the message as a WebRtcMessage::String
), so I'm not really sure where the invalid bytes are coming from.
Is there something that can be done here?
Is it "expected" that mediasoup Rust crashes (due to unhandled error) if a DataChannel message of type string with invalid UTF-8 data is received? I mean, that would be terrible since it means that any remote user could crash it.
AFAIS that's what is happening here. In data_structures.rs
:
impl<'a> WebRtcMessage<'a> {
// +------------------------------------+-----------+
// | Value | SCTP PPID |
// +------------------------------------+-----------+
// | WebRTC String | 51 |
// | WebRTC Binary Partial (Deprecated) | 52 |
// | WebRTC Binary | 53 |
// | WebRTC String Partial (Deprecated) | 54 |
// | WebRTC String Empty | 56 |
// | WebRTC Binary Empty | 57 |
// +------------------------------------+-----------+
pub(crate) fn new(ppid: u32, payload: Cow<'a, [u8]>) -> Result<Self, u32> {
match ppid {
51 => Ok(WebRtcMessage::String(
String::from_utf8(payload.to_vec()).unwrap(),
)),
53 => Ok(WebRtcMessage::Binary(payload)),
56 => Ok(WebRtcMessage::EmptyString),
57 => Ok(WebRtcMessage::EmptyBinary),
ppid => Err(ppid),
}
}
If ppid
is 51 (WebRTC string, AKA UTF-8 string), the data is given to String::from_utf8()
which will just throw if no valid UTF-8 is given. We should be ready to catch that error.
Assuming worker and library work correctly, it will never panic. But something somewhere was mishandled. There is either logic error in worker or memory safety issue or something else somewhere that causes this. Handling error is not too bad of an idea, but the root cause still needs to be fixed.
@nazar-pc I don't fully understand some points here, so I'll write specific questions taking into account the following fact:
The remote endpoint perfectly could send invalid UTF8 data into a ppid
51 (AKA WebRTC UTF8 string) in a DataChannel message. So questions:
String::from_utf8(data)
throws, will the Rust process crash or generate an unhandled exception that would make the Rust app exit?String::from_utf8()
it will throw, how could that produce a panic? And how/where could be a memory safety issue in Rust?The remote endpoint perfectly could send invalid UTF8 data into a ppid 51 (AKA WebRTC UTF8 string) in a DataChannel message
Technically yes, but it shouldn't really be possible if application is written fully in Rust and reliable data channels are used. But it is possible with unreliable data channels indeed.
If that happens and String::from_utf8(data) throws, will the Rust process crash or generate an unhandled exception that would make the Rust app exit?
Panic will be generated. Panic does crash an app by default, but it is possible to catch it too (though I'm not sure if that'll work well across FFI boundary).
I promise I spent lot of time yesterday trying to apply a try/catch Rust thing and I couldn't figure out how to do it.
For panics https://doc.rust-lang.org/std/panic/fn.catch_unwind.html can be used if you can wrap the caller or https://doc.rust-lang.org/std/panic/fn.set_hook.html if you can't, but still want to catch it (I would strongly not recommend that).
Now that I thought about some more, I think it might be related to unreliable data channels, in which case perfectly valid UTF-8 can be corrupted and become invalid. @PaulOlteanu can you confirm this theory?
If that is the case, I think what we should do is to introduce a wrapper type that will be either a string if it is valid or just bytes when it is not possible to construct a valid UTF-8 string.
Something like this:
enum MaybeString {
String(String),
Bytes(Vec<u8>),
}
Then WebRtcMessage::String
would actually hold MaybeString
and user will have to decide what to do with those bytes if they are not a valid UTF-8 string. Alternatively we could just always use WebRtcMessage::String(Vec<u8>)
and force user to convert them into string if necessary and deal with this edge-case there.
You say that there is a memory safety issue in the code. Do you mean in Rust code, right?
I do not see safety issues in Rust code right now, so that would be in C++ land, but unreliable data channels seem like the most likely reason here and only application can decide what to do in that case (could be to ignore the message, recover valid bytes, etc.).
The remote endpoint perfectly could send invalid UTF8 data into a ppid 51 (AKA WebRTC UTF8 string) in a DataChannel message
Technically yes, but it shouldn't really be possible if application is written fully in Rust and reliable data channels are used.
What application? You mean the remote endpoint? It doesn't matter. A remote endpoint should not be able to crash a server no matter in which language it's written.
But it is possible with unreliable data channels indeed.
See OnUsrSctpReceiveSctpData()
in SctpAssociation.cpp
. Unreliable datachannel messages can arrive in wrong order or some of them could be lost. So if a big UTF8 string is sent into N SCTP chunks, and some of those chunks are lost, the resulting string could indeed be invalid UTF8.
If that happens and String::from_utf8(data) throws, will the Rust process crash or generate an unhandled exception that would make the Rust app exit?
Panic will be generated. Panic does crash an app by default, but it is possible to catch it too (though I'm not sure if that'll work well across FFI boundary).
What do you mean with "FFI boundary"? Perhaps I don't understand the scenario correctly but AFAIU this is:
I promise I spent lot of time yesterday trying to apply a try/catch Rust thing and I couldn't figure out how to do it.
For panics https://doc.rust-lang.org/std/panic/fn.catch_unwind.html can be used if you can wrap the caller or https://doc.rust-lang.org/std/panic/fn.set_hook.html if you can't, but still want to catch it (I would strongly not recommend that).
Ok, I'll try to add the former.
If that is the case, I think what we should do is to introduce a wrapper type that will be either a string if it is valid or just bytes when it is not possible to construct a valid UTF-8 string.
Nope. In WebRTC, DataChannel messages can be "WebRTC UTF8 String" (SCTP PPID 51) or "WebRTC Binary" (SCTP PPID 53). Well, there are more types:
+-------------------------------+----------+
| Value | SCTP |
| | PPID |
+-------------------------------+----------+
| WebRTC String | 51 |
| WebRTC Binary Partial | 52 |
| (Deprecated) | |
| WebRTC Binary | 53 |
| WebRTC String Partial | 54 |
| (Deprecated) | |
| WebRTC String Empty | 56 |
| WebRTC Binary Empty | 57 |
+-------------------------------+----------+
The receiver should not do magic if data is invalid. The sender should specify PPID 53 if the content of the message is not UTF8 string. In receiver side, it should ignore the received message if is invalid, which can only happen if it's WebRTC string with invalid UTF8 (due to whatever).
You say that there is a memory safety issue in the code. Do you mean in Rust code, right?
I do not see safety issues in Rust code right now, so that would be in C++ land
Any evidence that there could be a problem in C++ side? The log in the issue clearly points to Rust. I may be wrong but AFAIU this is just about the mediasoup Rust code not being ready to catch errors when decoding invalid UTF8.
What application? You mean the remote endpoint? It doesn't matter. A remote endpoint should not be able to crash a server no matter in which language it's written.
Yeah, you're right actually.
What do you mean with "FFI boundary"?
C++ code calls Rust function, if Rust function panics (like in this case) C++ code will have no idea what to do with that panic, it will not be able to handle it gracefully because it is a different language with different convention. Just like it is not easy to catch C++ exception in Rust.
Ok, I'll try to add the former.
You don't need to though if it is related to unreliable data channel.
Nope. In WebRTC, DataChannel messages can be "WebRTC UTF8 String" (SCTP PPID 51) or "WebRTC Binary" (SCTP PPID 53). Well, there are more types:
Well, then it is a bug in the worker that it sends strings, which are not strings, back to Rust?
In receiver side, it should ignore the received message if is invalid, which can only happen if it's WebRTC string with invalid UTF8 (due to whatever).
Are you sure ignoring is the right thing though? It might be very hard to debug if some messages just disappear. We should at least have a warning for this.
Any evidence that there could be a problem in C++ side? The log in the issue clearly points to Rust. I may be wrong but AFAIU this is just about the mediasoup Rust code not being ready to catch errors when decoding invalid UTF8.
Rust side gets incorrect bytes, they come from C++ and I see no bugs in very limited Rust code that would corrupt in-memory bytes, so it is the most likely that something in the worker is not slicing bytes properly or corrupts memory somewhere. Most of Rust code is safe where such things are not possible.
What do you mean with "FFI boundary"?
C++ code calls Rust function, if Rust function panics (like in this case) C++ code will have no idea what to do with that panic, it will not be able to handle it gracefully because it is a different language with different convention. Just like it is not easy to catch C++ exception in Rust.
Yes, clear. That's why I always assumed this must be done when the error happens.
Ok, I'll try to add the former.
You don't need to though if it is related to unreliable data channel.
Why not? We don't want the server app to crash.
Nope. In WebRTC, DataChannel messages can be "WebRTC UTF8 String" (SCTP PPID 51) or "WebRTC Binary" (SCTP PPID 53). Well, there are more types:
Well, then it is a bug in the worker that it sends strings, which are not strings, back to Rust?
Here the thing, in worker we don't validate the UTF8 content of ppid 51 received or sent messages. In fact, in Node we do not even emit a string but a Buffer
. The Node app can convert it to UTF8 string later:
https://mediasoup.org/documentation/v3/mediasoup/api/#dataConsumer-on-message
I'm not sure what's the right way to go is here. Should we validate UTF8 messages in worker side?
In receiver side, it should ignore the received message if is invalid, which can only happen if it's WebRTC string with invalid UTF8 (due to whatever).
Are you sure ignoring is the right thing though? It might be very hard to debug if some messages just disappear. We should at least have a warning for this.
Yes, I also said somewhere else that we should log a warning. But we should not expose invalid UTF8 strings to the user app (despite we do in Node). As said above this deserves a proper decision.
Any evidence that there could be a problem in C++ side? The log in the issue clearly points to Rust. I may be wrong but AFAIU this is just about the mediasoup Rust code not being ready to catch errors when decoding invalid UTF8.
Rust side gets incorrect bytes, they come from C++ and I see no bugs in very limited Rust code that would corrupt in-memory bytes, so it is the most likely that something in the worker is not slicing bytes properly or corrupts memory somewhere. Most of Rust code is safe where such things are not possible.
Yes, like it or not but there is no way to know if a chunk has been lost in the API exposed by usrsctp. See OnUsrSctpReceiveSctpData()
in SctpAssociation.cpp
. There is no "chunk sequence number" at all.
The Node app can convert it to UTF8 string later I'm not sure what's the right way to go is here. Should we validate UTF8 messages in worker side?
I think doing the same in Rust makes sense to me. Yes, it is a bit awkward and we can document that in case of reliable data channels it will always contain valid UTF-8 string, but move the responsibility to actually convert it and handle errors onto the user of the library.
If we can't guarantee it is a valid UTF-8, then I think the correct API is to expose raw bytes.
we can document that in case of reliable data channels it will always contain valid UTF-8 string
This is not true. A malicious user could send a DataChannel message with ppip 51 and invalid UTF-8 bytes.
If we can't guarantee it is a valid UTF-8, then I think the correct API is to expose raw bytes.
I see tons of usages of WebRtcMessage
enum, for receiving and for sending. Not sure how it should be done.
BTW why the underscore here in _payload
?
impl DirectDataConsumer {
/// Sends direct messages from the Rust process.
pub async fn send(&self, message: WebRtcMessage<'_>) -> Result<(), RequestError> {
let (ppid, _payload) = message.into_ppid_and_payload();
self.inner
.channel
.request(
self.inner.id,
DataConsumerSendRequest {
ppid,
payload: _payload.into_owned(),
},
)
.await
}
}
This is not true. A malicious user could send a DataChannel message with ppip 51 and invalid UTF-8 bytes.
Well, then we totally should have just bytes.
BTW why the underscore here in _payload?
It is a typo, shouldn't have _
prefix.
What should I do with this?
/// Container used for sending/receiving messages using `DirectTransport` data producers and data
/// consumers.
#[derive(Debug, Clone)]
pub enum WebRtcMessage<'a> {
/// String
String(String),
/// Binary
Binary(Cow<'a, [u8]>),
/// EmptyString
EmptyString,
/// EmptyBinary
EmptyBinary,
}
Do we still need this WebRtcMessage
enum? Should we turn it into some type / alias? Or just replace it with Binary(Cow<'a, [u8]>)
everywhere?
No, we still want to know that it was supposed to be a string, so we just change String(String)
to String(Cow<'a, [u8]>)
.
No, we still want to know that it was supposed to be a string, so we just change
String(String)
toString(Cow<'a, [u8]>)
.
I don't know what that means. A string without encoding? The user must encode it into UTF-8?
I mean when message is sent or received we want to know which PPID it corresponds to, that is what enum variants represent.
Does this look good? https://github.com/versatica/mediasoup/pull/1289
I like the solution of letting us parse the bytes :+1: thanks guys.
The client that was sending these messages does set maxRetransmits
to 1, which I believe results in an unreliable data channel, so I guess that would mean the data could be getting corrupted that way?
Out of curiosity, is it also possible the ppid could get invalidated when using an unreliable channel, and thus the "wrong" enum variant would be created? (not really a problem as that can be handled, just curious)
The client that was sending these messages does set maxRetransmits to 1, which I believe results in an unreliable data channel, so I guess that would mean the data could be getting corrupted that way?
Yes. Unreliable data channels should never be used to send big content since content is split into chunks, each chunk is carried in a separate SCTP packet and some SCTP packets may get lost or arrive in wrong order.
Out of curiosity, is it also possible the ppid could get invalidated when using an unreliable channel, and thus the "wrong" enum variant would be created? (not really a problem as that can be handled, just curious)
No, the ppid cannot be lost, it's part of the SCTP message/chunk. It arrives or doesn't arrive.
Bug Report
Issue description
I'm seeing occasional panics in the call to WebRtcMessage::new because of that unwrap. I never call that function myself, so I don't know where the invalid byte(s) would be coming from
Here's a log from the panic.