Closed yuta0801 closed 3 years ago
Hi, thanks for the bug report. I'm pretty confident I tested it at some point against youtube but maybe not :(.
Based on the output, it appears that YouTube is sending a non-zero 4-byte value as the second field of packet 1, which most likely means their implementation of the fp9 digest handshake isn't what my code expects.
Unfortunately, I have been out of the live streaming space for well over a year and I'm not currently set up to test this (and I'm a bit busy with real life stuff), so I'm not sure when I will be able to get to it.
Hi, I did some investigation and found that if I remove that line, the client handshake with YouTube completes.
@KallDrexx, why is reader.read_u32 used twice?
Moreover, about "non-zero 4 byte" here are some screenshots:
Interesting find!
I just looked through the RTMP specification I used while developing this library, and section 5.2
From what I can tell, the version the next line points to is not the protocol version; it's the "Adobe version" that many servers expect in that set of 4 bytes. The RTMP 1.0 specification says these 4 bytes should be zero, but that behavior appears to have changed since the spec was written, and clients and servers now expect the field to contain the Adobe Flash version being used (IIRC I determined this by Wiresharking other RTMP clients and servers). You can see where I encode this here.
So the format of the P1 packet (which is S1 or C1 in the spec) is a time field, then a zero field (where we put that Adobe version), then the "random" data field. Based on comments elsewhere in the code, the line you highlighted is meant to skip the time field (because everyone returns a time of zero and it didn't seem useful).
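As a rough illustration of that layout, here is a minimal sketch that splits a packet 1 into its three fields. The 1536-byte size and big-endian integers follow the RTMP 1.0 specification; the names `Packet1Fields` and `split_packet_1` are hypothetical and are not part of this library.

```rust
/// Size of a C1/S1 handshake packet in bytes, per the RTMP 1.0 specification.
const PACKET_1_SIZE: usize = 1536;

/// Hypothetical view over the three fields of packet 1 (not a library type).
#[derive(Debug, PartialEq)]
struct Packet1Fields<'a> {
    /// First 4 bytes: the time field (usually zero in practice).
    time: u32,
    /// Next 4 bytes: the "zero" field, which some peers fill with a version.
    version: u32,
    /// Remaining 1528 bytes: the "random" data.
    random_data: &'a [u8],
}

/// Splits a raw packet 1 into its fields, or returns None if the size is wrong.
fn split_packet_1(bytes: &[u8]) -> Option<Packet1Fields<'_>> {
    if bytes.len() != PACKET_1_SIZE {
        return None;
    }

    // RTMP integers are sent in network (big-endian) byte order.
    let time = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let version = u32::from_be_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);

    Some(Packet1Fields {
        time,
        version,
        random_data: &bytes[8..],
    })
}
```

Reading only one `u32` instead of two would put the time field's value into `version`, which is the effect of removing the line discussed above.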
I would expect the handshake to fail with that line removed, because otherwise the random data won't validate. If I remember correctly, the random data they send in P1 must be sent back in my P2 response, and it must match 100% for the peer to approve the handshake. So if you comment out that line I would expect the random data to be offset and the handshake to fail for any non-fp9 handshake, except I now see that the reader isn't used for pulling the random data out.
OH ok so I see (sorry for stream of consciousness :D) It appears commenting out that line ends up working by accident.
So the reason commenting the line works is because without that first line it's reading the version out of the "time" field, which is always zero in the real world. This version is traditionally used by other servers and clients to know if it should use the fp9 handshake or not (handshake that does a key exchange).
When a peer tries to connect if it sends a zero version in the zero field (NOT protocol version sent in packet 0, confusing I know) then it should be assumed that that peer does not do a key exchange and therefore the peer is expecting you to return the data back as-is with no modifications. If the peer sends an adobe version in the zero field of packet 1 then that peer is intending to do a key exchange type of handshake.
The way I wrote the code doesn't really use that field for determining if it should use a key exchange or not. Instead I try to derive the digest keys / hmac values from the "random data" payload. If we can compute digest keys that pass then we assume that's how it was meant to be and proceed with the fp9 handshake. Otherwise we default back to the original 1.0 spec handshake (return random data 'as-is'), but we only do that if the version we read is zero like in the original RTMP spec (otherwise we assume they meant an fp9 handshake but the keys couldn't be computed).
So commenting out the line of code you mentioned "works" because when we fail to compute the fp9 handshake keys we have an "Adobe version" of zero (because we read the time field instead) and therefore we assume it's an original RTMP 1.0 handshake.
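The decision described above can be sketched roughly as follows. This is a simplified model and not the library's actual code: `HandshakeKind` and `classify_packet_1` are hypothetical names, and the digest check is reduced to a boolean input.

```rust
/// Hypothetical outcome of inspecting a received packet 1.
#[derive(Debug, PartialEq)]
enum HandshakeKind {
    /// A valid digest was found in the random data: do the fp9 key exchange.
    Fp9Digest,
    /// No digest and a zero version: echo the packet back as-is (RTMP 1.0).
    OriginalEcho,
    /// No digest but a non-zero version: treated as a failed fp9 handshake.
    Unknown,
}

/// Simplified model of the logic: the version field is only consulted
/// after the digest derivation has failed.
fn classify_packet_1(digest_found: bool, version: u32) -> HandshakeKind {
    if digest_found {
        HandshakeKind::Fp9Digest
    } else if version == 0 {
        HandshakeKind::OriginalEcho
    } else {
        HandshakeKind::Unknown
    }
}
```

With the extra read removed, `version` is effectively the time field, so a failed digest check always lands in the `OriginalEcho` branch.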
That's a long way of saying that the fp9 handshake code does not appear to be deriving YouTube's digest keys properly, and that seems to be the "root" issue.
The question is, when you comment out the line you pointed out, does it actually work and allow you to push video to Youtube? I would expect Youtube to reject our handshake because they tried a fp9 handshake and we instead responded with a 1.0 handshake. Maybe they are just backwards compatible and can handle either response?
Thank you for your answer.
After long searching, I decided to hardcode youtube response for now in my project 😔
But it is worth coming back to it and looking into some other open source RTMP libs.
Yes for sure. Thanks to your finding I will try to carve out some time and see if I can get hints from some other open source RTMP libraries to see what I'm doing wrong.
I just ran into this as well, and adding the special case for youtube worked for me. Using request_connection("live2") instead of app made the stream succeed 💯
Since this code path is already a fallback, wdyt of adding this special case? I can make a quick PR
Also I ran wireshark on an OBS -> Youtube handshake and can confirm C1 == S2 and S1 == C2.
I just ran into this as well, and adding the special case for youtube worked for me. Using request_connection("live2") instead of app made the stream succeed 💯
Sorry my brain is a bit fried (newborn) but I don't totally understand what you mean, or rather how the two parts of this sentence correlate. Do you mean you got it working by detecting live2 as the app name, and if that's found then you know it's youtube and do the special case?
Also I ran wireshark on an OBS -> Youtube handshake and can confirm C1 == S2 and S1 == C2.
Ok so that tells me that Youtube isn't doing the FP9 handshake, they are doing the original RTMP handshake. So the reason this is failing for Youtube is because while the original RTMP spec says the 2nd set of 4 bytes should all be zeros, Youtube is sending some other value for that.
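For anyone repeating that Wireshark check, a minimal sketch of the comparison might look like this. The function name `is_plain_echo_handshake` is hypothetical, and it assumes the four packets have already been extracted from the capture as raw byte slices.

```rust
/// Returns true when each side's packet 2 is an exact copy of the other
/// side's packet 1, which is what the capture above showed for YouTube.
/// A peer doing the fp9 digest handshake would not send back a plain copy.
fn is_plain_echo_handshake(c1: &[u8], s1: &[u8], c2: &[u8], s2: &[u8]) -> bool {
    c1 == s2 && s1 == c2
}
```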
If you have wireshark already set up for this would you mind telling me what the 2nd set of 4 bytes are that Youtube sends? I suspect the true fix for this is going to be one of two things:
The first option is to replace this code with something like:
let received_digest = match get_digest_for_received_packet(&received_packet_1, &p1_key) {
    Ok(digest) => digest,
    Err(HandshakeError{kind: HandshakeErrorKind::UnknownPacket1Format}) => {
        // Since no digest was found chances are that
        // this handshake is not a fp9 handshake, but instead is the handshake from the
        // original RTMP specification. If that's the case then this isn't an error,
        // we just need to send back an exact copy of their p1 and we are good.
        self.current_stage = Stage::WaitingForPacket2;
        return Ok(HandshakeProcessResult::InProgress {response_bytes: received_packet_1.to_vec()});
    },
    Err(x) => return Err(x),
};
This essentially gets rid of the version check and makes the assumption that if we fail the digest/fp9 check but this isn't a valid RTMP handshake then that will get caught in the S2/C2 validations. In other words, we will send back the exact copy of the inbound data as a response, the other side was expecting an fp9 response instead and will kill the connection. However, in non-rtmp scenarios where we've accidentally had the right set of packets to get this far we might hang forever (until a timeout) because we won't ever have detected that this really wasn't an RTMP connection. I think the risk of this is low though.
The alternative is to make the if statement if version == 0 || version == XXXX where XXXX is the version number that Youtube is sending back. That's a bit safer but it feels a bit whackamole to me.
Since you are willing to do a PR (and thus theoretically setup for testing it), if you don't mind doing one of those options, testing it, and raising a PR I'll be glad to approve either way you go.
Sorry my brain is a bit fried (newborn) but I don't totally understand what you mean, or rather how the two parts of this sentence correlate. Do you mean you got it working by detecting live2 as the app name, and if that's found then you know it's youtube and do the special case?
I meant adding the special case for version == 0x04_00_00_01 made the handshake complete successfully, and using request_connection("live2") instead of request_connection("app") made the full stream progress successfully.
If you have wireshark already set up for this would you mind telling me what the 2nd set of 4 bytes are that Youtube sends?
For completeness here's the whole handshake:
The "version" bytes sent by YouTube are indeed 0x04_00_00_01.
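For reference, the special-case option could be sketched like this, using the value reported above. The constant and function names are hypothetical, and the list only contains values mentioned in this thread.

```rust
/// "Version" values in the zero field of packet 1 for which a failed digest
/// check falls back to the original echo handshake. Zero is the value from
/// the RTMP 1.0 specification; 0x04_00_00_01 is what YouTube was seen sending.
const ECHO_HANDSHAKE_VERSIONS: [u32; 2] = [0, 0x04_00_00_01];

/// Hypothetical helper standing in for the `version == 0 || version == XXXX`
/// check discussed in this thread.
fn allows_echo_fallback(version: u32) -> bool {
    ECHO_HANDSHAKE_VERSIONS.contains(&version)
}
```

Every new peer that sends a different non-zero value would need another entry, which is the whack-a-mole concern.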
I agree with your analysis here, the special case is a bit whack-a-mole and the potential downside of a timeout seems small. I'll try removing the version check and make sure twitch/youtube still work!
I'm trying to push an RTMP stream to YouTube Live using mio_rtmp_server, but an UnknownPacket1Format error occurs. How can I deliver to YouTube Live?