Closed lattice0 closed 3 years ago
I changed
let control = control.ok_or_else(|| "no control url".to_string())?;
to
let control = control.unwrap_or(request_url.clone());
and it at least does the SETUP, but it looks like it sends the entire uri: uri="rtsp://192.168.1.198:10554/tcp/track0"
and the camera does not like that and simply ignores.
my client (works):
SETUP rtsp://192.168.1.198:10554/tcp/av0_0/track0 RTSP/1.0
Authorization: Digest username="admin", algorithm="MD5", realm="RTSPD", nonce="8s63k7h3mn6aa8o1u7mix4599qku0ra6", uri="/tcp/av0_0", response="4f6333f7580f407195eb2436a5905704"
CSeq: 4
Transport: RTP/AVP/TCP;interleaved=0
User-Agent: RRTSP Client
yours (no response):
SETUP rtsp://192.168.1.198:10554/tcp/track0 RTSP/1.0
Authorization: Digest username="admin", realm="RTSPD", nonce="7b3q3ssplskhy59wu77345tewnrx5see", uri="rtsp://192.168.1.198:10554/tcp/track0", response="c2438d1f4c1e56003003fa0878d1d094"
CSeq: 3
Transport: RTP/AVP/TCP;unicast;interleaved=0-1
User-Agent: moonfire-rtsp test
x-Dynamic-Rate: 1
or maybe the problem is the missing av0_0
on yours
is the control really needed?
According to RFC 2326? I think so. From sections C.1 and C.2, it looks like there should be a session-level control
if aggregate control is possible, eg a single PLAY
command that starts all the streams (that have been set up). You quoted a presentation with a video and audio stream that I imagine are supposed to be started with a single PLAY
. So I think the server is not compliant with RFC 2326.
But this isn't the first broken camera I've seen, and I prefer actually working with real cameras over strict RFC compliance, so I'd like to fix this.
How did your client get the av0_0
component? I don't see that in the DESCRIBE
response; what was the original DESCRIBE
URL?
This is a well sold camera on aliexpress (VStarcam) that worked with all the clients I tested, specifically with ZLMediakit in C++.
rtsp://192.168.1.198:10554/tcp/av0_0
is the original URL gathered from ONVIF calls.
ok, turns out that in https://github.com/scottlamb/retina/blob/b1db9a9e8b94ff7077050cc26be0a50cbf1bd58e/src/client/parse.rs#L289, we have:
} else if a.key == "control" {
control = a
.value
.as_deref()
.map(|c| join_control(base_url, c))
.transpose()?;
alt_control = a
.value
.as_deref()
.map(|c| join_control(alt_base_url, c))
.transpose()?;
}
base_url
is rtsp://192.168.1.198:10554/tcp/av0_0
while alt_base_url
is rtsp://192.168.1.198:10554/tcp/av0_0/
. When join_control(______, c)
is called on these URLs, in the first one, base_url
, we get rtsp://192.168.1.198:10554/tcp/track0
, which is wrong. In the second one, we get rtsp://192.168.1.198:10554/tcp/av0_0/track0
, which is right and makes the camera server respond to the SETUP.
What you think should be done? Is calling join
on a URL without /
in the end simply replacing the last url 'node'?
I changed to
control = a
.value
.as_deref()
.map(|c| join_control(alt_base_url, c))
.transpose()?;
that is, I ignored base_url
and setted no alt_control
, and it worked. At least up to the PLAY:
thread 'retina_client_cam1' panicked at 'called
Result::unwrap()on an
Errvalue: RtspResponseError { conn_ctx: ConnectionContext { local_addr: 172.17.0.2:40216, peer_addr: 192.168.1.198:10554, established_wall: WallTime(Timespec { sec: 1627326108, nsec: 249319267 }), established: Instant { tv_sec: 331582, tv_nsec: 624545829 } }, msg_ctx: RtspMessageContext { pos: 782, received_wall: WallTime(Timespec { sec: 1627326109, nsec: 547855062 }), received: Instant { tv_sec: 331583, tv_nsec: 923080738 } }, method: Play, cseq: 4, status: Ok, description: "PLAY response has no RTP-Info header" }', /home/dev/orwell/liborwell/src/rtsp/retina_client.rs:141:40
but I'll investigate this after I finish fixing the url control error.
So my interpretation of joining the urls is to do it as in the URL/HTTP specifications. I'm on my phone so I can't quote them now. But other clients—at least ffmpeg—just append a / to the base and concatenate no matter what. They don't even look for a ?
that starts a query string first. I think this is wrong but it seems consistent with what some cameras expect. My alt_control
was an attempt to meet their expectations when interpreting the RTP-Info
of the PLAY
response. I'm not sure what we should do when deciding what URL to send for SETUP
. Maybe we should just do the more popular thing even if seems wrong, or have a quirks/options/policy knob to let the caller decide (maybe based on a config file), or try one and then the other on error. /shruggie I'm open to ideas...
Do you mean that some cameras actually expect rtsp://192.168.1.198:10554/tcp/av0_0
to be rtsp://192.168.1.198:10554/tcp/track0
when a control attribute is chosen?
I think ffmpeg approach is ok, it's the best chance of success, because lots of cameras might actually get tested on it.
I don't remember what the RFC says about joining the URL, and I don't even remember reading about the control attribute. I just made it work on my client by reading other clients.
I'm at a computer now so I can actually look up the relevant bits.
In this case, I mean that I wrote it based on what RFC 2326 says, and I found that some cameras (and ffmpeg) differed from that. Some cameras' URLs are structured such that both interpretations produce the same result. Some will accept either method (Dahua example below). I'm not sure if there are actually cameras where ffmpeg's approach doesn't work properly.
Going back to the standards:
url::Url::join
implements this correctly, so that's what I'm using.With a base URL of rtsp://192.168.1.198:10554/tcp/av0_0
, the RFCs say the stream URL should be rtsp://192.168.1.198:10554/tcp/track0
. But you tried that and it didn't work.
ffmpeg does this (in libavformat/rtsp.c
):
/* XXX: may need to add full url resolution */
av_url_split(proto, sizeof(proto), NULL, 0, NULL, 0,
NULL, NULL, 0, p);
if (proto[0] == '\0') {
/* relative control URL */
if (rtsp_st->control_url[strlen(rtsp_st->control_url)-1]!='/')
av_strlcat(rtsp_st->control_url, "/",
sizeof(rtsp_st->control_url));
av_strlcat(rtsp_st->control_url, p,
sizeof(rtsp_st->control_url));
} else
av_strlcpy(rtsp_st->control_url, p,
sizeof(rtsp_st->control_url));
It's wrong but it works.
Dahua cameras are an interesting example. They set a base URL of eg rtsp://192.168.5.111:554/cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif/
with a control URL of eg trackID=1
, so:
rtsp://192.168.5.111:554/cam/trackID=1
, which works.rtsp://192.168.5.111:554/cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif/trackID=1
, which doesn't match the RFC, but also works.I imagine they went out of their way to accept either, although who knows. Their code might just look for the URL ending with trackID=1
, so both work by accident.
One thing writing this library has taught me: many RTSP implementations are garbage. But we should be able to make most cameras work anyway.
I think there's no harm in trying both ways and picking the first that works. If we choose the same as ffmpeg then it might break for other cameras, I guess. However I don't know if this would be a good idea for all types of quirks, as it would generate a bunch of tries.
OR, we should see what live555 does. I guess it's the most used library? Maybe all these cameras were tested in it?
Good point about live555. Looks like this is the code in question:
// Figure out what the URL describing "subsession" will look like.
// The URL is returned in three parts: prefix; separator; suffix
//##### NOTE: This code doesn't really do the right thing if "sessionURL()"
// doesn't end with a "/", and "subsession.controlPath()" is relative.
// The right thing would have been to truncate "sessionURL()" back to the
// rightmost "/", and then add "subsession.controlPath()".
// In practice, though, each "DESCRIBE" response typically contains
// a "Content-Base:" header that consists of "sessionURL()" followed by
// a "/", in which case this code ends up giving the correct result.
// However, we should really fix this code to do the right thing, and
// also check for and use the "Content-Base:" header appropriately. #####
prefix = sessionURL(subsession.parentSession());
if (prefix == NULL) prefix = "";
suffix = subsession.controlPath();
if (suffix == NULL) suffix = "";
if (isAbsoluteURL(suffix)) {
prefix = separator = "";
} else {
unsigned prefixLen = strlen(prefix);
separator = (prefixLen == 0 || prefix[prefixLen-1] == '/' || suffix[0] == '/') ? "" : "/";
}
It's effectively the same as ffmpeg, although the comment confirms my understanding that it's incorrect.
Maybe all the major client implementations have been wrong for so long that cameras expect that instead of the behavior specified in the RFC.
It feels wrong IMHO to continue perpetuating this so maybe we should try the correct URL first and then the other one if that fails.
Ok so I guess I'll make this work.
But first we have to device what to do on
let control = control.ok_or_else(|| "no control url".to_string())?;
Do you agree with
let control = control.unwrap_or(request_url.clone());
?
Then, for
let url = stream
.control
.as_ref()
.unwrap_or(&self.state.presentation.control)
.clone();
let mut req =
rtsp_types::Request::builder(rtsp_types::Method::Setup, rtsp_types::Version::V1_0)
.request_uri(url)
.header(
rtsp_types::headers::TRANSPORT,
format!(
"RTP/AVP/TCP;unicast;interleaved={}-{}",
proposed_channel_id,
proposed_channel_id + 1
),
)
.header(crate::X_DYNAMIC_RATE.clone(), "1".to_owned());
if let Some(ref s) = self.state.session_id {
req = req.header(rtsp_types::headers::SESSION, s.clone());
}
let (msg_ctx, cseq, response) = self.conn.send(&mut req.build(Bytes::new())).await?;
what about this:
let url1 = stream
.control
.as_ref()
.unwrap_or(&self.state.presentation.control)
.clone();
let url2 = stream
.alt_control
.as_ref()
.unwrap_or(&self.state.presentation.control)
.clone();
let req =
rtsp_types::Request::builder(rtsp_types::Method::Setup, rtsp_types::Version::V1_0)
.header(
rtsp_types::headers::TRANSPORT,
format!(
"RTP/AVP/TCP;unicast;interleaved={}-{}",
proposed_channel_id,
proposed_channel_id + 1
),
)
.header(crate::X_DYNAMIC_RATE.clone(), "1".to_owned());
let mut req1 = req.clone().request_uri(url1);
let mut req2 = req.clone().request_uri(url2);
if let Some(ref s) = self.state.session_id {
req1 = req1.header(rtsp_types::headers::SESSION, s.clone());
req2 = req2.header(rtsp_types::headers::SESSION, s.clone());
}
let (msg_ctx, cseq, response) = {
first_of(
self.conn.send(&mut req1.build(Bytes::new())),
self.conn.send(&mut req2.build(Bytes::new())),
)
};
where first_of
is that tokio thing that resolves the first successful stream?
ps:
alt_control: Option<Url>,
to
pub alt_control: Option<Url>,
Do you agree with
let control = control.unwrap_or(request_url.clone());
?
Sounds good.
What about [...] where
first_of
is that tokio thing that resolves the first successful stream?
tokio::select!
?
Maybe we should do it sequentially with the correct URL first and the commonly used one second? I can imagine a really confusing situation where there's some user-visible difference based on which wins the race. Also, I've learned recently that tokio::select!
can be surprisingly tricky to get right in a couple ways. First, it's a borrow checker struggle when both futures need the same underlying value (self.conn
). Second, if you potentially drop one future when the other completes, you get a strange behavior, as mentioned in this blog post.
The problem is that my camera never responds when the url is not ended in /
. So we'd have to wait some seconds just to be sure that the device did not respond. And even then it could be just a network delay. So checking for the first URL to just then check the second is not very nice.
Indeed tokio::select!
would be tricky and now thinking, it could be that the url not ending with /
times out before the url ending in /
returns, thus selecting the one not ending with /
.
What do you suggest?
The problem is that my camera never responds when the url is not ended in
/
.
Oh, that's horrible.
Argh, maybe we should just give in then and do what the other implementations do; it might be an easier road. If we encounter a camera that expects otherwise, then we can introduce the option to allow the caller to select how we construct the control URL.
Ok so what about this:
} else if a.key == "control" {
control = a
.value
.as_deref()
.map(|c| join_control(alt_base_url, c))
.transpose()?;
alt_control = a
.value
.as_deref()
.map(|c| join_control(alt_base_url, c))
.transpose()?;
}
and the
let control = control.unwrap_or(request_url.clone());
modification?
it's working here up to that h264 parsing error which I'll investigate soon
Effectively, yeah, always using joining with what's now the alt_base_url
sounds good. We don't need to keep both control
and alt_control
if they're the same, and likewise don't need to keep both base_url
and alt_base_url
.
And yeah, the unwrap_or
sounds good also.
@scottlamb I left alt_control
because you said it's used on other things that I didn't look yet.
Looks like it's used here only:
if presentation.streams.len() == 1 {
// The server is allowed to not specify a stream control URL for
// single-stream presentations. Additionally, some buggy
// cameras (eg the GW Security GW4089IP) use an incorrect URL.
// When there is a single stream in the presentation, there's no
// ambiguity. Be "forgiving", just as RFC 2326 section 14.3 asks
// servers to be forgiving of clients with single-stream
// containers.
// https://datatracker.ietf.org/doc/html/rfc2326#section-14.3
stream = Some(&mut presentation.streams[0]);
} else {
stream = presentation
.streams
.iter_mut()
.find(|s| matches!(&s.control, Some(u) if u == &url));
// If we didn't find a stream, try again with alt_control. Don't do
// this on the first pass because we should check all of the
// proper control URLs first.
if stream.is_none() {
stream = presentation
.streams
.iter_mut()
.find(|s| matches!(&s.alt_control, Some(u) if u == &url));
}
}
should I just remove if stream.is_none() {...}
?
Yeah, the is_none
branch accomplishes nothing if control
and alt_control
are always the same, so removing it SGTM.
I stared at this a bunch and so far have found at least four different ways of joining a relative control URL with its base:
SETUP
requests for this URL, so it absolutely won't work for them. GW Security cameras also send a RTP-Info
that's inconsistent with this interpretation, although that's less severe. (We can make do without RTP-Info
at all.)strcat(base_url, base_url.ends_with("/") ? "" : "/", control_url)
. This also isn't consistent with GW Security's RTP-Info
but seems to work otherwise.concat(base_path.rstrip("/"), "/", control.lstrip("/")
, so any query string is retained as-is. Also not consistent with GW Security's RTP-Info
but seems to work otherwise.RTP-Info url
). treat the base URL as if it has a trailing slash, even if it doesn't. Then join, discarding query string. (Unsure what they would do if the control URL had a leading slash; I'm guessing their algorithm from a single example.)Your PR is consistent with option 4. I think this works for all cameras I've seen but am still looking them over. This is such a confused jumble. I wouldn't be surprised if eventually we'll have to make our behavior customizable when we encounter another camera that doesn't match option 4.
I think it's also time to start writing up a document about interoperability. I'll work on that.
If we want to support all cameras possible, then sticking with what live555 does should be the best option. Also, tinycam monitor should have been tested on 1000s of models already, so it would be a nice idea to look at what it does. However it might have quirks for different models and we wouldn't know since it's closed source. All I know is that it has a way to specify a camera model when adding a new one, and it has lots of models.
Is live555 really that overwhelmingly the most accepted implementation? This isn't its only divergence from the RFC. For example, when matching RTP-Info
sections to streams, live555's client code doesn't use the url
parameter at all (as RFC 2326 section 12.33 says they should), instead doing it positionally within RTSPClient::handlePLAYResponse
. I agree it makes sense to follow common implementations where the RFC is ambiguous, but in both of these live555 is contradicting it, so I'm confused and torn. :-(
Boolean RTSPClient::handlePLAYResponse(MediaSession* session, MediaSubsession* subsession,
char const* scaleParamsStr, char const* speedParamsStr,
char const* rangeParamsStr, char const* rtpInfoParamsStr) {
// ...
while ((subsession = iter.next()) != NULL) {
u_int16_t seqNum; u_int32_t timestamp;
subsession->rtpInfo.infoIsNew = False;
if (parseRTPInfoParams(rtpInfoParamsStr, seqNum, timestamp)) {
subsession->rtpInfo.seqNum = seqNum;
subsession->rtpInfo.timestamp = timestamp;
subsession->rtpInfo.infoIsNew = True;
}
if (subsession->rtpSource() != NULL) subsession->rtpSource()->enableRTCPReports() = True; // start sending RTCP "RR"s now
}
I remember seeing live555 popping out on lots of camera related stuff that I did. What about https://github.com/ZLMediaKit/ZLMediaKit? It worked right away for my cameras.
I also think most NVRs use live555 because why not? It's the only really old open source implementation and they like old things that are released as zip files
I guess when it comes down to it, I'm fine with:
RTP-Info url=
won't match then. We can add an option to set how we join URLs later if this approach doesn't work for all cameras.and live555, tinyCam, and ZLMediaKit are good implementations to compare against, thanks. We can look into those as it comes up and add relevant things to an interop doc (#18). (I hadn't heard of ZLMediaKit but I see it at least has a bunch of stars on github.)
My VStarcam camera is failing here:
https://github.com/scottlamb/retina/blob/main/src/client/parse.rs#L424
as you can see, it has no control attribute, but it has control attributes on the media descriptions.
sdp: SessionDescription { version: 0, origin: Origin { username: "VSTC", session_id: 3836310016, session_version: 3836310016, network_type: "IN", address_type: "IP4", unicast_address: "192.168.1.198" }, session_name: "streamed by the VSTARCAM RTSP server", session_information: None, uri: None, email_address: Some("NONE"), phone_number: None, connection_information: Some(ConnectionInformation { network_type: "IN", address_type: "IP4", address: Some(Address { address: "0.0.0.0", ttl: None, range: None }) }), bandwidth: [], time_descriptions: [TimeDescription { timing: Timing { start_time: 0, stop_time: 0 }, repeat_times: [] }], time_zones: [], encryption_key: None, attributes: [], media_descriptions: [MediaDescription { media_name: MediaName { media: "video", port: RangedPort { value: 0, range: None }, protos: ["RTP", "AVP"], formats: ["96"] }, media_title: None, connection_information: None, bandwidth: [Bandwidth { experimental: false, bandwidth_type: "AS", bandwidth: 1536 }], encryption_key: None, attributes: [Attribute { key: "control", value: Some("track0") }, Attribute { key: "rtpmap", value: Some("96 H264/90000") }, Attribute { key: "fmtp", value: Some("96 packetization-mode=1;profile-level-id=4d002a;sprop-parameter-sets=Z00AKp2oHgCJ+WbgICAoAAADAAgAAAMAfCA=,aO48gA==") }] }, MediaDescription { media_name: MediaName { media: "audio", port: RangedPort { value: 0, range: None }, protos: ["RTP", "AVP"], formats: ["8"] }, media_title: None, connection_information: None, bandwidth: [Bandwidth { experimental: false, bandwidth_type: "AS", bandwidth: 64 }], encryption_key: None, attributes: [Attribute { key: "control", value: Some("track1") }, Attribute { key: "rtpmap", value: Some("8 PCMA/8000/1") }] }] }
Here's the raw response:
is the control really needed?