scottlamb / retina

High-level RTSP multimedia streaming library, in Rust
https://crates.io/crates/retina
Apache License 2.0

VStarcam: EOF while expecting response to DESCRIBE CSeq 2 #45

Open lattice0 opened 2 years ago

lattice0 commented 2 years ago

Tried a new Vstarcam camera that I haven't tried before and got:

thread 'retina_client_cam1' panicked at 'called `Result::unwrap()` on an `Err` value: RtspReadError { conn_ctx: ConnectionContext { local_addr: 172.17.0.2:57760, peer_addr: 192.168.1.140:10554, established_wall: WallTime(Timespec { sec: 1640306208, nsec: 801928570 }), established: Instant { tv_sec: 7103, tv_nsec: 209861352 } }, msg_ctx: RtspMessageContext { pos: 119, received_wall: WallTime(Timespec { sec: 1640306209, nsec: 106381244 }), received: Instant { tv_sec: 7103, tv_nsec: 514314058 } }, source: Custom { kind: UnexpectedEof, error: "EOF while expecting response to DESCRIBE CSeq 2" } }', /home/dev/orwell/liborwell/src/rtsp/retina_client.rs:132:77

capture sent in email

I have no idea what's happening but I remember we might have stumbled upon this before?

In the capture, the camera basically closes the connection once it receives our DESCRIBE. I think it's malformed.

scottlamb commented 2 years ago

In the capture you sent, I see this:

  1. Retina sends a DESCRIBE with no authentication
  2. Camera asks for authentication
  3. Retina sends a DESCRIBE with authentication
  4. Camera drops the connection without responding

which is consistent with the error message you quoted.

Maybe the camera just always drops the connection after sending an unauthorized response. This is discouraged but allowed, [1] and clients are supposed to handle it by retrying as described in RFC 2616 section 8.1.4:

clients, servers, and proxies MUST be able to recover from asynchronous close events. Client software SHOULD reopen the transport connection and retransmit the aborted sequence of requests without user interaction so long as the request sequence is idempotent (see section 9.1.2).

Retina isn't doing this today—it hadn't come up until now—and fixing it might address this problem.
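A minimal sketch of the retry behavior RFC 2616 section 8.1.4 describes, assuming the caller supplies a closure that opens a fresh connection and sends the (idempotent) request on each call. This is not Retina's code; `retry_on_eof` and its parameters are hypothetical names used only for illustration.

```rust
use std::io;

/// Runs `attempt` up to `max_attempts` times, retrying only when it fails
/// with `UnexpectedEof` (the peer closed the connection mid-exchange).
/// Hypothetical helper: `attempt` is assumed to reconnect on every call
/// and to send only idempotent requests such as DESCRIBE.
fn retry_on_eof<T>(
    max_attempts: usize,
    mut attempt: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    // Returned only if `max_attempts` is zero and no attempt is made.
    let mut last_err = io::Error::new(
        io::ErrorKind::InvalidInput,
        "max_attempts must be greater than zero",
    );
    for _ in 0..max_attempts {
        match attempt() {
            Ok(value) => return Ok(value),
            // The connection was dropped: remember the error and try again.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => last_err = e,
            // Any other failure is not an asynchronous close; give up.
            Err(e) => return Err(e),
        }
    }
    Err(last_err)
}
```

A caller would wrap the "connect, send DESCRIBE, read response" sequence in the closure. Errors other than an unexpected EOF are returned immediately, so an authentication failure or a malformed response is not retried.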

I'd be interested to see any client talking successfully with this server, to know if they're indeed retrying with a fresh connection. I wonder also if the server is just dropping the connection after the unauthorized or if it does this after every response. When using TCP, they must at least be keeping the connection open after the PLAY, or interleaved data can't work. When using UDP, in theory they could drop the connection every time. Retina would need a few changes to handle this; currently it expects one connection to live for the entire session. This isn't what the RFC says, but some other servers seem to drop the session if the connection drops, and I've just followed their lead so far.

[1] doubly discouraged: servers SHOULD keep connections open even after an error, and they SHOULD send Connection: close if they decide to close the connection immediately after a response. But it doesn't say MUST in either place. They're also allowed to time out idle connections, and there's no minimum timeout. So the server's behavior isn't great but the RFCs require Retina to handle it gracefully. And besides, we'd want Retina to interop well even if the RFCs said the other party is wrong.

scottlamb commented 2 years ago

Checking in. Would it be possible to see a packet capture of any client talking successfully with this server? That would help me determine if it's worth investing the time right now in making Retina reopen the transport connection as I mentioned above.

lattice0 commented 2 years ago

I'm away from this camera and don't know when I'll be able to get access again :(

nsauter commented 11 months ago

Maybe I can help with this issue? I have a Wansview cam with the exact same error. What exactly do you need to debug this?

nsauter commented 11 months ago

Here is a PCAP Dump of the cam via ffplay. Maybe that helps

rtsp_describe.zip

scottlamb commented 11 months ago

@nsauter Interesting! There's only one TCP stream there. So my theory was off. Implementing the retry I mentioned above probably wouldn't help.

@lattice0's original pcap with Retina looked like this:

DESCRIBE rtsp://192.168.1.140:10554/tcp/av0_0 RTSP/1.0
Accept: application/sdp
CSeq: 1
User-Agent: orwell retina

RTSP/1.0 401 Unauthorized
Cseq: 1
WWW-Authenticate: Digest realm="RTSPD",nonce="..."

DESCRIBE rtsp://192.168.1.140:10554/tcp/av0_0 RTSP/1.0
Accept: application/sdp
Authorization: Digest username="admin", realm="RTSPD", uri="rtsp://192.168.1.140:10554/tcp/av0_0", nonce="...", response="..."
CSeq: 2
User-Agent: orwell retina

(eof here)

vs the one you supplied with ffmpeg:

OPTIONS rtsp://192.168.1.69:554/live/ch0 RTSP/1.0
CSeq: 1
User-Agent: Lavf58.76.100

RTSP/1.0 200 OK
Server: AJSS/1.0.4 (Build/001.0; Platform/Linux; Release/Ajy Rtsp Svr; State/Development; )
Cseq: 1
Public: DESCRIBE, SETUP, TEARDOWN, PLAY, OPTIONS

DESCRIBE rtsp://192.168.1.69:554/live/ch0 RTSP/1.0
Accept: application/sdp
CSeq: 2
User-Agent: Lavf58.76.100

RTSP/1.0 401 Unauthorized
Server: AJSS/1.0.4 (Build/001.0; Platform/Linux; Release/Ajy Rtsp Svr; State/Development; )
Cseq: 2
WWW-Authenticate: Digest realm="HYRtspd", nonce="..."

DESCRIBE rtsp://192.168.1.69:554/live/ch0 RTSP/1.0
Accept: application/sdp
CSeq: 3
User-Agent: Lavf58.76.100
Authorization: Digest username="XryxO4S5", realm="HYRtspd", nonce="...", uri="rtsp://192.168.1.69:554/live/ch0", response="..."

...successful response here...

My new theory is that it doesn't like something about Retina's second DESCRIBE request and just drops the connection after seeing it. I suppose it'd have to be the Authorization header itself. This is well-tested code on Retina's side: I use my http-auth crate, which is standards-compliant and has over a million downloads on crates.io. But... the only thing I can think of now is the ordering: in the captures above, Retina sends username, realm, uri, nonce, response, while ffmpeg sends username, realm, nonce, uri, response.

The order shouldn't matter. But this server may not be super standards-compliant. They're using the original RFC 2069 digest form, not the updated form from RFC 2617 (written in the year 1999). RFC 7616 (written in the year 2015) dropped RFC 2069 compatibility. So they released this camera with an obsolete standard...
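One way to recognize the original RFC 2069 form is that the challenge carries no `qop` parameter, as in the `WWW-Authenticate` headers in both captures above. The sketch below checks for that under simplified parsing assumptions (a single challenge per header value, parameters separated by commas). `split_params` and `is_rfc2069_challenge` are hypothetical names, not part of http-auth or Retina.

```rust
/// Splits a parameter list on commas that are outside quoted strings.
fn split_params(params: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let (mut start, mut in_quotes, mut escaped) = (0, false, false);
    for (i, c) in params.char_indices() {
        match c {
            // The character after a backslash inside quotes is literal.
            _ if escaped => escaped = false,
            '\\' if in_quotes => escaped = true,
            '"' => in_quotes = !in_quotes,
            ',' if !in_quotes => {
                parts.push(&params[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&params[start..]);
    parts
}

/// Returns true if `challenge` (a `WWW-Authenticate` value) uses the Digest
/// scheme and has no `qop` parameter, i.e. looks like the RFC 2069 form.
fn is_rfc2069_challenge(challenge: &str) -> bool {
    match challenge.trim().split_once(char::is_whitespace) {
        Some((scheme, params)) if scheme.eq_ignore_ascii_case("Digest") => {
            !split_params(params).iter().any(|p| {
                p.split('=')
                    .next()
                    .map_or(false, |name| name.trim().eq_ignore_ascii_case("qop"))
            })
        }
        _ => false,
    }
}
```

Applied to a challenge shaped like the ones in the captures (a `realm` and a `nonce`, nothing else), this reports the RFC 2069 form. A challenge that advertises `qop="auth"` does not.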

RFC 2069 section 2.4 gives an example with ffmpeg's ordering:

Authorization: Digest       username="Mufasa",
                            realm="testrealm@host.com",
                            nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093",
                            uri="/dir/index.html",
                            response="e966c932a9242554e42c8ee200cec7f6",
                            opaque="5ccc069c403ebaf9f0171e9517f40e41"

maybe they are expecting that.
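To make the ordering concrete, here is a minimal sketch that formats an Authorization value in the order of the RFC 2069 section 2.4 example (username, realm, nonce, uri, response). It is an illustration only, not the http-auth implementation or the actual patch; `DigestParams` and `authorization_value` are hypothetical names, and the sketch assumes no value needs quoted-string escaping.

```rust
/// Parameters of an RFC 2069-style Digest `Authorization` header.
/// Hypothetical type for illustration; `opaque` is omitted because the
/// challenges in the captures do not include one.
struct DigestParams<'a> {
    username: &'a str,
    realm: &'a str,
    nonce: &'a str,
    uri: &'a str,
    response: &'a str,
}

/// Formats the header value with parameters in the order used by the
/// RFC 2069 section 2.4 example: username, realm, nonce, uri, response.
/// Assumes no value contains `"` or `\`, so nothing is escaped.
fn authorization_value(p: &DigestParams) -> String {
    format!(
        "Digest username=\"{}\", realm=\"{}\", nonce=\"{}\", uri=\"{}\", response=\"{}\"",
        p.username, p.realm, p.nonce, p.uri, p.response
    )
}
```

Filling the struct with the values from the RFC example yields a header whose `nonce` precedes `uri`, matching ffmpeg's request in the capture rather than Retina's.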

I can throw together a patched Retina with this order to see if that makes a difference.

scottlamb commented 11 months ago

@nsauter If you try out Retina's new issue-45-experiment branch, does it work? If not, could you send me a pcap with that?

nsauter commented 11 months ago

Can you please tell me the correct test? Unfortunately I have never used the CLI, so I don't know what exactly should work and what shouldn't. So far I tried to execute retina mp4 --url rtsp://USER:PASSWORD@192.168.1.64:554/live/ch0 test.mp4 but that gives me the error: Fatal: Invalid argument: URL must not contain credentials

nsauter commented 10 months ago

I got help and managed to execute the correct command:

./retina mp4 --url rtsp://192.168.1.64:554/live/ch0 --username USER --password PW test.mp4

With that I received a video file without errors.

scottlamb commented 10 months ago

Okay. That confirms the new theory. Now the question is if other servers require the former order. I think I'll make a new http-auth release with this change and roll the dice. If it breaks some other non-compliant server, then I suppose I'll introduce options for http-auth's caller (Retina), Retina's caller, and Moonfire's user to choose. I hope that's not necessary but we'll see I guess.

Curid commented 10 months ago

So... @nsauter was using a build from my dts branch when he got the error. I sent him a build from the main branch yesterday, and apparently that build doesn't throw the error. I will make sure to test the main branch next time. Sorry for wasting your time :(