bluenviron / gortsplib

RTSP 1.0 client and server library for the Go programming language
MIT License
636 stars 177 forks source link

avoid returning partial RTP-Info header, omit seq/rtptime if needed #568

Closed Fusl closed 1 month ago

Fusl commented 1 month ago

A recent behavioral change introduced by PR #2244 has inadvertently caused a bug affecting the initialization timing of the RTSP server, which now waits until the first playback connection attempt. This change has resulted in playback failures in some video players due to the timing of the server's start-up sequence. The root of the issue lies in the OnSetup handler in server_session.go, which activates the RTSP server processor found in stream.go.

The server's setup process allows gortsplib to asynchronously process packets and extract stream data such as video and audio tracks. However, the immediate return of OnSetup means that the generateRTPInfo function might try to construct an RTP-Info header before all tracks have been detected by gortsplib, leading to incomplete or incorrect headers.

Currently, the server prematurely constructs RTP-Info headers based on a list from setuppedMediasOrdered. This results in headers that may lack complete track data if gortsplib has not yet processed all packets required to detect the tracks, as seen in the conditional check in server_stream.go.

This PR resolves the issue by ensuring that even if the packet data for a track isn't fully available at the time of constructing the RTP-Info header, a partial but valid RTP-Info URL entry will still be included. This modification prevents the generation of headers that could cause strict RTSP implementations to fail during playback.

Examples of RTP-Info Headers:

Valid header formats:

Invalid header formats (due to race condition, one of the tracks is missing):

Proposed fix example (for missing data cases):

Summary of the Fix:

aler9 commented 1 month ago

Hello, you're perfectly right about the fact that the RTP-Info header is often sent with one or more missing media streams (tracks), since the server must have at least a RTP packet for each media stream in order to generate it. Sometimes this packet will never be available, since a RTSP server/client is free to declare a media and never use it, therefore waiting for it would be useless, and this is why the mechanism is totally asynchronous.

The question here is whether it is better to send a partial RTP-Info entry or to omit it, when no packets are available yet.

Can you write the name of the client that prefers partial RTP-info entries with respect to no entries? unfortunately in the specifications there is no trace of such partial RTP-infos, therefore if we want to merge this, we need a reference. Thanks

Fusl commented 1 month ago

Windows Media Foundation is the problematic application here:

Screenshot_20240509-092924~2.png

This is being used through NSPlayer by AVPro Video which in turn is being used by VRChat.

MediaMTX is a common software used by many VRChat users to stream a screen share with OBS or other media in low-latency into user-made VRChat worlds.

The problem here is that all of it relies on the stream being fully established when it's time to call back into the application to set up the player at which point most of the time we only see an audio track in the stream.

aler9 commented 1 month ago

Perfect, fix the tests and this will be merged.

Fusl commented 1 month ago

There you go, this should make the test pass. For what it's worth, my initial thought was to revert the initialization procedure to occur during the creation of the struct rather than lazily initializing when the first RTSP playback client connects. However, that approach would have rolled back some of the performance improvements we've achieved, so I opted for this solution instead. I’ll continue to explore this issue and if I find a more efficient way to address it, I’ll submit another PR.

aler9 commented 1 month ago

merged, thanks!