SuRGeoNix / Flyleaf

Media Player .NET Library for WinUI 3 / WPF / WinForms (based on FFmpeg/DirectX)
GNU Lesser General Public License v3.0

delay when play rtsp stream from network camera #42

Closed pubpy2015 closed 3 years ago

pubpy2015 commented 3 years ago

Hi, this issue has been reported before in #18.

After changing some code in DemuxerBase.cs (attached), it's better. Please review this change. In this video you can see the video delay (with the original code) when controlling the camera:

https://user-images.githubusercontent.com/59004953/123520524-5ceab680-d6db-11eb-9093-5b861fe7be03.mp4

DemuxerBase.zip

SuRGeoNix commented 3 years ago

I don't feel comfortable changing those lines without understanding the reason.

I think your issue comes from somewhere else. RTSP falsely reports that it has audio while it doesn't, so the player waits for audio to come up before playing. To confirm that this is the issue, try changing this line / boolean to true:

https://github.com/SuRGeoNix/Flyleaf/blob/05c307b9f5898d077e5aee2a2a31c1b0ae7349ac/FlyleafLib/MediaPlayer/Player.cs#L724

pubpy2015 commented 3 years ago

This camera has audio with the G.726 codec. As you can see at the bottom of the video screen, I have disabled audio and subs (because they add more delay).


SuRGeoNix commented 3 years ago

Is it possible that the delay sometimes comes randomly from the camera? I can't see an issue with the code there. The only thing I have in mind is that you may be opening a lot of threads (players) and it takes too long to create a new one?

pubpy2015 commented 3 years ago

It happens with all cameras. If you add logging here (see the attached screenshots), you can see that only after the demuxer has read ~50 frames does the decoder start decoding, and during that buffering time the demuxer doesn't read frames.

SuRGeoNix commented 3 years ago

I think I have seen something similar in the past. With network streams, for no apparent reason, the demuxer 'relaxes', and I thought the issue was on the other side. Also, the fix you propose doesn't make sense if that's the issue. I will have a look, but I'm not sure what I should check yet.

pubpy2015 commented 3 years ago

Yes.

My goal was to remove some Thread.Sleep(xxx) calls in your code so the threads start immediately. It's better, but it doesn't solve the root of the problem.

Maybe the reason is that you separate the demux thread and the decode thread, so each thread has to wait for the other (buffering).

SuRGeoNix commented 3 years ago

I'm sure this Thread.Sleep is not an issue; it only runs once, until the thread comes up (probably 5-10 ms max). Demux and decode run on different threads, which makes it faster; of course decode has to wait for demuxed packets.

I still can't see the 1-2 second delay you mention anywhere in the logs. Did you try reducing MinVideoFrames to shorten the buffering time?

pubpy2015 commented 3 years ago

Setting player.Config.decoder.MinVideoFrames = 1 gives the same result. You can use this rtsp url to test: rtsp://camera:abcd1234@210.245.52.48:21454/MediaInput/H264

For comparison, you can access that camera in Internet Explorer (requires installing an ActiveX control): http://210.245.52.48:21400/

pubpy2015 commented 3 years ago

There is one more thing I'm still confused about:

I only run your demo program on my PC (Intel Core i7 10700, 32 GB RAM, GTX 1660), but decoder.VideoDecoder.Frames.Count is always 15 (Environment.ProcessorCount = 16). If I set player.Config.decoder.MaxVideoFrames = 100, then decoder.VideoDecoder.Frames.Count = 26. So the buffer always contains 26 frames ready to render? Is my PC not able to decode and display a video stream at 1280x960 resolution?

I have tested some VMS software for IP cameras before, such as Milestone, https://eocortex.com/, and NUUO Mainconsole (https://www.nuuo.com/); they can display more than 16 cameras in real time, fluently, on my PC.

SuRGeoNix commented 3 years ago

Decoder frames are related to the demuxer packets. If you don't have a large enough MaxQueueSize (demuxed packets), you will never reach 100 video frames. The decoded video frames queue holds the frames ready to be rendered. Probably in your case, with the default MaxQueueSize = 200 demuxed packets, you get 26 video frames (the rest of the packets can be audio or other data).

I didn't understand the second part of your question. Generally, if the PC cannot handle the playback, you can drop frames on the rendering/GPU side, but for decoding (CPU) there is not much you can do. Even with buffering you cannot hold too many frames in memory; they are usually huge.
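
To actually reach more decoded frames, the demuxer queue has to grow as well. A minimal sketch using the option names quoted above (the exact property paths are assumptions; verify them against your Flyleaf version):

```csharp
// Assumption: these are the config paths for the two queues discussed above.
player.Config.demuxer.MaxQueueSize   = 500; // demuxed packets (default 200 per the comment above)
player.Config.decoder.MaxVideoFrames = 100; // decoded frames kept ready to render
```
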

pubpy2015 commented 3 years ago

I mean: my PC is powerful enough to decode and display this camera's stream (1280x960, 10 fps) in real time. Why does the buffer always contain 26 frames ready to render? Do you think 26 / 10 fps = 2.6 s is the delay time?

Please test the rtsp url from the message above.

SuRGeoNix commented 3 years ago

A full buffer means a fast PC. With the rtsp url you gave, I don't even get the frames right: I see frames whose timestamps/pts come before frames I already received. However, when I turned off Video Acceleration and wrote some code to accept only pts newer than the current pts, I was able to play the camera stably with a 5-second difference (I think that's a setting within the camera; I wasn't able to install the ActiveX control to check). Opening the same stream with PotPlayer had even more delay. I was still getting a lot of errors from the camera:

[............] [MediaFramework] [h264 @ 06aaa980] corrupted macroblock 69 19 (total_coeff=-1)
[11.56.25.925] [FFmpeg] [h264 @ 06aaa980] error while decoding MB 69 19

I need to see why, with Video Acceleration on, earlier frames arrive later, but I don't see any delay issue here.

SuRGeoNix commented 3 years ago

OK, the problem with Video Acceleration on (frames in the wrong order) happens with both VLC and PotPlayer as well. I will attach a sample video for future testing.

https://user-images.githubusercontent.com/57474895/126083233-415f3231-14f6-4570-bc1a-fc94a7d5633a.mp4

pubpy2015 commented 3 years ago

Maybe there was a problem with the internet connection when you accessed the camera. I checked; please try again.

VLC is not a realtime player because its default network caching is 1000 ms. You can change it to 300 or 500 ms and change the rtsp transport to TCP (default is UDP).


SuRGeoNix commented 3 years ago

You can play around with ffmpeg's format context flags / options. I've managed to stream with a 2-second difference (from my clock) with the following options:

fmtCtx->flags |= AVFMT_FLAG_NOBUFFER;
fmtCtx->max_delay = 0; // Probably only matters for udp, but try it anyway
//fmtCtx->flags |= AVFMT_FLAG_DISCARD_CORRUPT; // Try removing this one as well and see how it goes

I'm not sure what else to look at with this one, so I will close this for now; I will probably try to expose those flags / options in the config as well.
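
If patching the C code isn't convenient, the same behavior can usually be requested through avformat's option dictionary. A sketch, assuming Flyleaf's VideoFormatOpt entries are passed through to avformat_open_input:

```csharp
// Assumption: these demuxer format options reach the AVFormatContext, mapping
// to the flags above ("fflags"/"nobuffer") and the "max_delay" AVOption.
config.demuxer.VideoFormatOpt["fflags"]    = "nobuffer";
config.demuxer.VideoFormatOpt["max_delay"] = "0"; // probably only relevant for udp
```
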

SuRGeoNix commented 3 years ago

OK, found it. Use UDP instead of TCP and use Video Acceleration as well! Works great with a 1-second difference... Tested also with ffplay, which has the same behavior.

    config.demuxer.VideoFormatOpt["rtsp_transport"] = "udp";

SuRGeoNix commented 3 years ago

Further testing with my cameras shows that H265 (HEVC) is a better and more stable choice for TCP with VA on.

  1. Ensure that your bitrates fit your line's upload bandwidth.
  2. Ensure that your cameras' date/time is up to date (ideally with NTP sync).

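
Pulling the thread's findings together, a minimal low-latency configuration sketch (property names as quoted in the comments above; treat the exact paths as assumptions against your Flyleaf version):

```csharp
config.demuxer.VideoFormatOpt["rtsp_transport"] = "udp"; // UDP instead of TCP
config.decoder.MinVideoFrames = 1;                       // minimize buffering
// ...and keep Video Acceleration (VA) enabled in the player settings.
```
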
pubpy2015 commented 3 years ago

Now realtime!

https://user-images.githubusercontent.com/59004953/127018507-45e441ec-02e0-41e3-86c7-9967bb0d98b1.mp4

All modifications are in Screamer():

    private void Screamer()
    {
        int  vDistanceMs; int aDistanceMs; int sDistanceMs; int sleepMs;
        int  actualFps  = 0;
        long totalBytes = 0; long videoBytes = 0; long audioBytes = 0;

        bool    requiresBuffering = true;
        if (requiresBuffering)
        {
            totalBytes = decoder.VideoDemuxer.TotalBytes + decoder.AudioDemuxer.TotalBytes + decoder.SubtitlesDemuxer.TotalBytes;
            videoBytes = decoder.VideoDemuxer.VideoBytes + decoder.AudioDemuxer.VideoBytes + decoder.SubtitlesDemuxer.VideoBytes;
            audioBytes = decoder.VideoDemuxer.AudioBytes + decoder.AudioDemuxer.AudioBytes + decoder.SubtitlesDemuxer.AudioBytes;

            MediaBuffer();
            requiresBuffering = false;
            //if (seeks.Count != 0) continue;
            //if (vFrame == null) { Log("MediaBuffer() no video frame"); break; }
        }

        while (Status == Status.Playing)
        {
            //if (seeks.TryPop(out SeekData seekData))
            //{
            //    seeks.Clear();
            //    requiresBuffering = true;
            //    requiresResync = true;
            //    if (decoder.Seek(seekData.ms, seekData.foreward) < 0)
            //        Log("[SCREAMER] Seek failed");
            //}

            //if (requiresBuffering)
            //{
            //    totalBytes = decoder.VideoDemuxer.TotalBytes + decoder.AudioDemuxer.TotalBytes + decoder.SubtitlesDemuxer.TotalBytes;
            //    videoBytes = decoder.VideoDemuxer.VideoBytes + decoder.AudioDemuxer.VideoBytes + decoder.SubtitlesDemuxer.VideoBytes;
            //    audioBytes = decoder.VideoDemuxer.AudioBytes + decoder.AudioDemuxer.AudioBytes + decoder.SubtitlesDemuxer.AudioBytes;

            //    MediaBuffer();
            //    requiresBuffering = false;
            //    if (seeks.Count != 0) continue;
            //    if (vFrame == null) { Log("MediaBuffer() no video frame"); break; }
            //}

            //if (vFrame == null)
            //{
            //    if (decoder.VideoDecoder.Status == MediaFramework.MediaDecoder.Status.Ended)
            //    {
            //        Status = Status.Ended;
            //        Session.SetCurTime(videoStartTicks + (DateTime.UtcNow.Ticks - startedAtTicks));
            //    }
            //    if (Status != Status.Playing) break;

            //    Log("[SCREAMER] No video frames");
            //    requiresBuffering = true;
            //    continue;
            //}

            if (Status != Status.Playing) break;

            if (decoder.VideoDecoder.Frames.Count >= 1)
            {
                if (aFrame == null) decoder.AudioDecoder.Frames.TryDequeue(out aFrame);
                if (sFrame == null) decoder.SubtitlesDecoder.Frames.TryDequeue(out sFrame);

                elapsedTicks = videoStartTicks + (DateTime.UtcNow.Ticks - startedAtTicks);
                vDistanceMs = (int)(((vFrame.timestamp) - elapsedTicks) / 10000);
                aDistanceMs = aFrame != null ? (int)((aFrame.timestamp - elapsedTicks) / 10000) : Int32.MaxValue;
                sDistanceMs = sFrame != null ? (int)((sFrame.timestamp - elapsedTicks) / 10000) : Int32.MaxValue;
                sleepMs = Math.Min(vDistanceMs, aDistanceMs) - 1;

                //if (sleepMs < 0) sleepMs = 0;
                //if (sleepMs > 2)
                //{
                //    if (sleepMs > 1000)
                //    {   // This will not allow uncommon formats with slow frame rates to play (maybe check if fps = 1? meaning dynamic fps?)
                //        Log("[SCREAMER] Restarting ... (HLS?) | + " + Utils.TicksToTime(sleepMs * (long)10000));
                //        VideoDecoder.DisposeFrame(vFrame); vFrame = null; aFrame = null;
                //        Thread.Sleep(10);
                //        MediaBuffer();
                //        continue; 
                //    }

                //    // Informs the application with CurTime when the second changes
                //    if ((int)(Session.CurTime / 10000000) != (int)(elapsedTicks / 10000000))
                //    {
                //        TBR = (decoder.VideoDemuxer.TotalBytes + decoder.AudioDemuxer.TotalBytes + decoder.SubtitlesDemuxer.TotalBytes - totalBytes) * 8 / 1000.0;
                //        VBR = (decoder.VideoDemuxer.VideoBytes + decoder.AudioDemuxer.VideoBytes + decoder.SubtitlesDemuxer.VideoBytes - videoBytes) * 8 / 1000.0;
                //        ABR = (decoder.VideoDemuxer.AudioBytes + decoder.AudioDemuxer.AudioBytes + decoder.SubtitlesDemuxer.AudioBytes - audioBytes) * 8 / 1000.0;
                //        totalBytes = decoder.VideoDemuxer.TotalBytes + decoder.AudioDemuxer.TotalBytes + decoder.SubtitlesDemuxer.TotalBytes;
                //        videoBytes = decoder.VideoDemuxer.VideoBytes + decoder.AudioDemuxer.VideoBytes + decoder.SubtitlesDemuxer.VideoBytes;
                //        audioBytes = decoder.VideoDemuxer.AudioBytes + decoder.AudioDemuxer.AudioBytes + decoder.SubtitlesDemuxer.AudioBytes;

                //        FPS = actualFps;
                //        actualFps = 0;

                //        //Log($"Total bytes: {TBR}");
                //        //Log($"Video bytes: {VBR}");
                //        //Log($"Audio bytes: {ABR}");
                //        //Log($"Current FPS: {FPS}");

                //        Session.SetCurTime(elapsedTicks);
                //    }

                //    Thread.Sleep(sleepMs);
                //}

                if (Math.Abs(vDistanceMs - sleepMs) <= 2)
                {
                    //Log($"[V] Presenting {Utils.TicksToTime(vFrame.timestamp)}");

                    if (renderer.PresentFrame(vFrame)) actualFps++; else DroppedFrames++;
                    decoder.VideoDecoder.Frames.TryDequeue(out vFrame);
                }
                else if (vDistanceMs < -2)
                {
                    DroppedFrames++;
                    VideoDecoder.DisposeFrame(vFrame);
                    decoder.VideoDecoder.Frames.TryDequeue(out vFrame);
                    Log($"vDistanceMs 2 |-> {vDistanceMs}");
                }

                if (aFrame != null) // Should use a different thread for better accuracy (renderer might delay it on high fps) | also on high offset we will have silence between samples
                {
                    if (Math.Abs(aDistanceMs - sleepMs) <= 10)
                    {
                        //Log($"[A] Presenting {Utils.TicksToTime(aFrame.timestamp)}");
                        audioPlayer?.FrameClbk(aFrame.audioData);
                        decoder.AudioDecoder.Frames.TryDequeue(out aFrame);
                    }
                    else if (aDistanceMs < -10) // Will be transferred back to the decoder to drop invalid timestamps
                    {
                        Log("-=-=-=-=-=-=");
                        for (int i = 0; i < 25; i++)
                        {
                            Log($"aDistanceMs 2 |-> {aDistanceMs}");
                            decoder.AudioDecoder.Frames.TryDequeue(out aFrame);
                            aDistanceMs = aFrame != null ? (int)((aFrame.timestamp - elapsedTicks) / 10000) : Int32.MaxValue;
                            if (aDistanceMs > -7) break;
                        }
                    }
                }

                if (sFramePrev != null)
                    if (elapsedTicks - sFramePrev.timestamp > (long)sFramePrev.duration * 10000) { Session.SubsText = null; sFramePrev = null; }

                if (sFrame != null)
                {
                    if (Math.Abs(sDistanceMs - sleepMs) < 30)
                    {
                        Session.SubsText = sFrame.text;
                        sFramePrev = sFrame;
                        decoder.SubtitlesDecoder.Frames.TryDequeue(out sFrame);
                    }
                    else if (sDistanceMs < -30)
                    {
                        if (sFrame.duration + sDistanceMs > 0)
                        {
                            Session.SubsText = sFrame.text;
                            sFramePrev = sFrame;
                            decoder.SubtitlesDecoder.Frames.TryDequeue(out sFrame);
                        }
                        else
                        {
                            Log($"sDistanceMs 2 |-> {sDistanceMs}");
                            decoder.SubtitlesDecoder.Frames.TryDequeue(out sFrame);
                        }
                    }
                }
            }
            else
            {
                Thread.Sleep(5);
            }
        }

        Log($"[SCREAMER] Finished -> {Utils.TicksToTime(Session.CurTime)}");
    }

SuRGeoNix commented 3 years ago

I was able to play with low latency without commenting out all those lines... Is there any actual reason to comment them out?

pubpy2015 commented 3 years ago

Maybe with a live stream, such as rtsp from an IP camera, there is no need to compute and sleep via Thread.Sleep(sleepMs), because you can't get the next frame before the camera has produced it.

Or, opening the video input takes too long => there is an error when calculating the timestamp => wrong sleep time. Setting player.Config.demuxer.VideoFormatOpt.Add("probesize", "4096") can reduce the delay. avformat_find_stream_info is slow: https://www.programmersought.com/article/430134736/
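
For reference, the stream-probing cost can usually be bounded with both of FFmpeg's probe options. probesize comes from the comment above; analyzeduration is an assumption here (a standard avformat option, but the thread only mentions probesize):

```csharp
// Smaller probe window => avformat_find_stream_info returns sooner.
player.Config.demuxer.VideoFormatOpt.Add("probesize", "4096");        // bytes
player.Config.demuxer.VideoFormatOpt.Add("analyzeduration", "50000"); // microseconds
```
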

SuRGeoNix commented 3 years ago

The library must expose as much of ffmpeg's low-level configuration as possible to the user. That doesn't mean it will be able to configure itself :( So I'm afraid anyone who uses the library needs to do some research on ffmpeg's options / flags. I will look into exposing options for avformat_find_stream_info as well, but it's important that the library can open any input, not just rtsp live cameras. I've been doing some research on network streams lately and will have some enhancements soon.

SuRGeoNix commented 3 years ago

I've just added config.player.LowLatency, which focuses on cases like this one. Check it out and let me know.

pubpy2015 commented 3 years ago

Now it works fine:

https://user-images.githubusercontent.com/59004953/127678365-7c1a3ebc-9622-43d7-8a3c-22c0a500984b.mp4