popcornmix / omxplayer


Clock drift - falling behind UDP broadcast #55

Closed perrypoint closed 10 years ago

perrypoint commented 10 years ago

Omxplayer doesn't handle clock drift or clock timing differences when playing a UDP stream. I play a UDP multicast stream for a very long time (a day or more) and observe that it initially starts playing about 10 seconds behind 'live' (which is annoying latency), and that it also falls farther and farther behind: after a day or so of playing the stream it is typically a minute or more behind 'live'. The code that looks like it is supposed to adjust the clock to use up the excess buffer isn't working.

jmeiners commented 10 years ago

You can reduce the latency on the live stream by telling FFMPEG not to buffer it:

FFMPEG is buffering all the data it uses for stream format analysis. You just have to add the flag below to correct this (when opening the stream).

    // discards the frames buffered during stream analysis
    m_pFormatContext->flags |= AVFMT_FLAG_NOBUFFER;

Then also set a low --threshold so it won't wait for the buffers to fill up.

jmeiners commented 10 years ago

I agree the clock adjustment isn't working right. The good news is that the HW mechanism seems to be in place. Below are excerpts from my conversation with popcornmix on the subject.


(popcornmix) HDMI clock is changed with a PLL (by < 0.1%). HDMI audio comes from the same clock so will also scale. ... We vary the hdmi clock to ensure frame presentation times are midway between vsyncs. We may flip the presentation time by half a vsync initially to avoid slippage, but will gradually remove that (through the hdmi clock).

The numbers in the latency target just determine how quickly that syncs, but I wouldn't recommend changing it.

(me) So the control loop is basically

1. Phase error is calculated from ((next_VS_PTS + last_VS_PTS)/2 - framePTS) => 0 at VS/2, max near either VS
2. phase_error is filtered via the coefficients in the struct to yield the PLL update
3. PLL adjusts the video (and hence audio) clock to effectively achieve lock to the encoded video stream

Is there any way to monitor the lock status of the above? The latency struct values you are using are different from the recommended ones in the documentation. Have they been verified?

The only drawback to this scheme is it does nothing to actually control the latency through the decoder. It achieves rate control, but not buffer level control. The actual buffer levels could arrive at any random (steady) state. This is in fact what I observe as well.

A better scheme would be to use a combination of the phase error calc in #1 above and the PTS difference from OMXReader to the renderer output. That would achieve decoder=encoder clock locking, but also guarantee that the buffering (and hence latency) remains at optimal levels (which you could configure).

I believe changing to this would be a firmware change though.
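The control loop described in this comment can be sketched as below. This is only an illustration of the three steps and of the proposed buffer-level term, not the firmware's code: the first-order filter form, the `weight` parameter, and all function names are assumptions.

```python
def phase_error(last_vs_pts, next_vs_pts, frame_pts):
    """Step 1: offset of the frame PTS from the midpoint between two vsyncs.

    Zero when the frame lands midway, largest in magnitude near either vsync.
    """
    return (next_vs_pts + last_vs_pts) / 2 - frame_pts


def filtered_update(prev_update, error, coeff):
    """Step 2: low-pass filter the error (assumed first-order form)."""
    return prev_update + coeff * (error - prev_update)


def combined_error(phase_err, buffered_s, target_s, weight):
    """Proposed extension: add a weighted buffer-level term to the phase error.

    `weight` is a hypothetical tuning constant, not a documented setting.
    """
    return phase_err + weight * (buffered_s - target_s)
```

With only `phase_error` as input the loop can lock the rate while the buffer level settles anywhere; the extra term in `combined_error` is what would pull the buffer towards a configured target.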

popcornmix commented 10 years ago

I don't think the hdmi clock variation can fix both issues. If your vsync phase is drifting such that you need to decrease HDMI clock to keep your output frames and your gpu fifo is growing, requiring an increase in speed...

There is also another control in audio_render (OMX_IndexConfigAudioRenderingLatency) which I believe controls latency by resampling audio which again may be a solution (if the audio is not passthrough).

jmeiners commented 10 years ago

Adjusting the hdmi clock should be able to resolve both if the proper phase error term is given to the control loop input. This is how all real time systems are locked. The objective is to adjust the output clock to make it equal to the original input clock at the encoder. Once that is achieved, it is guaranteed that the fifo (buffering) in between will stay constant over time. Making sure that the output vsync is at a certain phase relationship is just an offset in the equation. Making sure that the fifo latency is constant is forced by using that as the error term.
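A toy simulation of the idea of using the fifo level as the error term. It assumes a proportional-only loop, a one second update interval, and an output clock that runs 50 ppm slow; the gain `kp` and the function name are hypothetical, and the 0.1% clamp is the figure quoted earlier in the thread.

```python
def simulate_buffer(output_drift_ppm, target_s, start_s, seconds,
                    kp=0.01, limit=0.001):
    """Return the buffered media (seconds) after `seconds` one-second steps.

    output_drift_ppm: error of the output clock relative to the encoder clock
                      (negative means the output runs slow, so the fifo grows).
    kp:               proportional gain (hypothetical); 0 disables the loop.
    limit:            maximum clock adjustment as a fraction (0.001 = 0.1%).
    """
    buffered = start_s
    for _ in range(seconds):
        error = buffered - target_s
        adjust = max(-limit, min(limit, kp * error))
        out_rate = 1.0 + output_drift_ppm * 1e-6 + adjust
        # One second of media arrives, out_rate seconds are played out.
        buffered = max(0.0, buffered + 1.0 - out_rate)
    return buffered


free_running = simulate_buffer(-50, 0.4, 0.5, 3600, kp=0.0)
controlled = simulate_buffer(-50, 0.4, 0.5, 3600)
print(f"no loop: {free_running:.3f} s, with loop: {controlled:.3f} s")
```

Without the loop the buffer keeps growing at the drift rate; with it, the level settles close to the target. A proportional-only loop leaves a small residual offset (drift divided by `kp`), which a real design would probably remove with an integral term.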

Do you have any contact with the OMX firmware developers on this? We probably need to work with them to understand what is implemented and get any changes implemented.

I would love to help on this and have implemented dozens of these systems (real time streamers) before.

BTW, the AudioRenderingLatency is just a feedback of how many samples are in queue inside the renderer (query only).

jmeiners commented 10 years ago

Another observation: IndexConfigLatencyTarget in the omx clock component is described as: Query / set the filter values used when tracking offset between media time and reference clock source by applying small changes to media time speed.

IndexConfigLatencyTarget in the video renderer clock component is described as: Query / set the filter values used when tracking phase offset between presentation and vsync by changing HDMI pixel output frequency.

IndexConfigLatencyTarget in the audio renderer clock component is described as: Query / set the filter values used when tracking audio rendering latency by changing the clock speed.

This sounds like 3 different algorithms that can be used??

perrypoint commented 10 years ago

If the pi is forced to 1080i output is it possible to detect which field is being output on the HDMI out and sync the AV clock to get the correct field order? The display device would do a much better job deinterlacing than the pi is capable of doing.

jmeiners commented 10 years ago

I think the only way that would work is if the pi never skips/repeats a frame. If we solve that problem then this is also solved.

The PTS on each field would never allow them to be shown out of order.

popcornmix commented 10 years ago

Firstly, omxplayer is designed for file/network playback, where large buffers are desirable to allow for variable I/O latency. Low latency playback has never been a design goal.

However I can consider this, so feel free to try to convince me of what the behaviour should be. I am able to modify the GPU code if the changes suggested are unobtrusive and fit in with the structure of the code.

Do you have any evidence of frames being skipped/repeated with -r option? I've spent quite some time with test streams like: http://www.avforums.com/forums/streamers-network-media-players/1175436-files-judder-test.html

and don't believe we skip/repeat frames when the file has good timestamps. It's not uncommon to see files with high jitter on their timestamps, or glitches (possibly from cutting) which can throw off the framerate locking code.

If you have example files where frame skips/repeats are observable, and the timestamps are good, then that is a bug that can be investigated.

Whilst I could imagine varying the hdmi clock to increase or reduce the amount of buffered data, it seems that with the small amounts we are allowed to vary the hdmi clock (0.1%) it will take some time to dispose of a substantial buffer (e.g. hundreds of milliseconds). It would obviously be better to start with the correct buffer size when the clock starts, and rely on framerate locking to keep that buffer controlled.
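To put a number on "some time": playing 0.1% fast consumes only 1 ms of extra buffer per second of wall time. A small worked example (the function name is just for illustration):

```python
def drain_time_s(excess_buffer_s, clock_offset):
    """Seconds needed to consume excess_buffer_s when playing clock_offset faster.

    clock_offset is a fraction, e.g. 0.001 for the 0.1% quoted above.
    """
    if clock_offset <= 0:
        raise ValueError("clock_offset must be positive")
    return excess_buffer_s / clock_offset


print(drain_time_s(0.3, 0.001))   # roughly 300 s for 300 ms of excess
print(drain_time_s(10.0, 0.001))  # roughly 10000 s (about 2.8 hours) for 10 s
```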

There is the --threshold option which determines the amount of time in the GPU fifo before the clocks are started. That starts at 200ms, but doubles each time we underrun. There is also an "OMX_PRE_ROLL" setting in the clock component which is set to 200ms. This means the clock effectively starts at -200ms which gives some time for data to propagate through omx components so it is available at the render components when time reaches 0.

So, I could imagine 400ms of latency in the GPU, but that can be reduced from the omxplayer code (at the risk of underrunning).
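One way to read the numbers above, as a sketch: the startup latency is taken to be the current threshold plus the pre-roll, with the threshold doubling on each underrun. Treating the two as simply additive is an assumption, and the names are illustrative.

```python
PRE_ROLL_S = 0.2  # the "OMX_PRE_ROLL" value quoted above


def startup_latency_s(threshold_s=0.2, underruns=0):
    """Estimated GPU-side latency after `underruns` threshold doublings."""
    return threshold_s * (2 ** underruns) + PRE_ROLL_S


print(startup_latency_s())             # the 400 ms case mentioned above
print(startup_latency_s(underruns=2))  # threshold has doubled twice
```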

popcornmix commented 10 years ago

To see some debug info on the hdmi frame locking, add "enable_hdmi_status=1" to config.txt and reboot (This costs some memory on GPU). Run "omxplayer -r" and press 'z'.

"hdmi match" shows how closely the frame timestamps and vsyncs coincide. When the match is below the green marker, the hdmi clock matching is disabled.
"hdmi period" shows the frame period calculated (blue bar) compared to vsync period (green marker).
"hdmi phase" shows the phase the frame is being presented (blue bar) compared to vsync. The PLL aims to keep it between 45% and 50% of the vsync interval.
"hdmi clock" shows the current hdmi clock frequency (blue) against the nominal frequency for that hdmi mode. The range is +/- 0.2%.

jmeiners commented 10 years ago

Thanks for the detailed feedback :)

I'll try to respond to all these points:

  1. I agree that low latency playback is only desired in certain situations (real time video on a local network). However, this is pretty common for IP cameras and such so should be considered an important use case.
  2. Having an algorithm that locks to a constant buffer is desirable for all cases (network streaming & local streaming). Not only does this lock the clocks/refresh rates, but it also ensures that you have optimal buffering to recover from packet loss. In network streaming, the target buffer size would be large. It would be smaller when low latency is needed (but this means packet loss/jitter should be less, i.e. LAN).
  3. My "ideal" algorithm would be (enabled on any LIVE stream):
      All buffering of video/audio is done via the GPU (no "queue" before omx).
      GPU starts output when its total pipe latency (input timestamp - audio out timestamp) = target buffer/latency.
      GPU increases/decreases the HDMI clock based on the real time error (current_latency - target_latency).
      If a major event causes the current_latency to jump (timestamp error/edit/buffer underflow/etc), the system will resynch to the proper buffer level and restart.
      When total_pipe_latency - target is below a threshold (locked condition), fine adjustment of the HDMI clock is done by the current VSYNC method.
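The "ideal" algorithm in point 3 reads like a small state machine. A minimal sketch, assuming three states and two hypothetical thresholds (`lock_band_s` for the locked condition, `jump_s` for a major event); none of these names or values come from omxplayer or the firmware.

```python
def next_state(state, latency_s, target_s, lock_band_s=0.02, jump_s=0.5):
    """Return the next state given the current total pipe latency.

    "filling": output stopped, buffering until the target latency is reached.
    "coarse":  output running, HDMI clock driven by (latency - target).
    "locked":  latency within the lock band, fine adjustment by vsync phase.
    """
    error = latency_s - target_s
    if state == "filling":
        return "coarse" if latency_s >= target_s else "filling"
    if abs(error) > jump_s:
        return "filling"  # major jump: resync to the buffer level and restart
    if abs(error) <= lock_band_s:
        return "locked"
    return "coarse"
```

For example, `next_state("locked", 1.2, 0.4)` returns `"filling"`, which models the resync after a timestamp jump or underflow.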

Now for the testing on current version:

Thanks for the status info. I don't believe I am seeing skip/repeat happening. However, with access to the GPU it should be easy to add that to the status output. Certainly better than trying to watch for it :) What I am seeing is that when watching a local live stream for a long time, the overall latency changes (at a slow constant rate). This shouldn't happen if the clocks are locked.

Watching the "z" output, I see:
   HDMI match toggles between all blue and all red
   HDMI period is generally on the green target, with slight noise in either direction
   HDMI phase:  The blue is always to the right of the green line, with variation in amount
   HDMI clock is generally on the green target, with slight noise in either direction (noise may have a slight right bias)

jmeiners commented 10 years ago

I have tried several different streams:
   RTSP mpeg4 camera
   mpeg2 hdhomerun cable channel
   h.264 local broadcast channel
The status outputs are different on all of them, but the phase is never constant nor near the green. Also, the hdmi_clock is inconsistent and jumpy. Without having the units or underlying algorithm, it is hard to say what that means. However, it looks unlocked to me....

popcornmix commented 10 years ago

If hdmi match is falling below the green marker, then the presentation timestamps were not coherent. We process blocks of 32 timestamps, and work out how correlated they are (i.e. if they fit the line timestamp(n) = alpha + beta * n then the match will be 100%). Can you dump the timestamps you are receiving? This is a little tricky as they may be reordered in decoder, but as a first check, just sort the pts values, and plot them and confirm there is no jitter.

Are you getting underrun? (i.e. a pause with audio or video fifo at zero?) that may mess with the match.
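For checking dumped timestamps offline, the line fit described in this comment can be approximated as below. The firmware's actual metric is not given in the thread; using R squared of a least-squares fit as the "match" percentage is an assumption, as is the function name.

```python
def match_percent(pts_block):
    """R^2 of a least-squares fit timestamp(n) = alpha + beta * n, in percent."""
    pts = sorted(pts_block)  # undo decoder reordering, as suggested above
    n = len(pts)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(pts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(pts))
    syy = sum((y - mean_y) ** 2 for y in pts)
    if syy == 0:
        return 0.0  # all timestamps equal: no usable frame period
    return 100.0 * (sxy * sxy) / (sxx * syy)


# 32 timestamps at a constant period (microseconds, about 30 fps)
clean = [1000 + 33333 * i for i in range(32)]
# Same block with a 0.5 s jump halfway through
jumped = clean[:16] + [p + 500000 for p in clean[16:]]
print(match_percent(clean), match_percent(jumped))
```

A clean block scores essentially 100, while the block with a jump scores noticeably lower.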

jmeiners commented 10 years ago

Are you using the DTS or PTS stamps (this is mpeg2 ts)?

Definitely no underrun on these.

popcornmix commented 10 years ago

I use the PTS if available.

jmeiners commented 10 years ago

Ok, I plotted them and they are a perfectly clean diagonal line except for a single jump at the start (which is caused by packet loss when the cpu maxes out setting up the pipe, but that is a different issue...)

The dts and pts values are identical in this stream (output from FFMPEG anyway). They are also exactly 30 fps assuming the TS values are microseconds. These are the packets fed to the video decoder (not the audio packets).

popcornmix commented 10 years ago

And the hdmi match is not consistently good? Can you play the same packets from a file? Is the hdmi match also not consistently good?

and can you confirm you are running with "-r" and the hdmi mode switches to 30 (or 60) Hz?

jmeiners commented 10 years ago

I'm using -s -r -g --threshold 0.01 udp://@192.168.1.128:5006?overrun_nonfatal=1

The monitor is running at 60hz

hdmi match is definitely toggling between good/bad. Is there anything in the omx settings that affects this?

I wasn't able to play that exact stream from file yet, but playing a DVD mkv I see similar results. Match is good 80% of the time and all red 20%.

popcornmix commented 10 years ago

I've just tried a 2 minute mkv file, and apart from a red blip at very start, it was solid almost all blue for the remainder. Can you try a few more files, and see if the toggling is file specific, or related to something else.

(also have you run rpi-update lately? There was an improvement to this code a few weeks back).

jmeiners commented 10 years ago

I thought my firmware was up to date, but I guess not. Just updated and it made a huge difference :)

Now the stats do show stability after about 1 min locking time.
   hdmi match = constant blue
   hdmi period = at the green line, rock solid
   hdmi phase = blue to the left of green but now stable. Is this the noffset setting (making it left of green)?
   hdmi clock = blue is way left of green but totally stable. What are the units here? Does the bar represent the complete PLL adjustment range?

I'm going to leave it playing and see if there is still any drift :)

popcornmix commented 10 years ago

"hdmi phase" aims for between 45%-50% (it's biased slightly early, as we may be a little late presenting when busy). If it is in that range, or moving towards that range, then the clock is left alone. Outside of that range and moving away, then the clock is adjusted. (noffset not involved).

"hdmi clock": the range is +/- 0.2% (I believe that is the HDMI spec requirement. I think we clamp it to 0.1% to play it safe).

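The "leave alone / adjust" rule for hdmi phase described above can be written as a small predicate. This is a sketch of that description only: phase wraparound is ignored, and the case of a phase that sits still outside the window is not covered by the description, so treating it as "adjust" is an assumption.

```python
def should_adjust_clock(phase, prev_phase, low=0.45, high=0.50):
    """Decide whether the hdmi clock should be adjusted.

    phase and prev_phase are fractions of the vsync interval (0.0 to 1.0).
    """
    if low <= phase <= high:
        return False  # inside the target window: leave the clock alone
    centre = (low + high) / 2
    moving_towards = abs(phase - centre) < abs(prev_phase - centre)
    # Outside the window: adjust unless the phase is already heading back.
    return not moving_towards
```

For example, `should_adjust_clock(0.60, 0.65)` is `False` (outside but approaching the window), while `should_adjust_clock(0.65, 0.60)` is `True`.
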
jmeiners commented 10 years ago

What is the hdmi clock range represented by the bar ("z" output)? I ask because it is nearly all the way to the right, so it makes me think it might just be maxed out?

It is still drifting, although it is more in bursts now than steadily. I also see the difference between the OMX_wall_clock and the OMX_media_time change sometimes. I would think this could only happen if the playspeed changed (which it isn't, of course). However, I now wonder if the "latency" control in the OMX_CLOCK component is causing this? Triggering a change in the media time speed based on some event? Do you know how that works?

doc from CLOCK component: Query / set the filter values used when tracking offset between media time and reference clock source by applying small changes to media time speed.

popcornmix commented 10 years ago

The media time will follow the timestamps of the audio samples being played. So, yes, media time can be non-linear if the timestamps don't match the samples:

         // reset our media clock with this reference stream.
         // If the difference between the reference time and the media time is greater than 50ms,
         // then just set the media time to the reference time.
         // Otherwise, we want to avoid unnecessary clock jitter, so we do a weighted moving
         // average of the difference between the reference time and the media time.
         // If this is greater than 8ms, we nudge the media time by this average.
         // For differences smaller than that, we're not going to change which frame display
         // we use at 60fps, so don't fiddle with the clock.

This behaviour can be changed with OMX_IndexConfigTimeActiveRefClock.
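The behaviour in the quoted code comment can be modelled roughly as follows. Only the 50 ms and 8 ms thresholds come from the comment; the averaging weight `alpha`, resetting the average after a snap or nudge, and the class itself are assumptions, and the model does not advance media time between reference updates.

```python
SNAP_US = 50_000   # above this difference, jump straight to the reference time
NUDGE_US = 8_000   # above this averaged difference, nudge the media time


class MediaClock:
    def __init__(self, alpha=0.25):
        self.media_time_us = 0.0
        self.avg_diff_us = 0.0
        self.alpha = alpha  # hypothetical weight of the moving average

    def on_reference(self, ref_time_us):
        """Apply one reference timestamp; return "snap", "nudge" or "hold"."""
        diff = ref_time_us - self.media_time_us
        if abs(diff) > SNAP_US:
            self.media_time_us = float(ref_time_us)
            self.avg_diff_us = 0.0
            return "snap"
        self.avg_diff_us += self.alpha * (diff - self.avg_diff_us)
        if abs(self.avg_diff_us) > NUDGE_US:
            self.media_time_us += self.avg_diff_us
            self.avg_diff_us = 0.0
            return "nudge"
        return "hold"
```

This also shows why dropped audio packets make media time jump: a gap larger than 50 ms in the reference timestamps takes the "snap" branch.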

jmeiners commented 10 years ago

So if our stream drops audio packets (it is udp..), the media time will jump to the next available packet instead of muting audio until the media time reaches the next valid packet.

If I don't use audio as reference, then do I get the latter behaviour?

popcornmix commented 10 years ago

I believe so.

jmeiners commented 10 years ago

Ok, I'll play with it some more.

What is the hdmi clock range represented by the bar ("z" output)? I ask because it is nearly all the way to the right, so it makes me think it might just be maxed out?

popcornmix commented 10 years ago

hdmi clock is only updated when it changes, so may be showing a stale value if it's not moving. All the way to the right would be nominal freq + 0.2% (but I've never seen more than about 0.05% off nominal).

jmeiners commented 10 years ago

Are you testing it using HDMI audio or local audio? I am using local, and wonder if the local output time base does not adjust like the hdmi....

popcornmix commented 10 years ago

The scheme we've been discussing is purely designed for HDMI framerate sync. The hdmi clock is changed, but that won't affect analogue audio.

There are other schemes in the code that may resample the audio to maintain audio timing, but I'm not so familiar with those.

I think getting the desired behaviour from HDMI audio first would be best, as I think analogue audio may involve an extra step.

jmeiners commented 10 years ago

Agreed.

BTW, does xbmc use a different OMX pipeline to get simultaneous HDMI & analogue (or another method)?

popcornmix commented 10 years ago

I've added a commit that adds a "--live" option which should be enabled for fixed latency playback (e.g. vod/TV). It monitors the amount of audio buffered by gpu, and adjusts the clock speed (which will resample audio) to maintain this. By default the latency is 700ms, but this can be changed with --threshold.

jmeiners commented 10 years ago

Nice work!!!

Any thoughts on using this in conjunction with the -r mode (HDMI clock sync)?

Looking at the code you have for the OMX clock:
   no change as long as buffer is within 10% of target
   +- 0.1% if within 50%
   else +- 1%

And we know that the HDMI clock sync will also adjust the output clock (when enabled) by up to +-1% max. Since in the live mode the audio is no longer the OMX source, will the OMX clock still be coupled to the HDMI clock?

If so, I think they should work together fine in steady state....
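The speed steps summarised in this comment could be expressed as below. This is a reading of the summary, not the committed code: the function name is illustrative, the 700 ms default target is the figure given for --live, and speeding up when the buffer is above target (slowing down when below) is an assumption about the sign.

```python
def live_speed(buffered_s, target_s=0.7):
    """Return a playback speed multiplier for the given buffered audio."""
    error = (buffered_s - target_s) / target_s
    if abs(error) <= 0.10:
        return 1.0                      # within 10% of target: no change
    step = 0.001 if abs(error) <= 0.50 else 0.01
    return 1.0 + step if error > 0 else 1.0 - step


for buffered in (0.70, 0.90, 1.50, 0.20):
    print(buffered, live_speed(buffered))
```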

popcornmix commented 10 years ago

HDMI clock syncing only varies by up to 0.2% (and typically it's more like 0.02% if the timestamps are sane). Audio is still the clock master (as in media time is derived from audio samples played). Resampling audio to change playout speed is an independent control to hdmi clock, so can compensate.

So, I think they should work fine together.

The only ugly bit is live + passthrough. We can't resample passthrough audio, so we drop/duplicate packets to maintain speed. I'm not sure how good that will sound.

But this is a first pass at solving this issue, please test and report what works and what doesn't.

cyryllo commented 10 years ago

When I add "--live", the image from the camera won't start :/
omxplayer --live --threshold 60 rtsp://..........

popcornmix commented 10 years ago

I don't expect 60ms threshold to work. Without the threshold parameter what happens?

Does this work: omxplayer --live rtsp://dw.edge.wowza.gl-systemhaus.de/liveedge/dw_w_tv-europa_l

Can you run (with your rtsp url) with -g and post omxplayer.log to pastebin.

cyryllo commented 10 years ago

I thought that the value of --threshold is in seconds, not milliseconds. http://pastebin.com/YLPVXbH7

popcornmix commented 10 years ago

Oh, your stream has no audio. That is not something I've tested. If someone has a sample rtsp url to a video only stream I can do some testing.

cyryllo commented 10 years ago

Unfortunately I do not have public access to the camera by rtsp. I can only give you a link rtmp://kamera.task.gda.pl/rtplive/k002.sdp

popcornmix commented 10 years ago

I've committed a plausible fix for video only streams (tested by ignoring the audio stream from existing video+audio stream). Can you test?

cyryllo commented 10 years ago

It works :) In testing now, the stream is not interrupted and there are no artifacts. Thank you for your help.

cyryllo commented 10 years ago

I found one more problem. When using '--threshold' the image also does not start.
omxplayer -g --live --threshold 1000 rtsp://......./axis-media/media.amp
http://pastebin.com/1zSBpQ0T

Using just --live works very well. So far I have not noticed problems.

radojko commented 10 years ago

It seems the problem with the stream stopping still exists: after about 1 hour the stream stops. Here is a stream url for testing: rtsp://dw.edge.wowza.gl-systemhaus.de/liveedge/dw_w_tv-europa_l Latency/sync during packet drops is improved; this was tested in congested network conditions with a lot of frame drops. I didn't test with the threshold option.

popcornmix commented 10 years ago

@radojko I've been using that rtsp stream for testing, and it has not stopped after an hour for me (although I get no packet loss). Does it also stop when network conditions are good? I wonder if the errors are causing the socket to be closed, which is something we can't help. Can you run with "-g" and pastebin omxplayer.log next time it fails.

radojko commented 10 years ago

The stream stopped in "normal" conditions after one hour several times, with both this public stream and my own on the LAN. Next time I'll enable omxplayer logging. Btw, it would be good to add an auto-reconnect option to omxplayer: currently, after a network disconnection and reconnection the stream does not come back on screen, and there is no error on the console that an external script could catch to reconnect when the network fails (the reason may be the high default OS TCP timeouts; this should be checked).
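Until something like that exists in the player, an external wrapper is one workaround. A minimal sketch, assuming the player either exits on failure or can simply be restarted after a maximum runtime; the function and its parameters are hypothetical, and it only stops the process it launched directly.

```python
import subprocess
import time


def run_with_restart(cmd, runs=3, max_runtime_s=None, delay_s=5.0):
    """Run cmd `runs` times in a row, restarting it whenever it ends.

    max_runtime_s: if set, the process is killed after this many seconds,
                   which covers a player that stalls without exiting.
    Returns one entry per run: the exit code, or None if the run timed out.
    """
    results = []
    for i in range(runs):
        try:
            results.append(subprocess.run(cmd, timeout=max_runtime_s).returncode)
        except subprocess.TimeoutExpired:
            results.append(None)  # subprocess.run has already killed the child
        if i + 1 < runs:
            time.sleep(delay_s)
    return results


# Example (not executed here): restart playback once a day.
# run_with_restart(["omxplayer", "--live", "rtsp://camera.example/stream"],
#                  runs=7, max_runtime_s=24 * 3600)
```

The URL in the commented example is a placeholder. Since no error is reported when the network drops, the `max_runtime_s` timeout is the part that actually helps in that case.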

radojko commented 10 years ago

After OS reinstallation everything works properly, no more problems with stream stopping after 3 hours with internet stream.

popcornmix commented 10 years ago

@perrypoint Have you tested this? Okay to close?

perrypoint commented 10 years ago

yeah, if I see it again I'll let you know!

cyryllo commented 10 years ago

After about 2 days the artifacts appear again. I have yet to see after how many hours the problem comes back. I guess I'm left with restarting once a day.