Feature Request: Video and Audio synchronized

alatteri commented 1 year ago

Hello,

Getting reliable V/A synchronization with UG can be a challenge; in fact, it can drift from minute to minute. It takes a lot of trial and error with the delay settings, and when you think you have found the right value on one run, the next run is out of sync again. Sometimes the offsets are small and hard for a normal person to detect; other times they are quite large. I believe this stems from the fact that in UG, V and A are completely separate pipelines and are even transmitted separately. So the request is to add some mechanism/timestamp for V & A so that, even if they are captured and transmitted separately, the receiver will inherently synchronize them, or alternatively to replace the underlying transport with something like MPEG-TS, which packetizes V and A together.

Let's discuss.

Thanks, Alan

TheSashmo commented 1 year ago

I agree Alan, that's why TS is already preferred when doing something like this. Among other things, this is one of the reasons TS exists. That being said, I am sure there is a way to sync the audio and video (yes, at the mercy of latency) without having to make a TS for it. What I'd like to know is why some codecs are quite stable and others are not. I can understand encoding time and latency, but OPUS, for example, covers a wide range of frequencies with acceptable latency, yet it's not consistent. Ref: https://capture.dropbox.com/Jb2vzPUIXeYaHc62 PCM, which is the raw audio out of the SDI, is logically the closest since it's not being compressed additionally, but as you can see drift fix has to be on to keep it accurate (it's also 8 Mbps worth of audio in my test setup). On the other hand, A-Law (not wideband, so it can't be used for my application) is spot on with drift fix off.

I am fine with sticking with one audio codec; I couldn't care less which one it is, as long as it's wideband, synced and stable. My concern is that it's not stable even on localhost. I can live with variances once it goes out to the network, but I can't even get a stable encode/decode on a local box. I am totally fine with manually adjusting at the decoder side if I have to, but as I mentioned in my other post #219, I would be happy if I could at least control it with some sort of reliability, and I can't even do that.

I can make a test setup available if that is helpful. The lip sync analyzers are not cheap ($20k each), so I could help by giving remote access to one.

MartinPulec commented 1 year ago

We've briefly discussed this request and maybe we can partially implement it. If I understood correctly, what interests both of you is DeckLink->DeckLink transmission (having BMD on both sides), right?

Since there is still the scheduled playback mode (no-low-latency), it can be used for that. However, it will almost certainly be incompatible with drift_fix. It depends on whether the receiving clock is slower, in which case it won't be much of a problem, except that the latency will continually increase. In the opposite case, it will be worse. Depending on the use case, setting bmdDeckLinkConfigClockTimingAdjustment might be a better option anyway, provided it is possible (the device supports it and the output doesn't need to be ref-locked).
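To put rough numbers on the clock-drift argument above, here is a minimal Python sketch. It assumes a constant frequency offset between the sender and receiver clocks, expressed in PPM; the function names are made up for illustration and are not part of UltraGrid.

```python
def drift_ms(ppm: float, elapsed_s: float) -> float:
    """Offset in milliseconds accumulated between two clocks differing by `ppm`."""
    return ppm * 1e-6 * elapsed_s * 1000.0


def seconds_until_offset(ppm: float, offset_ms: float) -> float:
    """Time in seconds until the accumulated offset reaches `offset_ms`."""
    if ppm == 0:
        return float("inf")
    return offset_ms / (abs(ppm) * 1e-6 * 1000.0)


if __name__ == "__main__":
    fps = 24.0
    frame_ms = 1000.0 / fps  # duration of one video frame
    for ppm in (10, 50, 100):
        per_hour = drift_ms(ppm, 3600)
        minutes = seconds_until_offset(ppm, frame_ms) / 60
        print(f"{ppm:>4} PPM: {per_hour:.0f} ms per hour, "
              f"one frame of offset after {minutes:.1f} min")
```

With a slower receiving clock this offset shows up as growing latency (frames pile up); with a faster one the buffer eventually runs dry, which matches the "it will be worse" case.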

alatteri commented 1 year ago

Hi Martin,

I use both Decklink->Decklink and Decklink->Vulkan_sdl

MartinPulec commented 1 year ago

Sure, but I am afraid there is not much that can be done with SW displays: audio and video frames can be presented at (approximately) the same time, which is how it is done now.

MartinPulec commented 1 year ago

Hi, I've recently merged the synchronous output for DeckLink, it is documented here. I've also removed the older no-low-latency mode which has evolved to this mode.

There are some limitations mentioned in the linked wiki, namely that it will only work decklink->decklink (or with testcard vidcap as a source). It would also be feasible to implement for other devices if requested. I am also not sure how it will work in case of clock drift between DeckLinks (nor do I know how it will play with :drift_fix, but it can be tested, it isn't forbidden).

Feel free to test; I've only tested a very basic setup. Problems may appear, since it is a bit tricky for me to build a more real-world setup.

alatteri commented 1 year ago

Hi Martin... can you please clarify what[=p[,b]] does?

alatteri commented 1 year ago

Would it be possible to implement for vulkan display/alsa audio? That suffers from sync issues too.

It would be feasible to implement also for other devices if requested.

MartinPulec commented 1 year ago

It is basically the number of buffered frames: Preroll and Buffer size (the number of frames that get stored before dropping). It is documented in -t decklink:fullhelp.
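As a back-of-the-envelope aid for choosing these values, the sketch below converts a number of buffered frames into added delay. It assumes the delay contributed by buffering is simply the frame count times the frame duration; the actual defaults and any extra hardware latency are not covered, and `buffer_latency_ms` is a hypothetical helper, not an UltraGrid function.

```python
def buffer_latency_ms(frames: int, fps: float) -> float:
    """Delay in milliseconds represented by `frames` buffered video frames."""
    return frames * 1000.0 / fps


if __name__ == "__main__":
    # Compare a few frame rates; 24 fps is the mode used later in this thread.
    for fps in (24.0, 25.0, 30.0, 60.0):
        row = ", ".join(
            f"{n} fr = {buffer_latency_ms(n, fps):.1f} ms" for n in (1, 2, 3, 5)
        )
        print(f"{fps:g} fps: {row}")
```

The same helper gives the "one video frame time" threshold for a given mode, e.g. `buffer_latency_ms(1, fps)`.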

MartinPulec commented 1 year ago

Would it be possible to implement for vulkan display/alsa audio? That suffers from sync issues too.

Well, ALSA and Vulkan cannot be explicitly synchronized. They can only be played at approximately the same time, which is how it is done so far. Can you be more specific about how much desync you are experiencing now, in milliseconds? If it is within one video frame time, I'd say that it is OK. If more, the question is rather why, e.g. whether frame-level video compression/decompression is being used.

It would be feasible to implement also for other devices if requested.

I meant rather displays (or capturers) like AJA and similar, where audio is bound to video and I can tell the device when to play the audio/video.

TheSashmo commented 1 year ago

Thanks Martin I will test it.

Wow..... 100ms, that's a lot! Is there no other way to keep that lower?

alatteri commented 1 year ago

If this is not compatible with drift_fix, does it mean it will suffer from audio hits/drops, or does the way this works inherently prevent that?

MartinPulec commented 1 year ago

Wow..... 100ms, that's a lot! Is there no other way to keep that lower?

I don't know the exact amount, I don't have the equipment to test the latency easily. So it can be lower or higher. You can also fiddle with the buffer parameters, but if you restrict them too much, it can be at the expense of stability.

does it mean it will suffer from audio hits/drops, or does the way this works inherently prevent that?

I cannot say, you'll need to test. It needs to be said that overflows/underflows aren't inherently caused by UltraGrid but by drifting clocks (UG just doesn't care). I also cannot fully test this because I cannot reasonably reproduce the drift (well, the 8K Pro AFAIK supports clock adjustment, which could help with testing, but I haven't tried it yet). It is also possible that drift_fix would work normally, but I cannot guarantee that, and I wouldn't recommend it without knowing that it really helps (which is true for drift_fix in general, otherwise it would be enabled by default).

TheSashmo commented 1 year ago

What's the suggestion to prevent the clocks from drifting, a reference?

MartinPulec commented 1 year ago

What's the suggestion to prevent the clocks from drifting, a reference?

In my opinion you have 3 options if it is a problem:

1. do nothing - although it seems strange, depending on the use case it may not be a problem; it is hardly audible in speech; it can be a problem for music, however
2. 8K Pro (and maybe also some other newer models) can have the clock adjusted by the BMDDeckLinkSupportsClockTimingAdjustment configuration option within -127 to 127 PPM; the appropriate option for the DeckLink display would be :ctad=<val>; you'd need to know the value anyway, since UltraGrid doesn't auto-adjust it (but in theory it could be determined from some UG run output)
3. drift_fix - if it works, it is OK to use; but as I've already mentioned, I cannot guarantee that it will work together with sync playback
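On the idea that the adjustment value could in theory be derived from UG output: the following is only a rough Python sketch of that, not something UltraGrid does. It assumes the periodic "N frames in T seconds" report lines are the input, that the nominal frame rate is known, and that the host clock is a good enough reference; the helper names are hypothetical, and the sign convention expected by :ctad is not established here.

```python
import re

# Matches the periodic report, e.g.
# "[Decklink display] 121 frames in 5.03744 seconds = 24.0201 FPS"
REPORT_RE = re.compile(r"(\d+) frames in (\d+(?:\.\d+)?) seconds")


def estimate_ppm(lines, nominal_fps: float) -> float:
    """Estimate the deviation from `nominal_fps` in PPM over all report lines."""
    frames = 0
    seconds = 0.0
    for line in lines:
        m = REPORT_RE.search(line)
        if m:
            frames += int(m.group(1))
            seconds += float(m.group(2))
    if seconds == 0.0:
        raise ValueError("no report lines found")
    return (frames / seconds / nominal_fps - 1.0) * 1e6


def fits_ctad_range(ppm: float) -> bool:
    """Check the estimate against the -127..127 PPM range mentioned above."""
    return -127.0 <= ppm <= 127.0


if __name__ == "__main__":
    sample = [
        "[Decklink display] 121 frames in 5.03744 seconds = 24.0201 FPS",
        "[Decklink display] 121 frames in 5.03436 seconds = 24.0348 FPS",
        "[Decklink display] 120 frames in 5.00354 seconds = 23.983 FPS",
    ]
    ppm = estimate_ppm(sample, 24.0)
    print(f"estimated offset: {ppm:.0f} PPM, within ctad range: {fits_ctad_range(ppm)}")
```

Each report covers only about five seconds and a whole number of frames, so an estimate from a handful of lines is dominated by measurement noise; a long run would be needed before the figure means anything.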

MartinPulec commented 1 year ago

FYI, I've noticed a drawback of the synchronized mode: it is currently incompatible with any audio compression (I've already updated the wiki page accordingly). It also needs --param incompatible on the sender now (well, it could have been enabled for PCM and disabled for compressed audio, but I dislike doing it this way implicitly).

alatteri commented 1 year ago

Oh.. that does sound bad. Sending PCM is not ideal, but then neither is de-synchronization.

MartinPulec commented 1 year ago

Oh.. that does sound bad. Sending PCM is not ideal, but then neither is de-synchronization.

I'll try to fix it, although it is not completely trivial, I have an idea how to do that. A disadvantage is that it will break compatibility with older decoders, so that it will need to be enabled explicitly for some transition time.

TheSashmo commented 1 year ago

I can provide you a remote test environment to validate the accuracy. I have all the tools needed for up to 1080p at the moment. And can upgrade later for 4K.

MartinPulec commented 1 year ago

I'll try to fix it

Should be fixed now. As already noted, the sender needs --param incompatible because the sending properties needed to be slightly changed, and UG versions prior to that won't successfully decode audio encoded with Opus (the other codecs should work though).

alatteri commented 1 year ago

Great. I'll try to test this within the next few weeks. Thanks!

alatteri commented 1 year ago

I finally had time to test this, sorry for the long delay. I don't see that this actually provides benefit. Using a test 2-pop clip, I see the audio drifting, even just looping the 10 second clip a few times. In addition, even when excluding "synchronized", driftfix no longer seems to work. I also rolled back to a2d89135 on the receiver and still seeing overflow issues, so I wonder if Decklink 12.7 might have some regressions? This is all strange, as now Decklink on UG is nearly unusable.

UltraGrid 1.8+ (master rev 87e0c61 built Oct 2 2023 07:24:04)

encoder: uv -m 1316 -t decklink:codec=R12L --audio-filter delay:1:frames -c libavcodec:encoder=libx265:crf=22 -s embedded --audio-capture-format channels=8 --audio-codec=AAC:bitrate=256K --param incompatible

receiver: uv -d decklink synchronized --audio-delay -160 -r analog --audio-channel-map 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 --audio-scale none -P 5004 --param use-hw-accel,resampler=soxr,decoder-use-codec=R12L

I tried to figure out the syntax for synchronized[=opts] options, but I couldn't. I get a lot of Missing Frame errors, and without driftfix, the audio clicks. So unfortunately, this is actually kind of bad both ways.

[2023-10-03 11:36:15] SSRC 0xc4f21da6: 512/512 packets received (100.0000%), 0 lost, max loss 0
[2023-10-03 11:36:16] SSRC 0xbdf6d2ae: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:17] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] audio buffer underflow!
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] audio buffer underflow!
[2023-10-03 11:36:18] [lavc aac @ 0x7fe928552100] channel element 3.15 is not allocated
[2023-10-03 11:36:18] [lavcd aud.] error receiving decoded frame -Error while decoding frame (rc == -1094995529): Invalid data found when processing input.
[2023-10-03 11:36:18] [Audio decompress] 2 empty channel(s) returned!
[2023-10-03 11:36:18] [Decklink display] audio buffer underflow!
[2023-10-03 11:36:18] [lavc aac @ 0x7fe928033b80] Sample rate index in program config element does not match the sample rate index configured by the container.
[2023-10-03 11:36:18] [lavc aac @ 0x7fe928033b80] decode_pce: Input buffer exhausted before END element found
[2023-10-03 11:36:18] [lavcd aud.] error sending decoded frame -Error while decoding frame (rc == -1): Operation not permitted.
[2023-10-03 11:36:18] [Audio decompress] 3 empty channel(s) returned!
[2023-10-03 11:36:18] [lavc aac @ 0x7fe92856b040] skip_data_stream_element: Input buffer exhausted before END element found
[2023-10-03 11:36:18] [lavcd aud.] error sending decoded frame -Error while decoding frame (rc == -1094995529): Invalid data found when processing input.
[2023-10-03 11:36:18] [Audio decompress] 2 empty channel(s) returned!
[2023-10-03 11:36:18] [Decklink display] audio buffer underflow!
[2023-10-03 11:36:18] [Audio decompress] 2 empty channel(s) returned!
[2023-10-03 11:36:18] [lavc aac @ 0x7fe9280c71c0] skip_data_stream_element: Input buffer exhausted before END element found
[2023-10-03 11:36:18] [lavcd aud.] error sending decoded frame -Error while decoding frame (rc == -1094995529): Invalid data found when processing input.
[2023-10-03 11:36:18] [Audio decompress] 5 empty channel(s) returned!
[2023-10-03 11:36:18] [Decklink display] audio buffer underflow!
[2023-10-03 11:36:18] [Audio decompress] 7 empty channel(s) returned!
[2023-10-03 11:36:18] [Audio decompress] 1 empty channel(s) returned!
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] Missing frame
[2023-10-03 11:36:18] [Decklink display] 110 frames in 5.53713 seconds = 19.8659 FPS
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 5
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 6
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 7
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 5
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 6
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 7
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 8
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 9
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 10
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 11
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 12
[2023-10-03 11:36:18] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 5
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 6
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 7
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 8
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 9
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 10
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 11
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 12
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 13
[2023-10-03 11:36:18] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:18] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:18] [Decklink display] Dismissed frame, buffered: 5
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:19] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Audio decoder] Received 29876/29888 B (12 lost), decoded 212992 samples in 5.04 sec.
[2023-10-03 11:36:20] [Audio decoder] Volume: [0] -40.51/-19.86, [1] -40.51/-19.86, [2] -40.51/-19.86, [3] -40.51/-19.86, [4] -40.51/-19.86, [5] -40.51/-19.86, [6] -40.51/-19.86, [7] -40.51/-19.86 dBFS RMS/peak
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:20] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:21] SSRC 0xc4f21da6: 512/512 packets received (100.0000%), 0 lost, max loss 0
[2023-10-03 11:36:21] SSRC 0xbdf6d2ae: 1823/1920 packets received (94.9479%), 97 lost, max loss 26
[2023-10-03 11:36:21] Video dec stats (cumulative): 10418 total / 10383 disp / 23 drop / 0 corr / 12 miss
[2023-10-03 11:36:21] Audio dec stats (cumulative): 10386 played / 10436 total audio frames
[2023-10-03 11:36:21] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:21] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:21] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:21] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:22] [Decklink display] audio buffer overflow!
[2023-10-03 11:36:23] [Decklink display] Missing frame
[2023-10-03 11:36:23] [Decklink display] Dismissed frame, buffered: 5
[2023-10-03 11:36:23] [Decklink display] 143 frames in 5.00915 seconds = 28.5478 FPS
[2023-10-03 11:36:25] [Audio decoder] Received 18504/18504 B, decoded 241664 samples in 5.04 sec.
[2023-10-03 11:36:25] [Audio decoder] Volume: [0] -44.07/-19.95, [1] -44.07/-19.95, [2] -44.07/-19.95, [3] -44.07/-19.95, [4] -44.07/-19.95, [5] -44.07/-19.95, [6] -44.07/-19.95, [7] -44.07/-19.95 dBFS RMS/peak
[2023-10-03 11:36:26] SSRC 0xc4f21da6: 640/640 packets received (100.0000%), 0 lost, max loss 0

encoder: uv -m 1316 -t decklink:codec=R12L --audio-filter delay:1:frames -c libavcodec:encoder=libx265:crf=22 -s embedded --audio-capture-format channels=8 --audio-codec=AAC:bitrate=256K --param use-hw-accel

receiver: uv -d decklink:drift_fix --audio-delay -160 -r analog --audio-channel-map 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 --audio-scale none -P 5004 --param use-hw-accel,resampler=soxr,decoder-use-codec=R12L

[2023-10-03 12:05:23] [Audio decoder] Volume: [0] -21.99/-7.29, [1] -22.16/-7.53, [2] -29.50/-11.58, [3] -32.87/-18.19, [4] -29.54/-15.43, [5] -28.64/-15.04, [6] -22.97/-6.98, [7] -22.59/-6.41 dBFS RMS/peak
[2023-10-03 12:05:23] [Decklink display] 121 frames in 5.02166 seconds = 24.0956 FPS
[2023-10-03 12:05:24] SSRC 0x5fe6fab6: 2176/2176 packets received (100.0000%), 0 lost, max loss 0
[2023-10-03 12:05:26] [Decklink display] audio buffer underflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer underflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] [Decklink display] audio buffer overflow!
[2023-10-03 12:05:26] SSRC 0x84c147c8: 1920/1920 packets received (100.0000%), 0 lost, max loss 0
[2023-10-03 12:05:28] [Audio decoder] Received 737949/737949 B, decoded 233326 samples in 5.03 sec.
[2023-10-03 12:05:28] [Audio decoder] Volume: [0] -24.22/-8.98, [1] -24.27/-8.84, [2] -25.02/-7.40, [3

MartinPulec commented 1 year ago

Hi Alan,

I also rolled back to a2d8913 on the receiver and still seeing overflow issues, so I wonder if Decklink 12.7 might have some regressions? This is all strange, as now Decklink on UG is nearly unusable.

Do you have any steps to reproduce? I've just tried with a 4K Extreme, BMD 12.7 on U22.04, using the following command:

UltraGrid-continuous-x86_64.AppImage -s embedded -t testcard -r embedded -d decklink

and it just works as before. If needed, please create a separate issue for that.

encoder: uv -m 1316 -t decklink:codec=R12L --audio-filter delay:1:frames -c libavcodec:encoder=libx265:crf=22 -s embedded --audio-capture-format channels=8 --audio-codec=AAC:bitrate=256K --param incompatible

receiver: uv -d decklink synchronized --audio-delay -160 -r analog --audio-channel-map 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 --audio-scale none -P 5004 --param use-hw-accel,resampler=soxr,decoder-use-codec=R12L

I wouldn't expect this to work. What would you expect from --audio-delay -160 together with synchronized audio? Also, --audio-filter delay:1:frames doesn't make any sense to me. I don't know if the filter even passes the timestamp; possibly not, in which case this option alone prevents it from working (if the stream is not properly timestamped, the receiver doesn't have anything to synchronize to). --param resampler=soxr is needless, but if a resampler were used, speex is better because it doesn't add a 20 ms delay. I also don't see any advantage in --audio-channel-map 0:0,1:1,2:2,3:3,4:4,5:5,6:6,7:7 --audio-scale none.

Please start with some minimal working example, something like:

uv -t decklink -c lavc:enc=libx265 -s embedded --param incompatible
uv -d decklink:synchronized -r analog

(adding other options later only if really necessary). I've just tried:

UltraGrid-continuous-x86_64.AppImage -c lavc:enc=libx264 -s embedded -t testcard -r embedded -d decklink:sync --param incompatible

and it doesn't seem to produce errors; using -t decklink with a Hi59 signal also seems to work.

alatteri commented 1 year ago

Hi Martin,

In my previous post, the audio delays were an oversight, left over from when I had to try to sync things manually.

I've found that on encoder, delay:N:frames should match "frame threads=N".
I use soxr because in past testing, speex created clicking as it goes thru variable sample rates.
see here: https://github.com/CESNET/UltraGrid/pull/225#issuecomment-1292205857
Regarding --audio-channel-map, sometimes I have to re-wire the channels to put 5.1 in the proper place. Or sometimes take stereo channels that are in 7/8 and put them into 1/2 for scenarios where there is only stereo speakers. In this instance, you're right, it does nothing.
Regarding --audio-scale none, if I call channel-map, I believe I had to tell audio-scale none or else it was changing the levels, or something, I don't remember exactly, but it was not right.

Anyway, below I've done a very minimal capture and display. And while there are fewer errors than before, there are still distracting video frame holds or audio drops at the Decklink Missing/Dismissed frames.

both encoder/receiver: Intel NUC12 with UltraStudio 4K Mini, Ubuntu 22 with HWE kernel 6.2

encoder:

 uv -m 1316 -t decklink -c libavcodec:encoder=libx265 -s embedded --param incompatible -P 5004
UltraGrid 1.8+ (tags/continuous rev cb269cf built Oct  6 2023 11:18:50)
[Decklink capture] Format change detected (display mode, color space - RGB444, 10bit).
[Decklink capture] Detected 10-bit signal, use ":codec=UYVY" to enforce 8-bit capture (old behavior).
[Decklink capture] Using codec: R10k
[DeckLink capture] Using limited range R10k as specified by BMD, use '--param bmd-r10k-full-range' to override.
[DeckLink capture] Enable video input: 1080p24
[lavc] Selected pixfmt: gbrp10le
[lavc] Selected pixfmt has subsampling 4:4:4, which is usually not supported by hw. decoders
[lavc] Use ':subs=420' or ':safe' to disable.
[lavc] Selected color depth 10 b may not be supported by HW decoders.
[lavc] Use ':depth=8' or ':safe' to disable.
[DeckLink capture] 117 frames in 5.01499 seconds = 23.33 FPS
[Audio sender] Sent 235920 samples in last 5.042061 seconds.
[Audio sender] Volume: [0] -20.94/-5.61 dBFS RMS/peak
[DeckLink capture] 120 frames in 5.0001 seconds = 23.9995 FPS
[Audio sender] Sent 242000 samples in last 5.041412 seconds.
[Audio sender] Volume: [0] -19.83/-6.19 dBFS RMS/peak
[DeckLink capture] 121 frames in 5.0414 seconds = 24.0013 FPS
[Audio sender] Sent 242000 samples in last 5.041240 seconds.
[Audio sender] Volume: [0] -19.18/-4.44 dBFS RMS/peak
[DeckLink capture] 121 frames in 5.04192 seconds = 23.9988 FPS
[Audio sender] Sent 240000 samples in last 5.000121 seconds.
[Audio sender] Volume: [0] -17.04/-1.21 dBFS RMS/peak
[DeckLink capture] 121 frames in 5.04159 seconds = 24.0004 FPS
[Audio sender] Sent 242000 samples in last 5.041113 seconds.
[Audio sender] Volume: [0] -18.11/-1.80 dBFS RMS/peak

receiver:

uv -d decklink:synchronized -r analog -P 5004
UltraGrid 1.8+ (tags/continuous rev cb269cf built Oct  6 2023 11:35:20)
[video dec.] Detected compression properties: RGB 4:4:4 10 bit
[Decklink display] Using limited range R10k as specified by BMD, use '--param bmd-r10k-full-range' to override.
[Decklink display] Selected mode: 1080p24
[Decklink display] bmdDeckLinkConfig444SDIVideoOutput set to: true
[Decklink display] bmdDeckLinkConfigSDIOutputLinkConfiguration set to: bmdLinkConfigurationSingleLink
[display] Successfully reconfigured display to 1920x1080 @24.00p, codec R10k
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
    Last message repeated 1 times
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] 121 frames in 5.03744 seconds = 24.0201 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -16.17/-1.21 dBFS RMS/peak
SSRC 0x8ae66695: 5504/5504 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 121 frames in 5.03436 seconds = 24.0348 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -19.50/-3.80 dBFS RMS/peak
SSRC 0x8ae66695: 4864/4864 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 120 frames in 5.00354 seconds = 23.983 FPS
[Audio decoder] Received 960000/960000 B, decoded 240000 samples in 5.00 sec.
[Audio decoder] Volume: [0] -19.61/-3.43 dBFS RMS/peak
SSRC 0x8ae66695: 5120/5120 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] 121 frames in 5.04142 seconds = 24.0012 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -19.20/-2.26 dBFS RMS/peak
SSRC 0x8ae66695: 6144/6144 packets received (100.0000%), 0 lost, max loss 0
Video dec stats (cumulative): 743 total / 716 disp / 27 drop / 0 corr / 0 miss
Audio dec stats (cumulative): 744 played / 744 total audio frames
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 121 frames in 5.03243 seconds = 24.0441 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -15.30/-1.34 dBFS RMS/peak
SSRC 0x8ae66695: 6144/6144 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 120 frames in 5.00371 seconds = 23.9822 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -23.13/-9.52 dBFS RMS/peak
SSRC 0x8ae66695: 3712/3712 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] 121 frames in 5.03776 seconds = 24.0186 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -29.88/-10.58 dBFS RMS/peak
SSRC 0x8ae66695: 3072/3072 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 120 frames in 5.00292 seconds = 23.986 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -inf/-inf dBFS RMS/peak
SSRC 0x8ae66695: 2944/2944 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 120 frames in 5.01621 seconds = 23.9224 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -28.02/-11.03 dBFS RMS/peak
SSRC 0x8ae66695: 4480/4480 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] 121 frames in 5.03538 seconds = 24.03 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -24.31/-10.03 dBFS RMS/peak
SSRC 0x8ae66695: 4864/4864 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
Video dec stats (cumulative): 1488 total / 1461 disp / 27 drop / 0 corr / 0 miss
Audio dec stats (cumulative): 1489 played / 1489 total audio frames
[Decklink display] 121 frames in 5.04033 seconds = 24.0064 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -24.15/-6.81 dBFS RMS/peak
SSRC 0x8ae66695: 4864/4864 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 121 frames in 5.03845 seconds = 24.0153 FPS
[Audio decoder] Received 960000/960000 B, decoded 240000 samples in 5.00 sec.
[Audio decoder] Volume: [0] -22.67/-5.28 dBFS RMS/peak
SSRC 0x8ae66695: 8320/8320 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] 121 frames in 5.03034 seconds = 24.054 FPS
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -20.83/-1.85 dBFS RMS/peak
SSRC 0x8ae66695: 3584/3584 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 120 frames in 5.01473 seconds = 23.9295 FPS
[Audio decoder] Received 960000/960000 B, decoded 240000 samples in 5.00 sec.
[Audio decoder] Volume: [0] -21.24/-7.90 dBFS RMS/peak
SSRC 0x8ae66695: 3968/3968 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 768/768 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] 121 frames in 5.03546 seconds = 24.0296 FPS
[Audio decoder] Received 960000/960000 B, decoded 240000 samples in 5.00 sec.
[Audio decoder] Volume: [0] -25.11/-5.85 dBFS RMS/peak
SSRC 0x8ae66695: 4352/4352 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
[Decklink display] Missing frame
    Last message repeated 1 times
[Decklink display] 117 frames in 5.01317 seconds = 23.3385 FPS
[Decklink display] Dismissed frame, buffered: 5
[Decklink display] Dismissed frame, buffered: 6
[Audio decoder] Received 968000/968000 B, decoded 242000 samples in 5.04 sec.
[Audio decoder] Volume: [0] -25.84/-8.37 dBFS RMS/peak
SSRC 0x8ae66695: 6912/6912 packets received (100.0000%), 0 lost, max loss 0
SSRC 0x938db2ae: 896/896 packets received (100.0000%), 0 lost, max loss 0
Video dec stats (cumulative): 2233 total / 2206 disp / 27 drop / 0 corr / 0 miss
Audio dec stats (cumulative): 2234 played / 2234 total audio frames

using --audio-codec=AAC:bitrate=256K with synchronized definitely makes things even worse with a constant clicking noise. I thought at one point I saw a commit that fixed compressed audio with synchro.
https://github.com/CESNET/UltraGrid/issues/326#issuecomment-1660102837

Using PCM (not explicitly defining a codec) increases the data load by about 12 Mb/s for 8-channel SDI. The constant clicking is gone, but I still get the frame drops shown above.
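
As a rough sanity check on that figure, here is a minimal sketch of the raw PCM payload rate for 8 channels. The 48 kHz sample rate and the bit depths tried are assumptions (the post does not state them), and packet overhead is ignored.

```python
def pcm_bitrate_mbps(channels: int, sample_rate_hz: int, bits_per_sample: int) -> float:
    """Raw PCM payload rate in Mb/s (10^6 bits per second), without packet headers."""
    return channels * sample_rate_hz * bits_per_sample / 1_000_000


if __name__ == "__main__":
    # Bit depth is an assumption; try the common ones.
    for bits in (16, 24, 32):
        rate = pcm_bitrate_mbps(channels=8, sample_rate_hz=48_000, bits_per_sample=bits)
        print(f"8 ch, 48 kHz, {bits}-bit: {rate:.3f} Mb/s")
```

Under these assumptions only the 32-bit case lands near 12 Mb/s, so the reported increase would be consistent with 32-bit samples on the wire, but that is a guess, not something confirmed in this thread.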

Thank you again for everything.

MartinPulec commented 1 year ago

Hi,

I've found that on encoder, delay:N:frames should match "frame threads=N".

I see, yes, it does. But it isn't defined for the synchronized playback. As such, it may even break synchronized altogether because it could drop timestamps.

I use soxr because in past testing, speex created clicking as it goes thru variable sample rates.
  see here: [Added dynamic resampling #225 (comment)](https://github.com/CESNET/UltraGrid/pull/225#issuecomment-1292205857)

Yes, I actually recall it. It is set by default with drift-fix.

Regarding --audio-channel-map, sometimes I have to re-wire the channels to put 5.1 in the proper place. Or sometimes take stereo channels that are in 7/8 and put them into 1/2 for scenarios where there are only stereo speakers. In this instance, you're right, it does nothing.

I think it gives decoder slightly more work, but nothing significant I believe (may be a concern just for plenty of channels).

Regarding --audio-scale none, if I call channel-map, I believe I had to tell audio-scale none or else it was changing the levels,  or something, I don't remember exactly, but it was not right.

If there is only remapping, it shouldn't scale. It might scale only if you are mixing 2 or more channels together, to prevent overflows (we do not do this often, so I am not sure how well it works now).

Anyway, below I've done a very minimal capture and display. While there are fewer errors than before, there are still distracting video frame holds or audio drops at the Decklink Missing/Dismissed frames.

 $ uv -m 1316 -t decklink -c libavcodec:encoder=libx265 -s embedded --param incompatible -P 5004
 $ uv -d decklink:synchronized -r analog -P 5004

I'll see whether I am able to reproduce it. The thing is that it is quite timing-sensitive, which could be influenced by, e.g., the content of the video and/or the network.

using --audio-codec=AAC:bitrate=256K with synchronized definitely makes things even worse with a constant clicking noise. I thought at one point I saw a commit that fixed compressed audio with synchro. #326 (comment)

Interesting, I'll try to look into it as well. IIRC it may not be related, but I can be wrong. As noted there, I've noticed that only Opus was affected because it is not self-delimiting, which means that the new format, where there are multiple Opus frames with the same timestamp, doesn't work with an older UG receiver. When I was testing, AAC was not entirely content (producing warnings) but seemed to decode properly, at least judging from the output.

Using PCM (not explicitly defining a codec) increases the data load by about 12mb/s for 8 channel SDI, although the constant clicking is gone, but still get the frame drops as above.

Thanks for the info.

alatteri commented 1 year ago

Hi Martin,

Can anything be done about the dropped/missed frames? I don't understand the syntax to adjust the synchronized parameters, but maybe tuning those would help?

Also, please listen to this attached recording of the clicking when using synchro with AAC. Most noticeable in the beginning of the clip. clicking.m4a.zip

MartinPulec commented 1 year ago

Hi,

clicking

I was able to reproduce. #345

missing/dismissed frames

Unfortunately I was still not able to reproduce. Just to exclude the network, is it reproducible when running just on the receiver?

 uv -c lavc:enc=libx264 -s embedded -t testcard -r embedded -d decklink:sync --param incompatible

If it is, which parameter, when added (toward your setting), causes it to start malfunctioning?

If the problem is caused by the network jitter, you can try something like -d decklink:sync=3,10 (increased the maximal buffered frame count to 10).

alatteri commented 1 year ago

Hi Martin,

Thanks for taking a look, and I saw the new issue. Strange it is seemingly only 24fps.

Regarding the Decklink sync dismissed frames. In the log output above, it shows 100% packet received, so would that not eliminate the network as an issue?

Could you explain the sync options some more, please? I don't understand what the 2 values mean.

 :sync=3,10
 increasing p can help if sender clock is faster, b in the opposite case

Thanks, Alan

MartinPulec commented 1 year ago

Strange it is seemingly only 24fps.

It is not only 24 fps; it actually happens for every frame time that is not divisible by 20 ms. Unfortunately, I've tested with 25p only and it works nicely there, because the audio belonging to one video frame is evenly divisible by 20 ms. Where it isn't, it is more difficult.
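
To make the divisibility point concrete, here is a minimal sketch that checks which frame rates give a frame time that is a whole number of 20 ms audio frames. The list of frame rates is an assumption chosen for illustration; only 24 and 25 fps are mentioned in this thread.

```python
from fractions import Fraction

AUDIO_FRAME_MS = Fraction(20)  # the 20 ms audio frame length quoted above


def audio_frames_per_video_frame(fps: Fraction) -> Fraction:
    """How many 20 ms audio frames span exactly one video frame."""
    video_frame_ms = Fraction(1000) / fps
    return video_frame_ms / AUDIO_FRAME_MS


def fits_evenly(fps: Fraction) -> bool:
    """True when one video frame holds a whole number of 20 ms audio frames."""
    return audio_frames_per_video_frame(fps).denominator == 1


if __name__ == "__main__":
    # Illustrative frame rates (assumed, not taken from the thread except 24 and 25).
    rates = {
        "24p": Fraction(24),
        "25p": Fraction(25),
        "29.97p": Fraction(30000, 1001),
        "30p": Fraction(30),
        "50p": Fraction(50),
    }
    for name, fps in rates.items():
        ratio = audio_frames_per_video_frame(fps)
        print(f"{name}: {float(1000 / fps):.3f} ms per frame, "
              f"{float(ratio):.4f} audio frames, even: {fits_evenly(fps)}")
```

Exact fractions are used instead of floats so that rates such as 30000/1001 are not misclassified by rounding.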

Params - the first value is the initial filling; you can try less than 3, but generally it doesn't work well. More gives you better stability at the expense of higher latency. The second value is the maximal buffer size; there is nothing more to it. The delay then fluctuates somewhere between those 2 values - if it would drop below, the last frame is re-scheduled; in the opposite case, the frame is dropped.
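
The behaviour described above can be illustrated with a toy model. This is only a sketch of one reading of that description, not UltraGrid's DeckLink code: the function name, the per-tick arrival list, and the exact boundary handling are all assumptions.

```python
def simulate_sync_buffer(arrivals, initial_fill=3, max_buffer=10):
    """Toy model of a display buffer bounded by two values (hypothetical, not UG code).

    arrivals[i] is the number of video frames received during output tick i.
    Returns (repeated, dropped, buffered_at_end).
    """
    buffered = 0
    playing = False
    repeated = 0
    dropped = 0
    for received in arrivals:
        buffered += received
        if buffered > max_buffer:
            # Above the second value: surplus frames are dropped.
            dropped += buffered - max_buffer
            buffered = max_buffer
        if not playing:
            # Playback starts once the initial filling is reached.
            playing = buffered >= initial_fill
            continue
        if buffered - 1 < initial_fill:
            # Consuming would fall below the first value: repeat the last frame.
            repeated += 1
        else:
            buffered -= 1
    return repeated, dropped, buffered


if __name__ == "__main__":
    steady = [1] * 20                     # sender and receiver in step
    stall = [1] * 5 + [0] * 2 + [1] * 5   # encoder briefly below real time
    burst = [1] * 5 + [12] + [1] * 3      # frames arriving in a clump
    for name, pattern in (("steady", steady), ("stall", stall), ("burst", burst)):
        print(name, simulate_sync_buffer(pattern))
```

In this model a short encoder stall shows up as repeated frames and a clump of arrivals as dropped frames, while raising the second value only helps with clumps, at the cost of a larger possible delay.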

alatteri commented 1 year ago

OK, this is working well except for the following condition.

For reasons unknown, sometimes the encoder goes just under realtime. When this happens we get the Decklink Missing Frame warning.

The end result is that for a frame or two, the video stutters, which is expected, but the audio cuts out. My recollection of the behavior prior to "sync" is that the audio would keep going. Having the audio drop is actually more disturbing than a frame or two of video stutter. There is probably no way around that with "sync", but if there was a "keep audio playing" mode, that would be good.

Encoder: [2023-10-22 16:15:03] [DeckLink capture] 119 frames in 5.04157 seconds = 23.6038 FPS

Receiver:
[2023-10-22 16:15:05] [Decklink display] Missing frame
[2023-10-22 16:15:05] [Decklink display] Missing frame

I'm also testing Ubuntu 23.10 on NUC12, since oneVPL is supported using native repos, and the newer 6.5 kernel may have a performance regression. With Ubuntu 22.04, I don't recall having encoder FPS drops like I am seeing now.

alatteri commented 1 year ago

I believe this to be solved. Closing.

TheSashmo commented 12 months ago

Hey everyone, I'm finally back in the office and can do some testing on this with a professional lip sync analyzer.

I tried all of the above and was not able to get a stable test setup with either continuous or release. The option --param incompatible does not exist, or at least I am putting it in the wrong spot, which I don't think I am.

Just as an FYI, using UG on a local network or the same machine gives somewhere between 80-110 ms of drift in the audio. I was told the broadcast industry standard is within one frame, sometimes accepting two frames of mismatch.
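
To put the 80-110 ms range next to the "one frame, sometimes two" tolerance, here is a minimal conversion sketch. The frame rates are assumptions, since the rate of this test setup is not stated.

```python
def drift_in_frames(drift_ms: float, fps: float) -> float:
    """Express an audio/video offset in milliseconds as a number of video frames."""
    return drift_ms * fps / 1000.0


if __name__ == "__main__":
    # Frame rates are illustrative assumptions.
    for fps in (24, 25, 30):
        low = drift_in_frames(80, fps)
        high = drift_in_frames(110, fps)
        print(f"{fps} fps: 80-110 ms is {low:.2f}-{high:.2f} frames")
```

At any of these rates the measured range starts at roughly two frames, which is at or beyond the looser two-frame tolerance mentioned above.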

Here is what I am testing:

./UltraGrid-continuous-x86_64.AppImage -t decklink:4 -c libavcodec:encoder=libx264:bitrate=8000k -s embedded --audio-codec=MP3:sample_rate=48000:bitrate=128k --audio-capture-format channels=2 192.168.99.123 -P 10000

and

./UltraGrid-continuous-x86_64.AppImage -d decklink:sync:device=0 -r embedded 192.168.99.123 -P 10000

When running with the sync option I get flooded with overflow messages and occasional missing frame messages.

Surprisingly, when running without the sync option it works "as UG should", but as of the last release and continuous build I am now seeing random decoding errors:

[lavc h264 @ 0x7f339413c0c0] corrupted macroblock 69 24 (total_coeff=-1)
[lavc h264 @ 0x7f339413c0c0] error while decoding MB 69 24
[lavc h264 @ 0x7f339413c0c0] negative number of zero coeffs at 86 35
[lavc h264 @ 0x7f339413c0c0] error while decoding MB 86 34

Suggestions?

MartinPulec commented 11 months ago

I tried all of the above and was not able to get a stable test setup with either continuous or release

I've opened a new issue #362, please refer there.
