CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

Control Port command question #219

Closed TheSashmo closed 1 year ago

TheSashmo commented 2 years ago

I can get the control port to work and get stats; that works fine. I'm not sure if I missed it (typical of many of my posts here), but what is the control port command to adjust the audio delay? I know I can do it locally in the terminal, but how can I adjust it remotely?

MartinPulec commented 2 years ago

It should be av-delay <amount_in_ms>. It is not in the wiki; I'll add it.
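
A minimal sketch of sending this command remotely from Python. It assumes the control port is a plain TCP socket that accepts newline-terminated text commands; the helper names `format_av_delay` and `send_av_delay` are hypothetical, and only the `av-delay <amount_in_ms>` command itself comes from this thread.

```python
import socket


def format_av_delay(delay_ms: int) -> bytes:
    """Build the control-port line for an audio/video delay in milliseconds."""
    # "av-delay <amount_in_ms>" is the command named in this thread; the
    # trailing newline is an assumption about how commands are delimited.
    return f"av-delay {int(delay_ms)}\n".encode("ascii")


def send_av_delay(host: str, port: int, delay_ms: int, timeout: float = 2.0) -> None:
    """Send one av-delay command to a control port (assumed to be plain TCP)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(format_av_delay(delay_ms))
```

For a receiver started with `--control-port=2138` on the same machine, the call would be `send_av_delay("127.0.0.1", 2138, -140)`. The sketch does not read any reply from the socket.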

TheSashmo commented 2 years ago

Thanks! I guess I could have looked at the code base too....

I might have brought this up before, but is there a way to align this more closely? I measure a variance of 100-150 ms on each run no matter what I do, and I have to compensate manually each time. Doing so adds more delay on the SDI output. Even though I am using -140 ms to delay the audio, which means the video is ahead, I see that my overall latency goes up.

MartinPulec commented 2 years ago

Which devices do you use? From my perspective it is interesting that audio is behind video - this shouldn't usually happen.

> Even though I am doing -140ms to delay the audio, that means the video is ahead, but I see that my overall latency goes up.

This is also interesting. That the video latency goes up is inevitable, since that is how this works. However, the audio latency should stay the same.

TheSashmo commented 2 years ago

Sorry, maybe I didn't make myself clear in the previous post. It's not behind; it's about 140 ms early. Since audio packets are smaller than video packets, it is logical that they would arrive first. But I have to delay the audio by about 100-150 ms each session to match the video. Regardless of whether I use 1, 2, 8, or 16 channels, the result is the same.

I am using a VALID (a.k.a. Matchbox) lip sync generator and analyzer. It generates up to 16 channels of audio and tone on SDI. I use that as my source at the encoder. I use the same device with local playback: I feed the SDI output back to the analyzer, and that's how it gives me a report of the difference. Again, this is all local encode and playback.

I can also see swings of 20-30 ms at times, so the offset is not consistent all the time.

I would be more than happy to get you a test setup if you would like to see remotely what I have going on.

MartinPulec commented 2 years ago

Understood, that makes sense. It is not only due to packet size; audio devices usually also have lower delay. A difference of 100-150 ms is quite a lot, but depending on the video compression it may be plausible. As for the variance, I can imagine that it may be caused by the video compression when the output is not strictly locked... But that would perhaps be the case for OpenGL, not for BlackMagic.

What exactly are the input and output audio/video devices (you can paste the commands directly)?

TheSashmo commented 2 years ago

Encode:

/usr/bin/uv -t decklink:connection=SDI:device=0 -c libavcodec:codec=H.264:encoder=libx264:bitrate=20000k:preset=ultrafast -s embedded --audio-codec=MP3:sample_rate=48000:bitrate=128k --audio-capture-format channels=8 -l unlimited -m 1316 -P 50000 127.0.0.1

Decode:

/usr/bin/uv -d decklink:device=0 -r embedded --control-port=2138 127.0.0.1 -P 60000

After testing this weekend I was back to basics with the exact same problem of the skipping frames...... I had to force it back down to 8-Bit 4:2:0 to get a stable/clean output... There has to be something I am doing wrong here, I just can't figure out what the problem is.

TheSashmo commented 1 year ago

Bringing this back to the top of the issues chain. No matter what I do, I have a massive problem trying to start a local encode/decode session with less than 16 ms (one frame) of A/V sync error. I have tried every possible codec, and the only codecs that gave me perfect A/V sync out of the box were A-law (without drift fix) and PCM (which occasionally drifted and then recovered).

https://capture.dropbox.com/2GKdEZTpCVaEH8BQ

@MartinPulec can you suggest any other way to keep the a/v sync under 16ms +/- without having to configure each decode separately?

On average, using OPUS (the most stable in my tests), I have to compensate by about 30 ms to -120 ms in av-delay to sync everything, and I still get audio buffer overflows from time to time, with or without drift fix. The screenshot above shows those tests, but I can re-run the whole setup chain and get totally different results. In the use case I am testing, clients expect A/V accuracy within +/- 10 ms. This doesn't make sense to me, as one frame of 59.94 interlaced is 32 ms, yet the expectation is 10. I can convince them to accept 16 ms if we were to use progressive, but nonetheless every session that I start measures differently, without any rhyme or reason as to why it's not always the same.
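
For reference, the frame durations behind the 16 ms and 32 ms figures can be worked out from the nominal rates. The sketch below assumes that "59.94 interlaced" means 59.94 fields per second (29.97 frames per second, i.e. 30000/1001) and that progressive means 60000/1001 frames per second; the function name is hypothetical. The round figures quoted in this thread are approximations of these values.

```python
from fractions import Fraction


def frame_duration_ms(rate_num: int, rate_den: int = 1) -> float:
    """Duration of one frame in milliseconds for a rate of rate_num/rate_den fps."""
    return float(Fraction(1000 * rate_den, rate_num))


# Assumption: 59.94p is 60000/1001 frames per second.
progressive_ms = frame_duration_ms(60000, 1001)
# Assumption: 59.94i is 59.94 fields per second, i.e. 30000/1001 frames per second.
interlaced_frame_ms = frame_duration_ms(30000, 1001)

print(f"59.94p frame: {progressive_ms:.2f} ms")       # 16.68 ms
print(f"59.94i frame: {interlaced_frame_ms:.2f} ms")  # 33.37 ms
```

A +/- 10 ms window is therefore tighter than a single progressive frame and less than a third of an interlaced frame.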

Side note: av-delay via the control port does not print a message to STDOUT that the delay has changed, so when sending a command via the control port there is no way (that I know of) to know the current value, whereas the keyboard controls report the new value.

Since the variation can be between 30 and 120 ms, I would use the keyboard controls to adjust the audio 10 ms at a time, and when I got to 80 ms of compensation, the next 10 ms would make the measurement jump by 40 ms more. In short, the decoder compensation would work 10 ms at a time up to 80 or 90 ms, and then each further 10 ms adjustment would throw the decoder measurement off by 30 or 40 ms... To be more clear:

A/V measurement: 80ms audio ahead of video
send keyboard control -10ms
A/V measurement: 70ms audio ahead of video
send keyboard control -10ms -10ms -10ms -10ms -10ms -10ms (total 60ms)
A/V measurement: +30ms audio ahead of video
send keyboard control +30ms
A/V measurement: 20ms audio ahead of video
send keyboard control -10ms
A/V measurement: -20ms audio ahead of video
send keyboard control +10ms
A/V measurement: 30ms audio ahead of video
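
To make the inconsistency in the sequence above explicit, here is a small sketch that compares each measurement with what a simple linear model would predict. The model is an assumption: it supposes that each keyboard adjustment shifts the measured audio lead one-for-one in the same direction, which the first step (80 ms, then -10 ms, then 70 ms) suggests. The six -10 ms presses are summed as a single -60 ms step.

```python
# Measured audio lead in ms (positive = audio ahead of video), from the log above.
measured_ms = [80, 70, 30, 20, -20, 30]
# Keyboard adjustments applied between consecutive measurements, in ms.
adjustments_ms = [-10, -60, 30, -10, 10]


def residuals(measured, adjustments):
    """Return measured-minus-expected for each step under the linear model."""
    out = []
    for before, adj, after in zip(measured, adjustments, measured[1:]):
        expected = before + adj  # assumption: adjustment shifts the lead 1:1
        out.append(after - expected)
    return out


print(residuals(measured_ms, adjustments_ms))  # [0, 20, -40, -30, 40]
```

Only the first step matches the model; the later steps are off by 20 to 40 ms in both directions, which is more than the whole +/- 10 ms target window.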

No matter what I do, I cannot get into my +/- 10 ms window, and on top of that I have to do this for every decoder that receives the signal. Is there a better way to handle this?

Reminder: I can give you access to my test setup if needed.

alatteri commented 1 year ago

I have found that even with drift fix, when the signal goes from having audio content to silence, the buffer overflow happens, but since it is silence anyway, it doesn't really matter. Were you looping content that has some black at the beginning or end?

> and I am still getting time to time audio buffer overflow with or without drift fix

But yes, overall I find that UG audio sync does seem to be kinda random, even between runs on the same hardware. While there are many factors that affect this, I wonder if the fact that video and audio are seemingly not "tied together", as they are in MPEG-TS, adds some additional vectors for this to happen.

MartinPulec commented 1 year ago

> @MartinPulec can you suggest any other way to keep the a/v sync under 16ms +/- without having to configure each decode separately?

Generally speaking no, the audio is not tied to video in any way. The synchronization is kept implicitly by the latency.

> Side note: av-delay via control port, does not send a message to STDOUT that the delay has changed,

Correct; it is logged since 046c6ce7.

> so when sending a message via control port there is no way (what I know) to know the current value of what it was set to, while when using the keyboard controls it reports back what the new value was set to.
>
> Since the variations can be between 30 to 120, I would use keyboard controls to adjust the audio 10ms at a time, and when I got to 80ms of compensation, the next 10ms would jump the measurements to 40ms more. So in short, the decoder compensation would work 10ms at a time till 80 or 90 and then each 10ms adjustment would throw the decoder measurement to 30 or 40 off... To be more clear:
>
> A/V measurement: 80ms audio ahead of video
> send keyboard control -10ms
> A/V measurement: 70ms audio ahead of video
> send keyboard control -10ms -10ms -10ms -10ms -10ms -10ms (total 60ms)
> A/V measurement: +30ms audio ahead of video
> send keyboard control +30ms
> A/V measurement: 20ms audio ahead of video
> send keyboard control -10ms
> A/V measurement: -20ms audio ahead of video
> send keyboard control +10ms
> A/V measurement: 30ms audio ahead of video
>
> No matter what I do, I can not get into my 10ms +/- window, and on top of that I have to do that for every decoder that I have receiving the signal. Is there not a better way to handle this?

Currently no. It was once a design decision that audio and video in UG would be transmitted independently, without explicit synchronization (RTP does allow synchronization via RTCP, anyway). You can open a feature request and we will discuss it internally, but I cannot promise anything, since I suppose that it would require a lot of work.

The synchronization also directly affects the latency, because usually you need to buffer the data (more), or you would need to have separate synchronized and non-synchronized paths.
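
As background on the RTCP remark: in standard RTP (RFC 3550), each stream's RTCP sender report pairs a wallclock (NTP) time with the RTP timestamp of the same instant, which lets a receiver place audio and video on a common timeline. The sketch below illustrates that general mechanism only. It is not UltraGrid code, and the class name, function names, and clock rates are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SenderReport:
    ntp_seconds: float  # wallclock time carried in the sender report
    rtp_timestamp: int  # RTP timestamp corresponding to that wallclock time
    clock_rate: int     # RTP clock rate in Hz (illustrative values below)


def rtp_to_wallclock(sr: SenderReport, rtp_timestamp: int) -> float:
    """Map a packet's 32-bit RTP timestamp to sender wallclock seconds."""
    # Signed difference modulo 2**32 so that timestamp wraparound is handled.
    diff = (rtp_timestamp - sr.rtp_timestamp) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr.ntp_seconds + diff / sr.clock_rate


def av_skew_ms(audio_sr: SenderReport, audio_ts: int,
               video_sr: SenderReport, video_ts: int) -> float:
    """Capture-time difference in ms; positive means the audio was captured later."""
    return (rtp_to_wallclock(audio_sr, audio_ts)
            - rtp_to_wallclock(video_sr, video_ts)) * 1000.0
```

A receiver using this scheme would play out audio and video units with equal wallclock times together, which generally means buffering the earlier stream until the later one arrives. That is the latency cost mentioned above.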

MartinPulec commented 1 year ago

Assuming that the original question was answered with the second post, I'll close now. The rest is diverging from the original topic and will be handled rather in https://github.com/CESNET/UltraGrid/issues/326.