CESNET / UltraGrid

UltraGrid low-latency audio and video network transmission system
http://www.ultragrid.cz

Audio and Video display in single pipe #333

Closed mikekoetter closed 11 months ago

mikekoetter commented 1 year ago

How difficult would it be to add a display that outputs audio and video together, in something like the NUT format, over a Unix pipe or FIFO that could easily be brought into FFmpeg?

Originally posted by @mikekoetter in https://github.com/CESNET/UltraGrid/discussions/191#discussioncomment-6787400

MartinPulec commented 1 year ago

I think we can add a file output using FFmpeg libavformat, which would be able to output NUT, quite easily.

mikekoetter commented 1 year ago

does a file output already exist, or would this be a new -protocol option like sdp?

MartinPulec commented 1 year ago

> does a file output already exist, or would this be a new -protocol option like sdp?

It doesn't, but I don't think it will be difficult to implement, so we should be able to do it next week. It won't be a protocol but rather a display.

txtpauld commented 1 year ago

Hello! I would also love this feature, to be able to stream uncompressed video from a DeckLink card using the NUT format.

We would like to take a DeckLink RGB 10-bit or YCbCr 4:2:2 10-bit signal, send it via UltraGrid, and then pass it on to FFmpeg.

We would like to keep it uncompressed when it gets to FFmpeg so we can compress it using workflows we have already developed, which make multiple renditions of the media with different audio and video formats without having to re-encode everything in separate FFmpeg processes.

12-bit RGB will be coming down the pipe as well in the future.

Thanks for the help!

MartinPulec commented 1 year ago

I've just added an implementation of a file video display, which is basically a libavformat muxer, to Git (so it is present in the continuous builds).

Remarks:

  1. The output format (both container and codecs) is deduced from the given output file name:

    a. If it is unknown or NUT, the NUT container will be used with uncompressed audio and video. For the video, FFmpeg-mapped pixel formats are supported (the usual ones like RGB(A) and UYVY). Additionally, v210, R10k and R12L are supported (but conversion to compatible FFmpeg formats is necessary, which can potentially be slow).

    b. For known compressed containers (like MP4 or MKV), the container's default compressed video and audio codecs are used (just as if it were processed with the ffmpeg command: ffmpeg -i <input> out.mp4). 8-bit 4:2:0 YUV is also configured for the video compression.

  2. A/V is synchronized implicitly (as is usual in UltraGrid). If it starts to drift out of sync, a video frame is either dropped or duplicated. The frame insertion or deletion is currently done if the difference is 1 video frame time. This may be too restrictive in some use cases (but can be tweaked).

The output file name can of course be a FIFO, if the muxer supports non-seekable output.
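The drop/duplicate rule from remark 2 can be sketched roughly as follows. This is only an illustration, not UltraGrid's actual code: the function name, its parameters, and the choice of which direction of drift leads to a drop versus a duplicate are all assumptions.

```python
def sync_action(audio_pts: float, video_pts: float,
                frame_time: float, max_diff_frames: float = 1.0) -> str:
    """Decide what to do with the next video frame (hypothetical helper).

    audio_pts / video_pts: seconds of audio / video written so far.
    frame_time: duration of one video frame in seconds (1 / fps).
    max_diff_frames: tolerated A/V difference, in video frame times.
    """
    # Positive value: video is ahead of audio; negative: video lags behind.
    diff_frames = (video_pts - audio_pts) / frame_time
    if diff_frames > max_diff_frames:
        return "drop"   # assumed: video ahead, so skip a frame
    if diff_frames < -max_diff_frames:
        return "dup"    # assumed: video behind, so repeat a frame
    return "keep"


if __name__ == "__main__":
    frame_time = 1 / 24  # 24p is just an example frame rate
    print(sync_action(0.0, 0.02, frame_time))
    print(sync_action(0.0, 0.10, frame_time))
    print(sync_action(0.10, 0.0, frame_time))
```

With a tolerance of one frame time, any drift larger than about 42 ms at 24p already triggers a correction, which is why the default may be too strict when timing jitters.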

mikekoetter commented 11 months ago

I'm doing some testing on this - would you like feedback in this thread?

MartinPulec commented 11 months ago

That would be great; feel free to reopen as well if needed. I've just closed the thread to let you know that we are not doing anything more at this time.

mikekoetter commented 11 months ago

I'm not sure the max_av_diff option is working... also not sure the default of 1.0 is working either. I keep getting desync errors that are well below 1.0: [File disp.] A-V desync 0.051507 sec, video frame dropped. It's a little hard to reproduce, but I'll try to get some more specific, reproducible information.

mikekoetter commented 11 months ago

Oh wait, I think I may be misinterpreting the argument here. I (mistakenly) understood the max_av_diff value to be a float in seconds, as it is represented in the error. Is it actually in frames, so that the error above represents a delta of 1.236168 frames (hence greater than the 1.0 default)?
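The conversion behind those two numbers can be checked with a few lines. This is a minimal sketch: the 24 fps frame rate is not stated in the thread and is assumed here because it is the rate at which 0.051507 s corresponds to 1.236168 frames. The helper name is hypothetical.

```python
def desync_in_frames(desync_sec: float, fps: float) -> float:
    """Convert an A/V desync given in seconds into video frame times."""
    return desync_sec * fps


# 0.051507 s is the value from the warning; 24 fps is an assumption.
frames = desync_in_frames(0.051507, 24.0)
exceeds_default = frames > 1.0  # the default threshold was 1.0 frame

if __name__ == "__main__":
    print(f"{frames:.6f} frames, exceeds 1.0: {exceeds_default}")
```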

MartinPulec commented 11 months ago

> Is it actually in frames?

yes

I've just changed it to seconds to correspond with the warning.

I also increased it to 0.084 s (84 ms; about two frames at 24p) to better accommodate possible timing jitter (most likely due to video compression). I don't know your use case, but I believe jitter may have been the cause.
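To illustrate the effect of the new default, here is a small sketch of the comparison in seconds. The constant and function names are hypothetical and this is not the actual implementation; only the 0.084 s value and the 0.051507 s desync come from this thread.

```python
MAX_AV_DIFF_SEC = 0.084  # new default mentioned above (84 ms)


def exceeds_threshold(desync_sec: float,
                      max_av_diff_sec: float = MAX_AV_DIFF_SEC) -> bool:
    """Return True if the desync would trigger a frame drop or duplicate."""
    return abs(desync_sec) > max_av_diff_sec


# The threshold expressed in 24p frame times (24 fps used as an example).
frames_24p = MAX_AV_DIFF_SEC * 24

if __name__ == "__main__":
    print(f"threshold = {frames_24p:.3f} frames at 24p")
    print("0.051507 s triggers correction:", exceeds_threshold(0.051507))
```

Under this threshold, the 0.051507 s desync reported earlier would no longer cause a dropped frame.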

Just a small explanation: when a frame is written to the output, a timestamp is assigned to it, but the timestamp is (currently) made up by UG to avoid a continuously increasing desync if audio and video are not properly synchronized (it mimics UG's normal behavior: frames are presented as soon as possible, so A/V is synchronized implicitly).