Redesign of mixing pipeline

fightling commented 5 years ago

In this issue I will describe and discuss my current work on the branch features/onepipe.

As mentioned in #58, voctomix has some issues with A/V synchronization. The current approach to solving this is to get rid of the inter elements, which seem to fuck up the timestamping. My first solution was to replace them with interpipe elements, but that didn't work out.

My current approach is to replace the construction of multiple pipelines, which have to be connected through inter or interpipe elements, with a single pipeline. I think this is where GStreamer "feels" most comfortable, and it also simplifies the whole pipeline a lot.

I've made a prototype at feature/onepipe which currently only works with test sources.
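To make the single-pipeline idea concrete, here is a minimal sketch (not the code from the branch) that builds a gst-launch-style description with test sources only. The element names `videomix` and `audiomix` and the choice of `autovideosink`/`autoaudiosink` as sinks are assumptions made for illustration.

```python
def single_pipeline_description(num_sources: int = 2) -> str:
    """Build a gst-launch-style description of ONE pipeline with test sources.

    All sources feed the same compositor and audiomixer directly, so no
    inter/interpipe elements are needed to cross pipeline boundaries.
    """
    lines = [
        # One shared video mixer and one shared audio mixer (names are hypothetical).
        "compositor name=videomix ! queue ! autovideosink",
        "audiomixer name=audiomix ! queue ! autoaudiosink",
    ]
    for _ in range(num_sources):
        # Each test source links to a request pad of the shared mixer.
        lines.append("videotestsrc is-live=true ! queue ! videomix.")
        lines.append("audiotestsrc is-live=true ! queue ! audiomix.")
    return "\n".join(lines)


if __name__ == "__main__":
    print(single_pipeline_description(2))
```

A string like this could in principle be handed to `Gst.parse_launch()`; the real sources and outputs would of course need more than this.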

The other sources (Decklink, TCP/IP and Image) still have to be adapted. But while inspecting the old pipelines I discovered some constructs which I want to replace. So I've made a picture of the whole mixer as I see it, and I would like to discuss it with you before I change the mixer construction to do some things better than they could be done before.

New Pipeline

As you can see in the graph, I've placed the voctocore beside its users (voctogui, Internet and the encoders) in vertical lanes:

[image: schema]

Short legend:

Red elements describe audio content
Blue elements describe video content
Green elements describe A/V content
Bold elements or thick lines describe multiplicities (for example there are multiple audio and video sources)
Dashed lines describe internal command control access
Big circles are ports like A/V Sources or A/V Outputs
Small circles are tees

UI

The UI uses a command interface to interact with the core over TCP/IP and displays preview videos of all sources and the mix. The previews are never blanked by a pause or nostream source, so the user always sees the original streams.

Internet

The Internet gets the live streaming output, which is blanked by a pause (or nostream) source when the user selects it. To implement that in the pipeline, the output is connected to additional compositors which can blank the stream. The Mix Live output uses the Live Compositor to blank the video and the Live Audiomixer to blank the audio.

The Source Live outputs shall be live streams of one or more sources, which can be blanked through the Source Compositors. They all get the same audio output, which comes from the Live Audiomixer, so that you can (for example) stream the slides and hear the talk's audio on that stream.
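The blanking described above can be modelled as a small decision function. This is only a sketch under assumptions: it supposes that blanking is done by setting `alpha` on the compositor sink pads and `mute` on the audiomixer sink pads, and the pad keys `mix`, `pause` and `nostream` are hypothetical names.

```python
from typing import Dict, Optional, Sequence


def live_pad_settings(
    selected: Optional[str],
    blankers: Sequence[str] = ("pause", "nostream"),
) -> Dict[str, Dict[str, object]]:
    """Return per-input settings for the Live Compositor / Live Audiomixer.

    selected: name of the stream blanker source chosen in the UI,
              or None when the live mix should pass through unblanked.
    """
    if selected is not None and selected not in blankers:
        raise ValueError(f"unknown stream blanker source: {selected}")

    blanked = selected is not None
    # The mix is hidden and muted while any blanker source is selected.
    settings: Dict[str, Dict[str, object]] = {
        "mix": {"alpha": 0.0 if blanked else 1.0, "mute": blanked}
    }
    for name in blankers:
        active = name == selected
        # Only the selected blanker source is visible and audible.
        settings[name] = {"alpha": 1.0 if active else 0.0, "mute": not active}
    return settings


if __name__ == "__main__":
    print(live_pad_settings("pause"))
    print(live_pad_settings(None))
```

The previews and the recording outputs would simply never pass through this stage, which is what keeps them non-blanked.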

Encoder

The encoders get the Mix Output and the Source Output, which are both non-blanked. The Source Output is a single mux of all video and audio sources into one stream.

voctocore

SB Video / SB Audio

Multiple stream blanker sources like pause, nostream and similar. Usually this content comes over TCP/IP or is read from a local A/V file.

Video Source / Audio Source

Multiple incoming video and audio sources from decklink or grabber input.

Recording Compositor

Processes the current live composite (set by voctogui) and mixes the currently selected video sources together.

Recording Audiomixer

Mixes all audio sources together as set up by voctogui.

Live Compositor

Takes the live video mix and blanks it if set by voctogui.

Live Audiomixer

Replaces the live audio mix with the currently selected stream blanker audio.

Source Compositors

Take the source videos and blank them if set by voctogui.

Rescale

Re-scales the video mix or video sources for preview in voctogui.

Mux

Muxes one (or multiple when bold) audio and video streams into one stream.

Source Live

Live streaming A/V output of one or multiple sources (may be blanked).

Mix Live

Live streaming A/V output of the mix (may be blanked).

Mix Preview

Re-scaled A/V stream of the mix (non-blanked)

Source Preview

Re-scaled A/V stream of all sources (non-blanked)

Mix Output

A/V stream of the mix (non-blanked) for recording

Source Output

A/V stream of all sources (non-blanked) in one stream for recording.
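The output ports listed above can be summarized as a small table in code. This is just a restatement of the descriptions as a data structure; the `Output` class and its field names are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Output:
    name: str
    consumer: str         # which lane in the graph receives it
    content: str          # "mix" or "sources"
    may_be_blanked: bool  # passes through a blanking compositor/audiomixer
    rescaled: bool        # passes through Rescale for preview


OUTPUTS: List[Output] = [
    Output("Source Live", "Internet", "sources", True, False),
    Output("Mix Live", "Internet", "mix", True, False),
    Output("Mix Preview", "voctogui", "mix", False, True),
    Output("Source Preview", "voctogui", "sources", False, True),
    Output("Mix Output", "encoders", "mix", False, False),
    Output("Source Output", "encoders", "sources", False, False),
]


def blankable(outputs: List[Output]) -> List[str]:
    """Names of the outputs that can be blanked by a stream blanker source."""
    return [o.name for o in outputs if o.may_be_blanked]


if __name__ == "__main__":
    print(blankable(OUTPUTS))
```

Only the two outputs going to the Internet are blankable; everything for voctogui and the encoders stays non-blanked.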

fightling commented 5 years ago

@MaZderMind: Can you take a look at whether this pipeline matches the original requirements? And are there any issues with multi-audio from your point of view?

I would also like to discuss how this pipeline shall be configured. Which things are fixed and which do you want to configure? For example, which sources end up in the source recording?

MaZderMind commented 5 years ago

Maybe it is missing the Video-Only source for the backdrop, although technically this is also a VideoSource.
In the SourcePreview branch the JPEG-Encoder is missing after the Rescale.
Currently the "Source Compositor" is a specific thing for Slides and used specifically for the Slide-Only-Stream; it is not required for the other sources.

Multiaudio is currently implemented in an unnecessarily complicated way, because the inter* elements can only carry stereo audio (i.e. two tracks). Without them, in your new pipeline design, every port will just handle 2, 8 or 16 tracks, as many as are needed. All other elements used are capable of handling an arbitrary number of channels.

The actual mapping of which channel is used for what could be made at the input or output boundary. For the input, some kind of channel mapping is probably required to allow for scenarios like this:

non-translated-audio L/R input to cam1 (channels 0 & 1)
translated audio L/R input to cam2 (channels 0 & 1)
both translations on the output (channels 0 & 1 non-translated, channels 2 & 3 translated)

This could be done locally in the sources and the rest of the pipeline would just transport 4 channels.
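The scenario above amounts to a lookup table from (source, source channel) to output channel. A minimal sketch in plain Python, operating on one audio frame per source; the source names `cam1`/`cam2` follow the example, while `CHANNEL_MAP` and `map_channels` are hypothetical names and not an existing voctomix API:

```python
from typing import Dict, List, Tuple

# (source name, source channel) -> channel in the pipeline
CHANNEL_MAP: Dict[Tuple[str, int], int] = {
    ("cam1", 0): 0,  # non-translated L
    ("cam1", 1): 1,  # non-translated R
    ("cam2", 0): 2,  # translated L
    ("cam2", 1): 3,  # translated R
}


def map_channels(
    frames: Dict[str, List[float]],
    channel_map: Dict[Tuple[str, int], int],
    num_out: int,
) -> List[float]:
    """Place each source channel's sample at its configured pipeline channel.

    frames: one sample per channel for each source, e.g. {"cam1": [l, r]}.
    Pipeline channels that nothing maps to stay silent (0.0).
    """
    out = [0.0] * num_out
    for (source, src_ch), out_ch in channel_map.items():
        out[out_ch] = frames[source][src_ch]
    return out


if __name__ == "__main__":
    frame = {"cam1": [0.1, 0.2], "cam2": [0.3, 0.4]}
    print(map_channels(frame, CHANNEL_MAP, 4))
```

With a table like this applied in the sources, everything downstream only needs to know the total channel count (4 here).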

On the output side, the mapping from channels to tracks could be made in the recording/streaming sink scripts; voctomix would just present the configured number of channels.

fightling commented 4 years ago

solved in voctomix2
