Enh/runner audio normalization

Motivation and Context

Previous PR: #1230. As mentioned in #1206, the volume range varies across recordings, which leads to a poorer user experience in some cases and has caused problems for the silence-skip feature.

Description

As described in #1230: applies ffmpeg's loudnorm filter during transcoding. Parameters are chosen according to EBU Recommendation R 128.
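For illustration, a minimal sketch of how the filter argument could be assembled in Go. The type and function names here are hypothetical and not taken from the runner code. The values in the usage note (-23 LUFS integrated loudness, -1 dBTP maximum true peak) are the R 128 targets; the loudness range of 7 LU is ffmpeg's loudnorm default and is only an assumption, since this PR does not list the exact parameters.

```go
package main

import "fmt"

// LoudnormParams holds the targets passed to ffmpeg's loudnorm filter.
// Hypothetical helper type; the runner may structure this differently.
type LoudnormParams struct {
	IntegratedLUFS float64 // loudnorm option I: integrated loudness target
	LoudnessRange  float64 // loudnorm option LRA: loudness range target in LU
	TruePeakDBTP   float64 // loudnorm option TP: maximum true peak
}

// Filter renders the audio filter string, e.g. "loudnorm=I=-23:LRA=7:TP=-1".
func (p LoudnormParams) Filter() string {
	return fmt.Sprintf("loudnorm=I=%g:LRA=%g:TP=%g",
		p.IntegratedLUFS, p.LoudnessRange, p.TruePeakDBTP)
}

// TranscodeArgs builds an ffmpeg argument list that applies the filter to
// the audio stream. Codec and container options are omitted on purpose.
func TranscodeArgs(in, out string, p LoudnormParams) []string {
	return []string{"-i", in, "-af", p.Filter(), out}
}
```

Calling `TranscodeArgs("in.mp4", "out.mp4", LoudnormParams{-23, 7, -1})` yields arguments that could be handed to `exec.Command("ffmpeg", args...)`.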

Results of some experiments with this method, from here:

Normalization using the loudnorm filter in FFmpeg has achieved good results. The graphs below show the changes in volume over time for three video recordings, comparing the original version to the normalized version. The dashed red line marks -15 dB, which is now used as the threshold for silence detection and is indeed suitable for the normalized audio, according to the results of these experiments.

After audio normalization, all three recordings sound good to me. The scatter points in the graphs represent the maximum volume of a series of segments (each lasting between 3 and roughly 8.33 seconds), obtained with the ffmpeg volumedetect filter.
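A measurement of this kind could be reproduced roughly as follows. This is only a sketch, not the script used for the graphs: it assumes each segment was analysed with something like `ffmpeg -ss <start> -t <duration> -i <file> -af volumedetect -f null -` and that the `max_volume: ... dB` line was then read from ffmpeg's log output. The function names are hypothetical, and how the runner actually applies the -15 dB threshold is not shown in this PR.

```go
package main

import (
	"regexp"
	"strconv"
)

// maxVolumeRe matches the summary line printed by volumedetect,
// e.g. "[Parsed_volumedetect_0 @ 0x...] max_volume: -4.5 dB".
var maxVolumeRe = regexp.MustCompile(`max_volume:\s*(-?\d+(?:\.\d+)?) dB`)

// ParseMaxVolume extracts the max_volume value (in dB) from ffmpeg log
// output. The second return value is false if no such line was found.
func ParseMaxVolume(log string) (float64, bool) {
	m := maxVolumeRe.FindStringSubmatch(log)
	if m == nil {
		return 0, false
	}
	v, err := strconv.ParseFloat(m[1], 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

// BelowThreshold reports whether a segment's maximum volume lies under the
// given threshold, e.g. -15 for the dashed line in the graphs.
func BelowThreshold(maxVolumeDB, thresholdDB float64) bool {
	return maxVolumeDB < thresholdDB
}
```

Plotting the parsed values per segment against the threshold gives a quick view of which segments would count as silent under this assumption.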

Steps for Testing

Further testing can be done with the runner's tests.

TUM-Dev / gocast

Enh/runner audio normalization #1329
