TUM-Dev / gocast

TUM's lecture streaming service.
https://live.rbg.tum.de/
MIT License

Normalize the volume of recordings #1206

Open YiranDuan721 opened 9 months ago

YiranDuan721 commented 9 months ago

Is your feature request related to a problem? Please describe.
Volume levels vary widely across recordings, which degrades the user experience in some cases and has caused problems for the silence-skip feature.

Describe the solution you'd like
Normalize the volume of recordings. This could be done, e.g., during audio transcoding.

Additional context
Introducing an audio normalization model would first require a large-scale analysis of the audio in existing recordings.

YiranDuan721 commented 9 months ago

Alternatively, raise the volume only for recordings that are too quiet.
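A minimal sketch of what such a "boost only" rule could look like, assuming the peak level of a recording has already been measured in dB. The target level of -3 dB and the helper name `boost_filter` are hypothetical and do not come from this thread; the returned string uses the syntax of FFmpeg's `volume` audio filter.

```python
from typing import Optional

# Hypothetical target peak level in dB; the thread does not specify one.
TARGET_PEAK_DB = -3.0


def boost_filter(max_volume_db: float, target_db: float = TARGET_PEAK_DB) -> Optional[str]:
    """Return an FFmpeg `volume` filter string, or None if no boost is needed.

    Recordings whose peak is already at or above the target are left untouched,
    so this rule only ever raises the volume and never lowers it.
    """
    gain_db = target_db - max_volume_db
    if gain_db <= 0:
        return None
    return f"volume={gain_db:.1f}dB"
```

The resulting string could be passed to `ffmpeg -af`. Unlike loudness normalization, a constant gain leaves the dynamics within a recording unchanged.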

YiranDuan721 commented 9 months ago

Normalization using the loudnorm filter in FFmpeg has achieved good results. The graphs below show the change in volume over time for three video recordings, comparing the original with the normalized version. The dashed red line marks -15 dB, which is now used as the threshold for silence detection and, according to the results of these experiments, is indeed suitable for the normalized audio.
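For reference, a single-pass loudnorm invocation could be built roughly as follows. This is only a sketch: the thread does not state which loudnorm parameters were used, so the values for `I` (integrated loudness), `TP` (true peak) and `LRA` (loudness range) below are illustrative assumptions, and the function names are hypothetical.

```python
import subprocess
from typing import List


def loudnorm_command(src: str, dst: str,
                     integrated: float = -16.0,   # assumed target, in LUFS
                     true_peak: float = -1.5,     # assumed limit, in dBTP
                     lra: float = 11.0) -> List[str]:
    """Build an ffmpeg command that normalizes audio with the loudnorm filter."""
    audio_filter = f"loudnorm=I={integrated}:TP={true_peak}:LRA={lra}"
    # The video stream is copied unchanged; only the audio is filtered and re-encoded.
    return ["ffmpeg", "-y", "-i", src, "-c:v", "copy", "-af", audio_filter, dst]


def normalize(src: str, dst: str) -> None:
    """Run the normalization; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(loudnorm_command(src, dst), check=True)
```

Calling `normalize("lecture.mp4", "lecture_normalized.mp4")` would require ffmpeg on the PATH; both file names are placeholders.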

All three recordings sound good to me after audio normalization. Each scatter point in the graphs represents the maximum volume of one segment in a series of segments (each lasting from 3 to 8.333333 seconds), obtained with the FFmpeg volumedetect filter.
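The per-segment measurement described above might be reproduced along these lines. This is a sketch under assumptions, not the script actually used for the graphs: the function names are hypothetical, and comparing each segment's peak against the -15 dB threshold is only one plausible reading of how silence detection uses that value.

```python
import re
import subprocess
from typing import List, Optional

SILENCE_THRESHOLD_DB = -15.0  # threshold mentioned in the thread

# volumedetect reports its results on stderr, e.g. "max_volume: -4.2 dB".
_MAX_VOLUME = re.compile(r"max_volume:\s*(-?\d+(?:\.\d+)?) dB")


def volumedetect_command(src: str, start: float, duration: float) -> List[str]:
    """Build an ffmpeg command that analyzes one segment and writes no output file."""
    return ["ffmpeg", "-hide_banner", "-nostats",
            "-ss", str(start), "-t", str(duration), "-i", src,
            "-vn", "-af", "volumedetect", "-f", "null", "-"]


def parse_max_volume(log: str) -> Optional[float]:
    """Extract the max_volume value (in dB) from ffmpeg's log, if present."""
    match = _MAX_VOLUME.search(log)
    return float(match.group(1)) if match else None


def segment_max_volume(src: str, start: float, duration: float) -> Optional[float]:
    """Measure the peak volume of one segment; requires ffmpeg on the PATH."""
    result = subprocess.run(volumedetect_command(src, start, duration),
                            capture_output=True, text=True, check=True)
    return parse_max_volume(result.stderr)


def is_silent(max_volume_db: float, threshold_db: float = SILENCE_THRESHOLD_DB) -> bool:
    """Treat a segment as silent when its peak stays below the threshold."""
    return max_volume_db < threshold_db
```

Running `segment_max_volume` over consecutive segments of a recording yields one value per segment, which is the kind of data the scatter points represent.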