R-a-dio / valkyrie

R/a/dio software stack
https://r-a-d.io
MIT License

streamer: replay gain #65

Closed 9001 closed 6 months ago

9001 commented 4 years ago

should consider adding a gain while transcoding so all songs hit the same perceived volume

the analysis step would run once for each song, storing the measurements in the db

need to choose between two normalization approaches:

rms

rms normalization requires knowing the max and mean volume beforehand, obtainable with the following FFmpeg command (outputs max_volume and mean_volume):

ffmpeg -hide_banner -nostdin -i some.mp3 -af volumedetect -c:a pcm_s16le -f null - 2>&1 | grep -E '^\[Parsed_volumedetect_0 @ '

assuming our target mean_volume is -14 dB (volumedetect reports plain dB, not LUFS) and a given song produces max=-3 and mean=-16, gain would be 2dB so that mean=-14 and max=-1, which stays below zero (clipping otherwise)
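a minimal sketch of that arithmetic, in case it helps. the function name rms_gain is made up for this example, and capping the gain at the available headroom (rather than letting the peak cross 0 dB) is an assumption about what we would want, not something decided in this issue:

```shell
#!/bin/sh
# rms_gain TARGET_MEAN MEAN_VOLUME MAX_VOLUME
# All three arguments are in dB, as printed by volumedetect.
# Prints the gain in dB, suitable for -af volume=<gain>dB.
rms_gain() {
    awk -v target="$1" -v mean="$2" -v max="$3" 'BEGIN {
        gain = target - mean       # gain that moves the mean onto the target
        headroom = 0 - max         # largest gain before the peak crosses 0 dB
        if (gain > headroom)       # assumption: prefer a quieter song over clipping
            gain = headroom
        printf "%.1f\n", gain
    }'
}

rms_gain -14 -16 -3   # the example above, prints 2.0
```

a song with mean=-20 and max=-3 would only get 3dB instead of the 6dB it "needs", so it ends up quieter than the target but unclipped.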

ffmpeg -hide_banner -nostdin -v warning -i some.mp3 -map 0:a:0 -af volume=2dB -c:a libmp3lame -b:a 192k -compression_level 0 -ar 44100 rms.mp3

that's it for rms normalization (which is probably what we want) but including ebur128 too just in case

ebur128

ebur128 normalization requires knowing I, TP, LRA, thresh, offset obtained like this:

cfg="I=-14:TP=0:LRA=11"

ffmpeg -hide_banner -nostdin -i some.mp3 -map 0:a:0 -af loudnorm=print_format=json:$cfg -c:a pcm_s16le -f null - 2>&1

# "input_i" : "-10.87",
# "input_tp" : "0.17",
# "input_lra" : "7.80",
# "input_thresh" : "-21.02",
# "target_offset" : "-0.83"

then append the measured values to $cfg and normalize:

cfg="$cfg:measured_i=-10.87:measured_tp=0.17:measured_lra=7.80:measured_thresh=-21.02:offset=-0.83"

ffmpeg -hide_banner -nostdin -i some.mp3 -map 0:a:0 -af loudnorm=print_format=summary:linear=true:$cfg -c:a libmp3lame -b:a 192k -compression_level 0 -ar 44100 ebur128.mp3
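the copy-paste step between the two passes could be scripted roughly like this. it is only a sketch: loudnorm_measured is a made-up helper name, and it assumes the first pass prints one "key" : "value" pair per line as in the output shown above:

```shell
#!/bin/sh
# loudnorm_measured: read the first-pass loudnorm output (print_format=json)
# on stdin and print the measured_* options needed for the second pass.
loudnorm_measured() {
    # Split each line on double quotes: field 2 is the key, field 4 the value.
    awk -F'"' '
        $2 == "input_i"       { i   = $4 }
        $2 == "input_tp"      { tp  = $4 }
        $2 == "input_lra"     { lra = $4 }
        $2 == "input_thresh"  { th  = $4 }
        $2 == "target_offset" { off = $4 }
        END {
            printf "measured_i=%s:measured_tp=%s:measured_lra=%s:measured_thresh=%s:offset=%s\n", i, tp, lra, th, off
        }'
}

# Real use would pipe the first ffmpeg pass (with 2>&1) into the function and
# append the result to $cfg. Here the sample values from above are fed in directly.
loudnorm_measured <<'EOF'
{
	"input_i" : "-10.87",
	"input_tp" : "0.17",
	"input_lra" : "7.80",
	"input_thresh" : "-21.02",
	"target_offset" : "-0.83"
}
EOF
```

for the sample values this prints the same measured_i=...:offset=-0.83 string that gets appended to $cfg by hand above.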

i think I = integrated (perceived) loudness in LUFS, TP = max permitted true peak in dBTP, LRA = loudness range in LU (spread between the quiet and loud parts of the output), not sure about thresh and offset

looks like I=-22 and TP=-3 are common in radio studios, with LRA=18 or anything lower (closer to 0)

9001 commented 4 years ago

we probably want to use astats instead of volumedetect to calculate what gain to apply (more accurate values, and it works better with float input), so:

-af astats=measure_perchannel=none:measure_overall=none+Peak_level+RMS_level

value mapping:

volumedetect    astats
max_volume      Peak level dB
mean_volume     RMS level dB

make sure to include the astats arguments, otherwise it will print those measurements for each channel (Channel: 1, Channel: 2, ...) in addition to the ones we actually want (Overall)

also make sure that astats is the first filter in the chain when analyzing, and likewise that volume is the first filter when applying the gain. newer FFmpeg versions will try to repair clipping in the input by interpolating the samples they think got clipped, and many other filters will cast to int16 and discard that info

store the measurement output in the db as-is; however, when calculating the gain to apply we probably want to use min(0, Peak_level_dB), since analysis output above zero comes from the interpolated samples and is mostly a don't-care

full analysis example with output:

ffmpeg -hide_banner -nostdin -i 'Angela - Shangri-La.mp3' -af astats=measure_perchannel=none:measure_overall=none+Peak_level+RMS_level -c:a pcm_s16le -f null - 2>&1 | grep -E '^\[Parsed_astats_0 @ .* dB:'
[Parsed_astats_0 @ 0x5636d42b1cc0] Peak level dB: 2.497428
[Parsed_astats_0 @ 0x5636d42b1cc0] RMS level dB: -10.613413
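to get those two numbers into something we can store, the grep output could be reduced like this. astats_levels is a hypothetical helper name, and the sketch assumes the value is always the last field on the line, as in the output above:

```shell
#!/bin/sh
# astats_levels: read ffmpeg's astats log lines on stdin and print
# "<peak dB> <rms dB>" on a single line.
astats_levels() {
    awk '
        /Parsed_astats_0/ && /Peak level dB:/ { peak = $NF }   # last field is the value
        /Parsed_astats_0/ && /RMS level dB:/  { rms  = $NF }
        END { print peak, rms }'
}

# Fed with the two lines from the analysis example above:
astats_levels <<'EOF'
[Parsed_astats_0 @ 0x5636d42b1cc0] Peak level dB: 2.497428
[Parsed_astats_0 @ 0x5636d42b1cc0] RMS level dB: -10.613413
EOF
```

in real use the ffmpeg command above (with 2>&1) would be piped straight into the function instead of the here-document.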

volumedetect on the same track, which shows the original values without the clip repair:

[Parsed_volumedetect_0 @ 0x557c4398fc00] mean_volume: -10.6 dB
[Parsed_volumedetect_0 @ 0x557c4398fc00] max_volume: 0.0 dB

so with a target amplitude of -11dB (too loud, but just for this example) it would be safe to set -af volume=-0.387dB (-11 minus the measured RMS of -10.613), since the above-zero peak is from reconstructed samples. that makes the new "real" peak -0.387dB, but the actual output will peak at 0dB (since it'll keep the reconstructed samples and then reclip)
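the same calculation as a sketch, with the min(0, Peak_level_dB) clamp applied before working out the headroom. astats_gain is a made-up name, and capping the gain at the headroom is an assumption carried over from the rms section, not something settled here:

```shell
#!/bin/sh
# astats_gain TARGET_RMS PEAK_LEVEL RMS_LEVEL
# All three arguments are in dB, as printed by astats.
# Prints the gain in dB, suitable for -af volume=<gain>dB.
astats_gain() {
    awk -v target="$1" -v peak="$2" -v rms="$3" 'BEGIN {
        if (peak > 0) peak = 0     # min(0, Peak_level_dB): ignore reconstructed peaks
        gain = target - rms        # gain that moves the RMS level onto the target
        headroom = 0 - peak        # room left before the clamped peak reaches 0 dB
        if (gain > headroom)       # assumption: never push the real peak above 0 dB
            gain = headroom
        printf "%.3f\n", gain
    }'
}

astats_gain -11 2.497428 -10.613413   # the track above, prints -0.387
```

without the clamp the 2.497dB peak would make the headroom negative and drag the gain down to about -2.5dB, which is the over-correction the min() is meant to avoid.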

Wessie commented 6 months ago

the ebur128 variant has been running for a while now and I haven't noticed any particular artifacts while using it, dunno if @icxes has noticed anything while listening to his test setup.

But I think it would be fine to keep it enabled otherwise, and we can see if it causes issues once we go live