dynamic mode: Change in integrated loudness shouldn’t result in a true peak which exceeds the target TP

mifi commented 11 months ago

Thanks for this awesome tool! I was having trouble finding info about the loudnorm filter in ffmpeg, but this repo is a wealth of knowledge.

I've been hit by the somewhat awkward implementation in ffmpeg where, if the target LRA is lower than the measured LRA, the filter switches to "dynamic" mode, causing the audio file to become completely quiet. Luckily you already have a solution for that: --keep-lra-above-loudness-range-target.

Now reading the https://ffmpeg.org/ffmpeg-filters.html#loudnorm I see that:

... the change in integrated loudness shouldn’t result in a true peak which exceeds the target TP. If any of these conditions aren’t met, normalization mode will revert to dynamic.

So I'm wondering: have any of you thought about the possibility of dynamic mode getting accidentally triggered by this condition, and how to prevent that?

slhck commented 11 months ago

Good point, I guess what you mean is that the target integrated loudness should be capped to max(loudness target, measured_I + measured_LRA)?

I can add that as a safeguard, checking with @richardpl if that would suffice.

To be honest, I don't know the filter code well enough to give a definitive answer here.

mifi commented 11 months ago

Not sure exactly what ffmpeg means because I'm not super into audio terminology, but yes, my worry is that we will accidentally trigger "dynamic" mode if some condition is met. Maybe we could look at the ffmpeg source code too, and just mirror what it does.

slhck commented 11 months ago

We actually use ffmpeg under the hood!

The loudnorm filter will do whatever it does according to the description that you linked to. The ffmpeg-normalize wrapper simply adds a bunch of convenience functions and options like the one to keep the LRA above the threshold.

I'll see if I can look at the source code to verify that there might be a problem with too large LRA values.

richardpl commented 11 months ago

There is already an option to set a custom non-default LRA, thus ensuring linear processing in the 2nd pass. But if you use an unrealistic target TP (too small a value), it may not do linear processing at all.

mifi commented 11 months ago

I found this code:

            if ((offset_tp <= s->target_tp) && (s->measured_lra <= s->target_lra)) {
                s->frame_type = LINEAR_MODE;
                s->offset = offset;
            }

https://github.com/FFmpeg/FFmpeg/blob/7665139656280a2f77ee8d047dd998c1b78af7eb/libavfilter/af_loudnorm.c#L813

But I'm not sure what offset_tp and s->target_tp mean.

I tried to run a normalization using I=-5 (max allowed loudness) and tp=-2 (default value), making sure to set LRA to the measured LRA to prevent dynamic mode due to LRA too low. The output still sounds like the volume is being dynamically adjusted (volume seems to be going up and down). Not sure, but maybe I triggered dynamic mode. It's a pity that ffmpeg doesn't print any warning when dynamic mode gets enabled.
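
For reference, the check quoted above can be mirrored in a few lines of Python. This is only a sketch: the definitions `offset = target_i - measured_i` and `offset_tp = measured_tp + offset` are my reading of the lines surrounding the linked snippet in af_loudnorm.c and are not shown in this thread, the function name is made up, and the measured values in the example are hypothetical.

```python
def stays_linear(measured_i, measured_tp, measured_lra,
                 target_i, target_tp, target_lra):
    """Sketch of the linear-mode check from af_loudnorm.c (names are illustrative)."""
    # Assumed from the surrounding source: the fixed gain needed to hit the target.
    offset = target_i - measured_i
    # Assumed: the true peak the file would have after applying that gain.
    offset_tp = measured_tp + offset
    return offset_tp <= target_tp and measured_lra <= target_lra


# Hypothetical measurements: a file at -14 LUFS with a -1 dBTP peak and 6 LU range.
# I=-5, TP=-2, LRA set to the measured LRA, as in the test described above.
print(stays_linear(-14.0, -1.0, 6.0, target_i=-5.0, target_tp=-2.0, target_lra=6.0))
# prints False: +9 dB of gain would push the peak to +8 dBTP

# Same file with a lower target that only requires attenuation.
print(stays_linear(-14.0, -1.0, 6.0, target_i=-23.0, target_tp=-2.0, target_lra=6.0))
# prints True
```

If this reading is right, a target of I=-5 can only stay linear for files that are already very loud, or that have an unusually low true peak.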

richardpl commented 11 months ago

Linear mode is a simple fixed-gain volume knob. If you set the loudness target to an extreme together with the TP, it cannot do linear processing, because that is mathematically impossible in such a case. Dynamic mode is printed at the end of processing, but sure, it should give a warning at the start. (Though you can guess which mode is currently used from the filter's processing speed.)
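
Since the mode is reported at the end of processing, a wrapper could read it back from the log after the run. The sketch below assumes the filter was run with `print_format=json` and that the stats object contains a `normalization_type` key; the function name and the sample log are made up for illustration.

```python
import json
import re


def normalization_type(ffmpeg_stderr):
    """Return the "normalization_type" reported by loudnorm, or None if absent."""
    # The stats are assumed to be one flat JSON object; scan from the last one back.
    blocks = re.findall(r"\{[^{}]*\}", ffmpeg_stderr)
    for block in reversed(blocks):
        try:
            stats = json.loads(block)
        except json.JSONDecodeError:
            continue
        if "normalization_type" in stats:
            return stats["normalization_type"]
    return None


# Shortened, made-up example of what the end of the log can look like.
sample_log = """
[Parsed_loudnorm_0 @ 0x0000]
{
    "input_i" : "-14.00",
    "input_tp" : "-1.00",
    "output_i" : "-5.20",
    "output_tp" : "-2.00",
    "normalization_type" : "dynamic",
    "target_offset" : "0.20"
}
"""

if normalization_type(sample_log) == "dynamic":
    print("warning: loudnorm fell back to dynamic mode")
```

This only tells you after the fact, so it does not replace a warning at the start of processing.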

slhck commented 11 months ago

Thanks for your comments, @richardpl!

I would agree that a warning printed at the beginning would be most useful. I don't think users will be able to tell by the processing speed.

mifi commented 11 months ago

If you ask me, I think when linear=true is explicitly specified, but the loudnorm filter cannot achieve linear processing, crashing would be even better than printing a warning and reverting to dynamic mode, but that's a breaking change.

palmarci commented 5 months ago

> If you ask me, I think when linear=true is explicitly specified, but the loudnorm filter cannot achieve linear processing, crashing would be even better than printing a warning and reverting to dynamic mode, but that's a breaking change.

yes, same problem here, lot of my music library got nuked because of this :\

slhck commented 5 months ago

Sorry to hear that you have had some issues with your collection. Please note the entry in the FAQ on this: https://github.com/slhck/ffmpeg-normalize#should-i-use-this-to-normalize-my-music-collection — for music you want a ReplayGain-like algorithm.

I realize that a "set and forget" approach would be desirable, but it conflicts with the inner workings of the filter and the current behavior.

What I could imagine is adding a --linear option that forces linear processing and exits with an error if it can't be done. That requires determining beforehand whether linear or dynamic processing will be used, which is not fully feasible and is error-prone.

That said, with https://github.com/slhck/ffmpeg-normalize/commit/fe96734aa0f9410d8d21fccd57484bf07a6e4ff2 you already get much clearer warnings about reversion to dynamic processing happening.

palmarci commented 5 months ago

No worries about my library, I think I should have a backup of it somewhere on a pendrive.

I have talked with a friend who does producing and I think I know the reason for this now. If you are asking to normalize a quieter song to a higher level, the filter obviously should add gain to it. However, there may be no headroom in the track, so linearly adding a few dB of gain is not possible, since the audio would clip. Therefore the only possibility is to add that gain plus a limiter at the end, which is what the dynamic processing does. So for example I have a track that is at -12 LUFS and I want to normalize it to -10 LUFS; that's a +2 dB gain. I also set the max true peak to -0.1 dB to prevent clipping, but the audio does not have that 2 dB of headroom, so linear gain is not possible. The trick is to never add loudness, only to remove it, so setting the target loudness to a low value (for example -23, the default) should do it.

This explanation may be slightly wrong, because I'm still trying to understand it all.
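
The headroom argument in this comment can be written out as a small calculation. This is a sketch rather than what the filter does internally: the function name is made up, the measured true peak of -0.5 dBTP is an assumed value (the comment does not give one), and the separate LRA condition is ignored.

```python
def max_linear_target(measured_i, measured_tp, target_tp):
    """Highest integrated-loudness target reachable with a fixed gain only."""
    headroom = target_tp - measured_tp  # dB of gain available before the peak limit
    return measured_i + headroom


# Example from the comment: a -12 LUFS track, -10 LUFS target, -0.1 dB peak ceiling.
# The measured true peak of -0.5 dBTP is an assumed value for illustration.
measured_i, measured_tp, target_tp = -12.0, -0.5, -0.1
limit = max_linear_target(measured_i, measured_tp, target_tp)
print(f"highest linear target: {limit:.1f} LUFS")
print("linear gain possible for -10 LUFS:", -10.0 <= limit)
```

With these numbers there is only 0.4 dB of headroom, so any target above -11.6 LUFS would need limiting, while a lower target such as -23 only attenuates and always fits under the peak ceiling.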

slhck commented 5 months ago

Yes, that is the explanation for why dynamic processing is needed, and why it may deteriorate quality (through limiting).

slhck/ffmpeg-normalize, issue #251