Closed jurihock closed 2 years ago
An optional RMS normalization against the original signal could be generally useful to equalize the total loudness.
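A minimal sketch of such a time domain RMS normalization, assuming NumPy arrays of float samples. The helper names `rms` and `normalize_rms` are hypothetical and not part of the project:

```python
import numpy as np

def rms(x):
    # Root mean square of a block of samples.
    return float(np.sqrt(np.mean(np.square(x))))

def normalize_rms(y, x, eps=1e-12):
    # Scale the processed signal y so that its RMS matches
    # the RMS of the original signal x.
    current = rms(y)
    if current < eps:
        # Nothing to scale if y is (almost) silent.
        return y.copy()
    return y * (rms(x) / current)
```

Since this variant computes a single gain for the whole file, it equalizes the total loudness but does not follow loudness changes over time.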
Instead of an optional peak normalization, a limiter would make more sense. However, a limiter requires additional parameters, at least attack and release. This would clutter the CLI...
According to Parseval's theorem the RMS normalization can also be performed in the frequency domain, e.g. at each STFT step:
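A possible frame-wise sketch, assuming each STFT frame is a one-sided DFT as returned by `numpy.fft.rfft`. The function names `spectral_rms` and `normalize_frame` are hypothetical and this is not necessarily how commit 998e7a3 does it. Note that the RMS obtained this way refers to the windowed frame:

```python
import numpy as np

def spectral_rms(dft, framesize):
    # Parseval: sum(x**2) == sum(abs(X)**2) / N for the full DFT.
    # A one-sided DFT holds each bin only once, so all bins except
    # DC (and Nyquist, if N is even) have to be counted twice.
    power = np.abs(dft) ** 2
    weights = np.full(len(dft), 2.0)
    weights[0] = 1.0
    if framesize % 2 == 0:
        weights[-1] = 1.0
    energy = np.sum(weights * power) / framesize
    return float(np.sqrt(energy / framesize))

def normalize_frame(shifted, original, framesize, eps=1e-12):
    # Scale the pitch shifted frame so that its RMS matches
    # the RMS of the original frame.
    current = spectral_rms(shifted, framesize)
    if current < eps:
        return shifted.copy()
    return shifted * (spectral_rms(original, framesize) / current)
```

Because input and output frames share the same size, only the ratio of the two values matters for the gain, so the constant factors could also be dropped.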
On top of that, in the case of multi pitch shifting, the RMS normalization could be performed after each partial resampling, and on the envelope too if formant preservation is enabled. However, this could affect the multi pitch shifting result (argmax).
Spectral RMS normalization in C++ 998e7a3 seems to be quick enough, so yeah, will do it in Python too...
The spectral frame-wise RMS normalization seems to operate more smoothly than the equivalent time domain RMS normalization of the whole audio file.
The signal peaks remain within limits, so currently there is no need to invest time in a limiter implementation...
Currently the output amplitude values are clipped to [-1, +1]. Instead, detect the maximum absolute out-of-range amplitude value and normalize the whole output file accordingly.
See also: