Avoid output amplitude clipping

jurihock / stftPitchShift

STFT based real-time pitch and timbre shifting in C++ and Python

MIT License


Avoid output amplitude clipping #10

Status: Closed (jurihock closed 2 years ago)

jurihock commented 2 years ago

Currently, the output amplitude values are clipped to [-1, +1]. Instead, detect the maximum absolute out-of-range amplitude value and normalize the whole output file by it.
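As a rough illustration of the proposal above, here is a minimal sketch of peak normalization in Python with NumPy. The function name `normalize_peak` is hypothetical and not taken from the project; the sketch assumes the output is a floating point array with a nominal range of [-1, +1].

```python
import numpy as np

def normalize_peak(samples):
    """Scale the whole signal down, but only if its peak leaves [-1, +1]."""
    samples = np.asarray(samples, dtype=float)
    # Largest absolute amplitude in the whole signal (0 for an empty signal).
    peak = float(np.max(np.abs(samples))) if samples.size else 0.0
    if peak > 1.0:
        # Out-of-range peak found: divide everything by it,
        # so the loudest sample lands exactly on +/-1.
        return samples / peak
    # Signal is already within range: leave it untouched.
    return samples
```

Unlike clipping, this keeps the waveform shape intact, at the cost of lowering the level of the entire file whenever a single sample is out of range.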

See also:

jurihock commented 2 years ago

An optional RMS normalization against the original signal could be generally useful to equalize the total loudness.
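A minimal sketch of what such a time domain RMS normalization could look like, assuming both signals are available as NumPy arrays. The names `rms` and `normalize_rms` as well as the `eps` guard are assumptions made for this example, not the project's API.

```python
import numpy as np

def rms(x):
    """Root mean square of a signal (0 for an empty signal)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(np.square(x)))) if x.size else 0.0

def normalize_rms(output, reference, eps=1e-12):
    """Scale `output` so that its RMS matches the RMS of `reference`."""
    output = np.asarray(output, dtype=float)
    current = rms(output)
    if current < eps:
        # Practically silent output: there is nothing meaningful to scale.
        return output
    return output * (rms(reference) / current)
```

Note that matching the RMS does not by itself guarantee that the peaks stay within [-1, +1].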

Instead of an optional peak normalization, a limiter would make more sense. However, a limiter requires additional parameters, at least attack and release. This would clutter the CLI...
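To show why attack and release are needed, here is a deliberately simple limiter sketch. It is an assumption-laden illustration, not something from the project: the function `limit`, its defaults, and the one-pole envelope follower are all made up for this example. There is no look-ahead, so short overshoots above the threshold are still possible during the attack phase.

```python
import numpy as np

def limit(samples, samplerate, threshold=1.0, attack=0.001, release=0.1):
    """Very basic peak limiter; attack and release are time constants in seconds."""
    samples = np.asarray(samples, dtype=float)
    # One-pole smoothing coefficients derived from the time constants.
    a = np.exp(-1.0 / (attack * samplerate))
    r = np.exp(-1.0 / (release * samplerate))
    envelope = 0.0
    out = np.empty_like(samples, dtype=float)
    for i, x in enumerate(samples):
        level = abs(x)
        # Follow rising levels quickly (attack) and falling levels slowly (release).
        coeff = a if level > envelope else r
        envelope = coeff * envelope + (1.0 - coeff) * level
        # Reduce the gain only while the envelope exceeds the threshold.
        gain = threshold / envelope if envelope > threshold else 1.0
        out[i] = x * gain
    return out
```

Even this bare version needs three tunable values (threshold, attack, release), which illustrates the concern about cluttering the command line options.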

jurihock commented 2 years ago

According to Parseval's theorem, the RMS normalization can also be performed in the frequency domain, e.g. at each STFT step:

Pro: continuous real-time normalization would be possible
Contra: additional computational effort due to STFT oversampling
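The frequency domain variant can be sketched as follows. This is only an illustration under stated assumptions, not the code from the project: the names `spectral_energy` and `normalize_frame` are hypothetical, the spectra are assumed to come from `np.fft.rfft` of real (already windowed) frames of length `framesize`, and the reference is assumed to be the unmodified spectrum of the same frame.

```python
import numpy as np

def spectral_energy(dft, framesize):
    """Energy sum(x**2) of a real frame, computed from its np.fft.rfft bins (Parseval)."""
    power = np.abs(dft) ** 2
    # rfft stores only half of the spectrum, so every bin counts twice...
    weights = np.full(power.shape, 2.0)
    # ...except DC, and the Nyquist bin, which exists only for even frame sizes.
    weights[0] = 1.0
    if framesize % 2 == 0:
        weights[-1] = 1.0
    return float(np.sum(weights * power) / framesize)

def normalize_frame(output_dft, reference_dft, framesize, eps=1e-12):
    """Scale one output spectrum so that its frame RMS matches the reference frame."""
    current = spectral_energy(output_dft, framesize)
    if current < eps:
        # Practically silent frame: leave it as it is.
        return output_dft
    target = spectral_energy(reference_dft, framesize)
    # RMS is the square root of the mean energy, so the gain is sqrt(target / current).
    return output_dft * np.sqrt(target / current)
```

Since both frames have the same length, the division by `framesize` cancels out in the gain; it is kept here only so that `spectral_energy` equals the time domain energy of the frame.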

In addition, in the case of multi pitch shifting, the RMS normalization could be performed after each partial resampling, and on the envelope too if formant preservation is enabled. However, this could affect the multi pitch shifting result (argmax).

jurihock commented 2 years ago

Spectral RMS normalization in C++ 998e7a3 seems to be quick enough, so yeah, will do it in Python too...

jurihock commented 2 years ago

The spectral frame-wise RMS normalization seems to operate more smoothly than the equivalent time domain RMS normalization of the whole audio file.

The signal peaks remain within limits, so currently there is no need to invest time in a limiter implementation...