JRegimbal closed 2 years ago
LUFS (loudness units relative to full scale) is becoming more widely used in audio production to handle loudness. We should start to consider standards for all audio that we output and orient ourselves with how audio in UI is handled. LUFS is not an absolute measure of loudness, but it aims at producing consistent loudness levels on a given output device.
We can address that after CSUN
Did some quick checks with the photo audio rendering and it seems like we're mostly around -29 to -28 LUFS with those. Line chart is -12.7 LUFS and autour is -10.1 LUFS. This makes sense, since I've actually spent time working on normalization for the photo ones.
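To put those numbers side by side, here is a minimal sketch of the static gain each rendering would need to land on a common target. It assumes the -23 LUFS target proposed elsewhere in this thread, uses -28.5 as a stand-in midpoint for the "-29 to -28" photo range, and relies on the approximation that a static gain of N dB shifts integrated loudness by about N LU. The function names are hypothetical, not part of our codebase.

```python
def gain_to_target(measured_lufs: float, target_lufs: float = -23.0) -> float:
    """Static gain in dB that would move a measured loudness to the target.

    Assumes integrated loudness shifts one-for-one with applied gain,
    which is a close approximation but ignores gating edge cases.
    """
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


# Values quoted in the comment above; the photo figure is an assumed
# midpoint of the reported -29 to -28 LUFS range.
measurements = {"photo": -28.5, "line chart": -12.7, "autour": -10.1}

if __name__ == "__main__":
    for name, lufs in measurements.items():
        gain = gain_to_target(lufs)
        print(f"{name}: {gain:+.1f} dB (x{db_to_linear(gain):.2f} amplitude)")
```

Positive gains (as for the photo renderings here) can push peaks into clipping, so a peak check or limiter would probably be needed before applying them.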
The challenge is that LUFS is usually wrapped in tools for postproduction. We also have a mix of elements: more static "standard" audio like TTS speech, and non-speech audio whose frequency content is to some degree unpredictable. So after CSUN we need to discuss a strategy for handling all the different elements, and we need to research what LUFS level is standard for auditory display, or in our specific case for screenreaders, because we will want to match them.
The new one sounds louder to me, which I find a bit annoying when getting the photo-audio renderings: the progress is kind of boomy, and then I have to "reach" a bit with my ears when the actual rendering starts.
Leaving final decision to @Cybernide but based on my quick test (only with photo-audio) I would recommend postponing this.
I say postpone - I'm swamped
Sounds good, this can go in after the SuperCollider code is adjusted to largely match loudness as well. I already have some progress towards it with the photo audio.
Fix #194. This just normalizes the MP3 files to -23 LUFS (loudness units relative to full scale). This sounds good to me personally, though that's not really a robust metric! Assigning to @jeffbl and @Cybernide for another perspective.
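For anyone who wants to reproduce this kind of normalization locally, here is one possible way to do it. This is a sketch only: it assumes `ffmpeg` is installed and uses its `loudnorm` filter, which is not necessarily the tool used in this PR. The file paths, function names, and the true-peak and loudness-range values are illustrative assumptions; only the -23 LUFS target comes from this PR.

```python
import subprocess
from pathlib import Path


def loudnorm_command(src, dst, target_lufs=-23.0, true_peak_db=-1.5,
                     loudness_range=11.0):
    """Build an ffmpeg command that normalizes `src` to `target_lufs`.

    Uses ffmpeg's loudnorm filter: I is the integrated loudness target,
    TP the true-peak ceiling, LRA the loudness range target. The TP and
    LRA defaults here are assumptions, not values from the PR.
    """
    audio_filter = (
        f"loudnorm=I={target_lufs}:TP={true_peak_db}:LRA={loudness_range}"
    )
    # -y overwrites dst if it already exists.
    return ["ffmpeg", "-y", "-i", str(src), "-af", audio_filter, str(dst)]


def normalize(src, dst, **kwargs):
    """Run the normalization; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(loudnorm_command(src, dst, **kwargs), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(loudnorm_command(Path("in.mp3"), Path("out.mp3"))))
```

Run as a single pass like this, `loudnorm` normalizes dynamically; a two-pass run that feeds the measured values back in generally lands closer to the target.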