Open phoboslab opened 1 year ago
Maybe something that could be optional or configurable? Some of the audio files I’m thinking of using QOA for have a lot of noise to begin with that I would like to keep more or less intact, almost treating it like data rather than audible signal. But perhaps I misunderstand what this shaping does.
I can’t hear the difference in your example page with my cheapo headphones but in Audacity comparing files with Invert it becomes more noticeable. Will need to do some more tests.
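The Invert comparison in Audacity is essentially a null test: subtract the original from the decoded file and look at what is left. A minimal sketch of the same measurement, assuming both files are already decoded to equal-rate lists of 16-bit sample values (the helper names `residual` and `rms_db` are made up for illustration, not part of QOA or Audacity):

```python
import math

def residual(original, decoded):
    """Sample-wise difference (decoded - original), i.e. what Invert + mix leaves over."""
    n = min(len(original), len(decoded))
    return [d - o for o, d in zip(original[:n], decoded[:n])]

def rms_db(samples, full_scale=32768.0):
    """RMS level in dB relative to full scale (assumes 16-bit sample values)."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms / full_scale) if rms > 0 else float("-inf")
```

Note that a single level number only says how much error there is, not where it sits in the spectrum, so two encodes with the same residual level can still sound different.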
This should only affect the quantization noise that is added by the encoder; it won't remove any noise that is present in the source. But yes, making it optional is certainly the right idea!
audio-formats has TPDF dithering (courtesy of the MIT-licensed Airwindows, tuned and modified by me to fit WAV) in its QOA encoder now: https://github.com/AuburnSounds/audio-formats/blob/master/source/audioformats/qoa.d#L724 It was tuned for WAV. IMO it's more important to get the dithering level right than to have the best dithering algorithm. I will fine-tune the dither level for QOA encoding. (EDIT: errr, disabled for now, it sounds worse than without dithering)
I've added some very simple noise shaping to the encoder (to the noise_shaping branch). This does not change the decoder or the data format. The noise shaping should help to move quantization noise into the higher, less audible frequencies.
Here's a comparison page with all samples with and without noise shaping: https://phoboslab.org/files/qoa-samples/noiseshaping.html
The difference for some samples is night and day. Listen to 32_triangles-triangle_roll_stereo at 00:43 or 35_glockenspiel_arpegio_melodious_phrase_stereo at 00:39.
However, this noise shaping has an adverse effect on some other samples. I tried to contain it by only applying most of the shaping when our prediction is "bad" anyway. But still, I feel that some samples sound more "crunchy" now. Listen to 21_trumpet_arpegio_melodious_phrase_stereo right at the beginning, for instance. Vocals in julien_baker_sprained_ankle and others also seem to have lost a bit of "smoothness".
Maybe someone with better ears (and/or equipment :D) can take a listen? What's the usual strategy here, to adaptively correct for quantization noise?
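For anyone following along, the basic idea of pushing quantization noise toward higher frequencies can be sketched as textbook first-order error feedback. This is not the code in the noise_shaping branch: it assumes a plain uniform quantizer with a fixed step, leaves out QOA's prediction and scalefactor selection entirely, and the function names are invented for illustration.

```python
import math

def quantize(x, step):
    """Plain uniform quantizer: round x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)

def quantize_shaped(samples, step):
    """First-order error feedback (illustrative, not QOA's actual scheme).

    The error made on the previous sample is subtracted from the current
    sample before quantizing, so the output noise becomes e[n] - e[n-1].
    That difference is a high-pass filtered version of the raw error.
    """
    out = []
    err = 0.0
    for s in samples:
        target = s - err             # compensate for the previous error
        q = quantize(target, step)
        err = q - target             # raw error made on this sample
        out.append(q)
    return out
```

With a constant input of 0.3 and a step of 1.0, the plain quantizer outputs all zeros, so its error is pure DC. The shaped version outputs a mix of 0s and 1s whose average stays close to 0.3, which means the error has been moved away from low frequencies at the cost of more high-frequency content.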