[Question] Working with compressed sounds

T145 commented 9 months ago

Hey! I tried denoising an audio file that was extracted from a heavily compressed archive, using both this program and your wavelets-ext example, and got the following results:

Original:

Audio-Denoising: denoise

wavelets-ext:

Granted, this audio doesn't have any background noise and has a computed SNR of 100, but to me the denoising performed by this package seems to be an improvement over the original, and it doesn't have static crackles like the other option. The audio also sounds slightly better than the original. Why, then, is this solution so much worse than the other? Is there a different algorithm I should try when working with wavelets?
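
The thread does not say how the SNR of 100 was computed. As a point of reference only, here is a minimal sketch of one common definition: the ratio of signal power to residual power, in decibels, assuming a clean reference signal is available. The function name `snr_db` and the synthetic tone are hypothetical and are not taken from either package.

```python
import numpy as np


def snr_db(reference, estimate):
    """SNR in dB of `estimate` measured against a clean `reference` (assumed definition)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    noise_power = np.sum((reference - estimate) ** 2)
    if noise_power == 0:
        return float("inf")  # identical signals: no residual noise
    return 10.0 * np.log10(np.sum(reference ** 2) / noise_power)


# Synthetic check: a 440 Hz tone plus white noise at one tenth of its amplitude.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
tone = np.sin(2 * np.pi * 440 * t)
noisy = tone + 0.1 * rng.standard_normal(t.size)
print(round(snr_db(tone, noisy), 1))
```

A denoised file that scores higher against the same reference is closer to it by this measure, which may or may not match what sounds better by ear.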

ghost commented 9 months ago

wavelets-ext uses Cython to speed up computation. This repo and wavelets-ext differ in a few implementation details, and I don't remember which wavelet is being used in denoise, but when I used this for my other app, wavelets-ext worked best for me.

T145 commented 9 months ago

Which app is that? Is it on GitHub?

ap-atul commented 9 months ago

torpido

T145 commented 9 months ago

After some testing I did find some cases where the audio got messed up a bit. My primary sound collection is a bunch of mono WAV files of varying lengths. Here's my touchup:

import warnings
import numpy as np
import soundfile as sf
import pywt

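def mad(arr):
    # Assumption: mad() is not defined in this snippet; it is taken here to be
    # the median absolute deviation, a robust estimate of spread.
    arr = np.asarray(arr)
    return np.median(np.abs(arr - np.median(arr)))


# https://pywavelets.readthedocs.io/en/latest/index.html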
def denoise(in_wav: str, out_wav: str):
    info = sf.info(in_wav)  # getting info of the audio
    rate = info.samplerate

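    # Silence other warnings, but raise RuntimeWarning and UserWarning as exceptions
    # so the except clauses below can catch them. Note: this changes the
    # process-wide warning filters.
    warnings.simplefilter('ignore')
    warnings.simplefilter('error', RuntimeWarning)
    warnings.simplefilter('error', UserWarning)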

    with sf.SoundFile(out_wav, "w", samplerate=rate, channels=info.channels) as of:
        # process the file in blocks of roughly 10% of its length (at least one frame)
        for block in sf.blocks(in_wav, blocksize=max(1, info.frames // 10)):
            try:
                # Check fixes "UserWarning: Level value of 2 is too high: all coefficients will experience boundary effects."
                # All zero blocks seem safe to ignore since there's no visible spectrogram difference and the audio sounds slightly better.
                # The only concern is that longer pauses may be cut shorter.
                if not np.all(block == 0):
                    # soundfile returns blocks shaped (frames,) for mono and
                    # (frames, channels) otherwise, so time is always axis 0
                    axis = 0
                    coefficients = pywt.wavedec(block, 'db4', mode='per', level=2, axis=axis)

                    # estimate the noise level from the finest-scale detail coefficients
                    sigma = mad(coefficients[-1])

                    # VisuShrink: apply the universal threshold proposed by Donoho and Johnstone
                    thresh = sigma * np.sqrt(2 * np.log(len(block)))

                    # soft-threshold the detail coefficients; the approximation (index 0) is kept
                    coefficients[1:] = [pywt.threshold(c, value=thresh, mode='soft') for c in coefficients[1:]]

                    # reconstruct the cleaned signal and write it to the file; periodization
                    # pads odd-length blocks, so trim back to the original block length
                    clean = pywt.waverec(coefficients, 'db4', mode='per', axis=axis)
                    of.write(clean[:len(block)])
                else:
                    of.write(block)
            except (RuntimeWarning, UserWarning):
                # RuntimeWarning: caused by "invalid value encountered in divide",
                # more than likely b/c the block is mostly quiet anyway.
                # UserWarning: with the check above, ending here means it's similar.
                # Either way, write the block as-is.
                of.write(block)
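
To make the thresholding step concrete, here is a small NumPy-only sketch of the universal threshold and soft thresholding on synthetic coefficients. It is an illustration under assumptions, not the package's implementation: `universal_threshold` and `soft_threshold` are hypothetical helper names, the MAD is divided by 0.6745 (a common convention for Gaussian noise, which the snippet above does not apply), and `soft_threshold` is intended to mirror what `pywt.threshold(..., mode='soft')` computes.

```python
import numpy as np


def universal_threshold(detail, n):
    # Robust noise estimate: median absolute deviation scaled for Gaussian noise.
    sigma = np.median(np.abs(detail - np.median(detail))) / 0.6745
    return sigma * np.sqrt(2.0 * np.log(n))


def soft_threshold(x, thresh):
    # Shrink every coefficient toward zero by `thresh`; anything smaller becomes 0.
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)


rng = np.random.default_rng(1)
detail = 0.05 * rng.standard_normal(1024)   # stand-in for noisy detail coefficients
detail[100] = 2.0                           # one large coefficient carrying real signal

thresh = universal_threshold(detail, n=detail.size)
shrunk = soft_threshold(detail, thresh)

print(f"threshold: {thresh:.3f}")
print(f"non-zero coefficients kept: {np.count_nonzero(shrunk)}")
```

Because the threshold grows with the noise estimate, a block that is nearly silent yields a tiny `sigma` and almost nothing is removed, while the large coefficient survives with its magnitude reduced by `thresh`.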

ap-atul / Audio-Denoising
