johnpaulbin closed 1 year ago
To use:
from IPython.display import Audio, display  # Needed for notebook playback below
from uberduck_ml_dev.utils.denoiser import Denoiser
# Example tacotron2 output
output = taco.inference(text_padded, input_lengths, speakerembedding, embedding)
# Replace HIFIGANGENERATOR with the variable to your vocoder generator
denoiser = Denoiser(HIFIGANGENERATOR, mode="normal") # Experiment with modes "normal" and "zeros"
# Run the vocoder's forward pass on the first item in the batch
audio = HIFIGANGENERATOR.vocoder.forward(output["mel_outputs_postnet"][:1])
audio = audio.squeeze()
audio = audio * 32768.0 # Scale to the 16-bit PCM range (assumes vocoder output in [-1, 1])
# Denoise
audio_denoised = denoiser(audio.view(1, -1), strength=15)[:, 0] # Change strength if needed
audio_denoised = audio_denoised.cpu().detach().numpy().reshape(-1) # Convert the tensor to a 1-D NumPy array
# Play audio
display(Audio(audio_denoised, rate=22050))
good to go?
Removes the vocoder's bias and produces cleaner audio with fewer artifacts. Supports both Avocodo and HiFi-GAN vocoders.
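
For anyone wondering what "removes bias" means here, a rough sketch of the idea follows. It assumes the denoiser works like the WaveGlow-style one (spectral subtraction of the vocoder's response to an empty mel input), which is a guess based on the "normal" and "zeros" modes, not something taken from the uberduck code. The names `frame_spectra` and `remove_bias` are made up for illustration, and the framing uses non-overlapping rectangular frames instead of a real windowed STFT to keep it short.

```python
import numpy as np

def frame_spectra(signal, n_fft):
    """Split a 1-D signal into non-overlapping frames and return each frame's rFFT."""
    n_frames = len(signal) // n_fft
    frames = signal[: n_frames * n_fft].reshape(n_frames, n_fft)
    return np.fft.rfft(frames, axis=1)

def remove_bias(audio, bias_audio, strength=1.0, n_fft=1024):
    """Hypothetical helper: subtract a scaled bias magnitude spectrum from the audio."""
    # Per-bin bias magnitude, taken from the first frame of the bias signal
    # (assumed to be what the vocoder emits for an all-zero mel input).
    bias_mag = np.abs(frame_spectra(bias_audio, n_fft))[0]

    spec = frame_spectra(audio, n_fft)
    mag, phase = np.abs(spec), np.angle(spec)

    # Subtract the bias and clamp at zero so magnitudes never go negative.
    denoised_mag = np.clip(mag - strength * bias_mag, 0.0, None)

    # Recombine with the original phase and go back to the time domain.
    denoised = denoised_mag * np.exp(1j * phase)
    return np.fft.irfft(denoised, n=n_fft, axis=1).reshape(-1)

# Toy data: a "speech" tone plus a quieter hum standing in for the vocoder bias.
n = np.arange(4 * 1024)
tone = np.sin(2 * np.pi * 50 * n / 1024)
hum = 0.05 * np.sin(2 * np.pi * 10 * n / 1024)
audio = tone + hum

cleaned = remove_bias(audio, hum, strength=1.0)
```

In this toy case the hum lives in a single frequency bin, so subtracting its magnitude leaves only the tone. With a real vocoder the bias is spread over many bins, which is why `strength` needs tuning by ear.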