Possible way to normalize the audio of the final output?

deama commented 1 year ago

I'm having trouble normalizing the audio of the cloned voices: each voice comes out at a different volume, and I need a way to normalize them all so I don't have to adjust the volume of each one individually, if possible.

lugia19 commented 1 year ago

This isn't something I'd be implementing in the module itself given that it's just meant as a way to interface with the API, but you can achieve it with pydub:

from pydub import AudioSegment

target_dBFS = -20.0

def normalize_audio(input_file, output_file):
    audio = AudioSegment.from_file(input_file)

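    # Normalize the audio by applying the dBFS difference.
    # Silent audio has a dBFS of -inf, so leave it unchanged rather than apply infinite gain.
    gain = target_dBFS - audio.dBFS if audio.dBFS != float("-inf") else 0.0
    normalized_audio = audio.apply_gain(gain)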

    # Export the normalized audio file
    normalized_audio.export(output_file, format="mp3")

There is no way to control the volume of the voices through the API; the only option is to post-process the generated audio files.
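
For anyone curious what that gain adjustment does to the samples, here is a rough pure-Python sketch of the same arithmetic. It is not pydub's actual code: the names `dbfs`, `normalize_samples`, and `FULL_SCALE` are made up for illustration, and it assumes 16-bit signed PCM samples in a plain list. The idea is to measure the RMS level relative to full scale, then multiply every sample by the factor that moves that level to the target.

```python
import math

# Assumption: 16-bit signed PCM, so full scale is 2**15.
FULL_SCALE = 32768


def dbfs(samples):
    """Return the RMS level of the samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")  # pure silence has no finite level
    return 20 * math.log10(rms / FULL_SCALE)


def normalize_samples(samples, target_dbfs=-20.0):
    """Scale the samples so their RMS level lands on target_dbfs."""
    level = dbfs(samples)
    if level == float("-inf"):
        return list(samples)  # nothing to scale in silent or empty audio
    # Convert the dB difference into a linear amplitude factor.
    factor = 10 ** ((target_dbfs - level) / 20)
    # Round and clamp to the 16-bit range so loud peaks cannot overflow.
    return [max(-FULL_SCALE, min(FULL_SCALE - 1, round(s * factor)))
            for s in samples]


# Two hypothetical "voices" at different volumes end up at the same level.
quiet_voice = [1000, -1000] * 100
loud_voice = [8000, -8000] * 100
print(dbfs(normalize_samples(quiet_voice)))
print(dbfs(normalize_samples(loud_voice)))
```

Because this matches RMS level rather than peak level, a clip with very sharp peaks can still clip if the target is set too high, which is one reason to keep the target as low as something like -20 dBFS.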

deama commented 1 year ago

Oh ok, thanks, I'll try it out.

lugia19 / elevenlabslib

Possible way to normalize the audio of the final output? #15