rany2 / edge-tts

Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
https://pypi.org/project/edge-tts/
GNU General Public License v3.0
4.24k stars 447 forks

UnboundLocalError: cannot access local variable 'audio_segment' where it is not associated with a value #162

Closed: rajtilakjee closed 7 months ago

rajtilakjee commented 7 months ago

I have a function that used the ElevenLabs API and worked fine. However, it incurred a lot of cost even during prototyping, so I thought it would be best to switch to Edge-TTS. I have completed the conversion but am getting the error given in the title.

Here's the ElevenLabs API code:

from elevenlabs import generate, set_api_key, Voice, VoiceSettings
from pydub import AudioSegment
import io
import os
import hashlib
from utils.os_stuff import get_env_var_or_fail
import logging

set_api_key(get_env_var_or_fail('ELEVEN_LABS_API_KEY'))

HOST_VOICE = Voice(
    voice_id="21m00Tcm4TlvDq8ikWAM",
    name="Rachel",
    category="premade",
    settings=VoiceSettings(stability=0.35, similarity_boost=0.9),
)

ADS_VOICE = Voice(
    voice_id="TxGEqnHWrfWFTfGW9XjX",
    name="Josh",
    category="premade",
    settings=VoiceSettings(stability=0.35, similarity_boost=0.9),
)

def load_audio_bytes(audio_bytes):
    audio_file = io.BytesIO(audio_bytes)
    audio_segment = AudioSegment.from_file(audio_file, format='mp3')
    return audio_segment

def convert_text_to_mp3(text, voice):
    # Generate the cache key by MD5 hashing the text
    cache_key = hashlib.md5(text.encode()).hexdigest()

    # Check if the file already exists in cache
    cache_dir = ".eleven_labs_cache"
    cache_file = os.path.join(cache_dir, f"{cache_key}.mp3")

    if not os.path.exists(cache_dir):
        os.makedirs(cache_dir)

    if os.path.exists(cache_file):
        # If it does exist, load and return as an AudioSegment
        audio_segment = AudioSegment.from_mp3(cache_file)
    else:
        # If it does not exist, call the API, create, save, and return as an AudioSegment
        char_count = len(text)
        logging.info(f'Calling eleven labs for {char_count} chars...')
        section_1_voice_over = load_audio_bytes(generate(
            text=text,
            voice=voice
        ))
        section_1_voice_over.export(cache_file, format='mp3')
        audio_segment = section_1_voice_over

    return audio_segment

And here's the Edge-TTS code:

import asyncio
import edge_tts
from pydub import AudioSegment

VOICE = "en-GB-SoniaNeural"

def convert_text_to_mp3(text):
    loop = asyncio.get_event_loop_policy().get_event_loop()
    try:
        audio_segment = loop.run_until_complete(edge_tts.Communicate(text, VOICE))
    finally:
        loop.close()
        return audio_segment

Can someone please help me find a solution to this error?
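
A likely explanation, offered as a sketch rather than a confirmed diagnosis: `edge_tts.Communicate(text, VOICE)` builds a Communicate object, not a coroutine, so `loop.run_until_complete(...)` raises a TypeError before `audio_segment` is ever assigned. The `return audio_segment` inside `finally` still runs, and that is where the UnboundLocalError from the title surfaces in place of the original TypeError. The example below assumes the audio is gathered from `Communicate.stream()`; `collect_audio` and `convert_text_to_mp3_bytes` are hypothetical helper names, and `asyncio.run` replaces the manual event-loop handling.

```python
import asyncio
import io


async def collect_audio(communicate):
    """Collect MP3 bytes from an object with edge-tts's stream() interface."""
    buffer = io.BytesIO()
    # stream() yields dicts; only chunks of type "audio" carry MP3 data.
    async for chunk in communicate.stream():
        if chunk["type"] == "audio":
            buffer.write(chunk["data"])
    return buffer.getvalue()


def convert_text_to_mp3_bytes(text, voice="en-GB-SoniaNeural"):
    """Hypothetical synchronous wrapper: returns the spoken text as MP3 bytes."""
    # Imported here so collect_audio can be used and tested without edge-tts installed.
    import edge_tts

    communicate = edge_tts.Communicate(text, voice)
    # asyncio.run creates a fresh event loop and closes it afterwards, so no
    # try/finally is needed and any error propagates instead of being masked.
    return asyncio.run(collect_audio(communicate))
```

The returned bytes can be passed to the `load_audio_bytes` helper from the ElevenLabs version to get a pydub AudioSegment. Note that `asyncio.run` cannot be called from code that is already running inside an event loop; in that case, awaiting `collect_audio(...)` directly is the alternative.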