elevenlabs / elevenlabs-python

The official Python API for ElevenLabs Text to Speech.
https://elevenlabs.io/docs/api-reference/getting-started
MIT License

[Text To Speech] No output #339

Open davidchapuis opened 1 month ago

davidchapuis commented 1 month ago

Hi folks, when I try to run the code below from the official documentation, I don't get any output, and I don't get any error messages either (with elevenlabs 1.5.0; with elevenlabs 1.6.1 I get the pydantic warning reported here: https://github.com/elevenlabs/elevenlabs-python/issues/334). Am I missing something? I expect an audio file in the project folder as output. Is that what I should expect?

from elevenlabs import VoiceSettings
from elevenlabs.client import ElevenLabs

client = ElevenLabs(
    api_key="e02e2adac715d25d280b862ea91b9bb0",
)
client.text_to_speech.convert(
    voice_id="pMsXgVXv3BLzUgSXRplE",
    optimize_streaming_latency="0",
    output_format="mp3_22050_32",
    text="It sure does, Jackie… My mama always said: In Carolina, the air's so thick you can wear it!",
    voice_settings=VoiceSettings(
        stability=0.1,
        similarity_boost=0.3,
        style=0.2,),
)

dsinghvi commented 1 month ago

@davidchapuis that call returns a byte stream, which you'll then need to write to a file on your own.
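
For anyone landing here, a minimal sketch of what "write to a file on your own" could look like. It assumes, per the comment above, that `client.text_to_speech.convert(...)` returns an iterable of `bytes` chunks; `save_audio` is a hypothetical helper name, not part of the SDK, and the commented usage is untested against any particular SDK version.

```python
from typing import Iterable


def save_audio(chunks: Iterable[bytes], path: str) -> int:
    """Write an iterable of byte chunks to `path`; return the bytes written.

    Hypothetical helper (not part of the elevenlabs package). It only
    assumes that the SDK call yields `bytes` objects.
    """
    written = 0
    with open(path, "wb") as f:  # binary mode, since this is audio data
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


# Assumed usage with the snippet from the original post:
#
#   audio = client.text_to_speech.convert(
#       voice_id="pMsXgVXv3BLzUgSXRplE",
#       output_format="mp3_22050_32",
#       text="...",
#   )
#   save_audio(audio, "output.mp3")
```

If the returned stream is a one-shot iterator, it can only be consumed once, so it should be saved before anything else reads from it.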

davidchapuis commented 1 month ago

> @davidchapuis that will return a byte stream that youll then need to write to file on your own

Got it, thanks

Actually, I just got it working using requests and this other code snippet from the official documentation:


# Import necessary libraries
import requests  # Used for making HTTP requests
import json  # Used for working with JSON data

# Define constants for the script
CHUNK_SIZE = 1024  # Size of chunks to read/write at a time
XI_API_KEY = "<xi-api-key>"  # Your API key for authentication
VOICE_ID = "<voice-id>"  # ID of the voice model to use
TEXT_TO_SPEAK = "<text>"  # Text you want to convert to speech
OUTPUT_PATH = "output.mp3"  # Path to save the output audio file

# Construct the URL for the Text-to-Speech API request
tts_url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}/stream"

# Set up headers for the API request, including the API key for authentication
headers = {
    "Accept": "application/json",
    "xi-api-key": XI_API_KEY
}

# Set up the data payload for the API request, including the text and voice settings
data = {
    "text": TEXT_TO_SPEAK,
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {
        "stability": 0.5,
        "similarity_boost": 0.8,
        "style": 0.0,
        "use_speaker_boost": True
    }
}

# Make the POST request to the TTS API with headers and data, enabling streaming response
response = requests.post(tts_url, headers=headers, json=data, stream=True)

# Check if the request was successful
if response.ok:
    # Open the output file in write-binary mode
    with open(OUTPUT_PATH, "wb") as f:
        # Read the response in chunks and write to the file
        for chunk in response.iter_content(chunk_size=CHUNK_SIZE):
            f.write(chunk)
    # Inform the user of success
    print("Audio stream saved successfully.")
else:
    # Print the error message if the request was not successful
    print(response.text)