Andrewcpu / elevenlabs-api

🗣️🎤 elevenlabs-api is an open source Java wrapper around the ElevenLabs Voice Synthesis and Cloning Web API.
https://github.com/AndrewCPU/elevenlabs-api
GNU General Public License v3.0

🗣️🔊 elevenlabs-api

An unofficial ElevenLabs AI Voice Generation Java API

Getting Started

So you wanna make custom voices, huh? Well, you've come to the right place. This library should cover all the ElevenLabs API endpoints as of 11/15/23.

Update: It seems I jumped the gun (or happened to bump into an accidental push of the speech-to-speech API docs). The original documentation that my implementation was based on has since been removed or hidden. It may reappear as is, or it may require changes.

Installation

Maven

To add elevenlabs-api to your Maven project, use:

<dependency>
    <groupId>net.andrewcpu</groupId>
    <artifactId>elevenlabs-api</artifactId>
    <version>2.7.8</version>
</dependency>

JAR

Compiled JARs are available via the Releases tab.

Setting up your API Key

To find your ElevenLabs API key, head to the official website and open the 'Profile' tab, where you can view your xi-api-key. To set up the key, you must register it with the ElevenLabsAPI Java API like below:

ElevenLabs.setApiKey("YOUR_API_KEY_HERE");

ElevenLabs.setDefaultModel("eleven_monolingual_v1"); // Optional, defaults to: "eleven_monolingual_v1"

If your repository is public, keep your API key out of your source code, for example by storing it in an environment variable.
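One way to do this is to read the key from the environment at startup. The sketch below is a minimal example, assuming a variable named `ELEVENLABS_API_KEY`. That name and the `ApiKeyLoader` class are made up for illustration and are not something the library defines. The `ElevenLabs.setApiKey(...)` call is left as a comment so the snippet compiles without the library on the classpath.

```java
import java.util.Map;
import java.util.Optional;

public class ApiKeyLoader {
    // Hypothetical variable name, chosen for this example only.
    public static final String ENV_NAME = "ELEVENLABS_API_KEY";

    /** Returns the trimmed key from the given environment, or empty if it is unset or blank. */
    public static Optional<String> findApiKey(Map<String, String> env) {
        return Optional.ofNullable(env.get(ENV_NAME))
                .map(String::trim)
                .filter(key -> !key.isEmpty());
    }

    public static void main(String[] args) {
        Optional<String> apiKey = findApiKey(System.getenv());
        if (apiKey.isEmpty()) {
            System.err.println(ENV_NAME + " is not set");
            return;
        }
        // With the library on the classpath, register the key here:
        // ElevenLabs.setApiKey(apiKey.get());
        System.out.println("Loaded an API key of length " + apiKey.get().length());
    }
}
```

Passing the environment in as a `Map` keeps the lookup easy to test without touching real environment variables.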


Links to ElevenLabs

ElevenLabs Website: https://elevenlabs.io

ElevenLabs API Documentation: https://api.elevenlabs.io/docs


Simplified Generation Handling with Builders

v2.7.8 now includes SpeechGenerationBuilder.java

Text to Speech

Voice voice;
VoiceSettings voiceSettings;

// File output
SpeechGenerationBuilder.textToSpeech()
        .file() // output type of file (or use .streamed() for an InputStream)
        .setText("Hello world!")
        .setGeneratedAudioOutputFormat(GeneratedAudioOutputFormat.MP3_44100_128)
        .setVoiceId("voiceIdString")
        .setVoiceSettings(voiceSettings)
        .setVoice(voice) // or use a Voice object, which will pull settings / ID out of the Voice
        .setModelId("modelIdString")
        .setModel(ElevenLabsVoiceModel.ELEVEN_ENGLISH_STS_V2)
        .setLatencyOptimization(StreamLatencyOptimization.NONE)
        .build();

// Streamed output
SpeechGenerationBuilder.textToSpeech()
        .streamed()
        .setText("Hello world!")
        .setGeneratedAudioOutputFormat(GeneratedAudioOutputFormat.MP3_44100_128)
        .setVoiceId("voiceIdString")
        .setVoiceSettings(voiceSettings)
        .setVoice(voice) // or use a Voice object, which will pull settings / ID out of the Voice
        .setModelId("modelIdString")
        .setModel(ElevenLabsVoiceModel.ELEVEN_ENGLISH_STS_V2)
        .setLatencyOptimization(StreamLatencyOptimization.NONE)
        .build();

Voices

Accessing your List of Available Voices

To retrieve your list of accessible Voices, call the static Voice#getVoices() function. This will return both ElevenLabs' pregenerated Voice models and any personal Voices you have generated.

List<Voice> voices = Voice.getVoices();

Accessing the Default Voice Settings

ElevenLabs provides a default VoiceSettings configuration, which can be accessed statically from VoiceSettings#getDefaultVoiceSettings(). This is a network request.

VoiceSettings.getDefaultVoiceSettings();

Getting a Voice by ID

Retrieving voices via their voiceId can be done in a few different ways.

The first retrieves a Voice by voiceId and by default includes the settings.

Voice.get(String voiceId);

If you don't wish to retrieve the Voice model with its default settings included, you can use the Voice#get(String voiceId, boolean withSettings) function. By specifying false for withSettings, you will receive a Voice object without its default VoiceSettings. (They can be loaded later with Voice#fetchSettings(), or skipped entirely by providing a VoiceSettings object when generating TTS.)

Voice.get(String voiceId, boolean withSettings);

Deleting a voice

To delete a Voice, you can utilize the Voice#delete() function. This will delete a voice from the ElevenLabs API.

Voice voice;
voice.delete();

Retrieving an Updated VoiceSettings for a Voice

There may be times when the default VoiceSettings parameters are changed outside of this API (via the main website or another integrated system). To retrieve and apply the most up-to-date VoiceSettings object to a Voice, you can use the Voice#fetchSettings() function. (This is a network request, and it updates the object you're acting upon.)

Voice voice;
voice.fetchSettings(); // requests updated settings from ElevenLabs

Updating the VoiceSettings for a Voice

A VoiceSettings object can be modified and updated in a Voice. The Voice#updateVoiceSettings(VoiceSettings settings) function updates the default voice parameters to use when generating speech. (This is a network request, and updates the object you're acting upon.)

Voice voice;
voice.updateVoiceSettings(VoiceSettings settings);

Editing a Voice

To edit an existing Voice model, you can load the Voice into a VoiceBuilder with the VoiceBuilder#fromVoice(Voice voice) function. The fromVoice(Voice voice) function copies all the current values stored in a Voice object directly into the VoiceBuilder, meaning you do not have to redefine existing settings.

You'll also see that the Creating a Voice section utilizes this same VoiceBuilder class, except it uses the VoiceBuilder#create() function, instead of the VoiceBuilder#edit() function.

When editing a voice, you must use VoiceBuilder#edit().

Voice voice;
VoiceBuilder builder = VoiceBuilder.fromVoice(voice); // load the existing voice into the builder
builder.withLabel("key", "value"); // add a new label
builder.removeLabel("oldKey");
builder.withFile(new File("someAudioFile.mp3")); // add a new audio sample
voice = builder.edit(); // edit voice & return updated voice object

Creating a Voice

To generate a new Voice model from the API, you can use the VoiceBuilder class to assemble the required parameters for a Voice model. Some things to remember:

Generating Audio (TTS + STS)

To generate an audio file with a given Voice, you can utilize the Voice#generate(...) functions. How you accessed your Voice (with or without settings) determines whether you can rely on its implicit VoiceSettings or have to pass the VoiceSettings object to use. Unless you explicitly requested the Voice without settings, every Voice object SHOULD contain its default VoiceSettings.


Voice voice;

File file = voice.generate("Hello world!", "my_favorite_model");
...
//Available Functions:
public File generate(String text, String model);

public File generate(String text, String model, VoiceSettings settings);

public File generate(String text, String model, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, String model, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat);

public File generate(String text, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat);

public File generate(String text, VoiceSettings settings, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, VoiceSettings settings, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, GeneratedAudioOutputFormat outputFormat, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, StreamLatencyOptimization streamLatencyOptimization);

public File generate(String text, VoiceSettings settings);

public File generate(String text);

public InputStream generateStream(String text, String model);

public InputStream generateStream(String text, String model, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings);

public InputStream generateStream(String text);

public InputStream generateStream(String text, String model, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, GeneratedAudioOutputFormat generatedAudioOutputFormat, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, String model, StreamLatencyOptimization streamLatencyOptimization, VoiceSettings settings);

public InputStream generateStream(String text, VoiceSettings settings, StreamLatencyOptimization streamLatencyOptimization);

public InputStream generateStream(String text, StreamLatencyOptimization streamLatencyOptimization);



Audio Native Projects

Creating an Audio Native Project

You can create audio native projects using the AudioNative API.

CreateAudioEnabledProjectRequest request = new CreateAudioEnabledProjectRequest()
        .setName("Project name")
        .setImage("https://...com/img.png")
        .setAuthor("Andrew")
        .setSmall(true)
        .setTextColor("red")
        .setBackgroundColor("black")
        .setSessionization(3)
        .setVoiceId("aso23809")
        .setModelId("my_favorite_model")
        .setFile(new File("input.dat"))
        .setAutoConvert(true);

CreateAudioEnabledProjectModelResponse response = ElevenLabs.getAudioNativeAPI()
                                                             .createAudioEnabledProject(request);

Projects

Create a project

You can create a new project using the AddProjectRequest builder.

AddProjectRequest request = new AddProjectRequest()
        .setName("name")
        .setFromUrl("...")
        .setFromDocument(new File("file.dat"))
        .setDefaultTitleVoiceId("voiceA")
        .setDefaultParagraphVoiceId("voiceB")
        .setDefaultModelId("the_default_model_of_your_dreams")
        .setProjectOutputQuality(ProjectOutputQuality.STANDARD)
        .setTitle("Big Title")
        .setAuthor("Best Author")
        .setIsbnNumber("THE. ISBN.")
        .setAcxVolumeNormalization(true);

Project project = Project.addProject(request);

Get a Project

Get a project by its project ID.

Project project = Project.getProjectById(projectId);

Get all Projects

Get all of the projects associated with your account.

List<Project> projects = Project.getProjects();

Interacting with Projects

Project project;
// Delete a project
String deleteResult = project.delete();

// Convert a project
String conversionResult = project.convertProject();

//Get the project's snapshots.
List<ProjectSnapshot> snapshots = project.getSnapshots();

// Access your chapters from memory
List<Chapter> chapters = project.getChapters();

// Refresh your local Project's chapters from the API
chapters = project.fetchUpdatedChapters(); // This will update the existing project object and return the list of new chapters 

// Get a chapter by ID
Chapter chapter = project.getChapterById(chapterId);

// Delete a chapter
String deleteChapterResult = project.deleteChapter(chapter);

// Convert a chapter
String convertChapterResult = project.convertChapter(chapter);

// Get chapter snapshots
List<ChapterSnapshot> chapterSnapshots = project.getChapterSnapshots(chapter);

Interacting with Chapters

Project project;
Chapter chapter = project.getChapterById("chapter_id");

// Delete a chapter
chapter.deleteChapter(project.getProjectId());

// Convert a chapter
chapter.convertChapter(project.getProjectId());

// Get a chapter's snapshots
List<ChapterSnapshot> snapshots = chapter.getChapterSnapshots(project.getProjectId());


Accessing Snapshot Audio

Accessing a ProjectSnapshot audio stream:

Project project;
List<ProjectSnapshot> projectSnapshots = project.getSnapshots();
ProjectSnapshot first = projectSnapshots.get(0);
InputStream audio = first.getAudioStream();

Accessing a ChapterSnapshot audio stream:

Project project;
Chapter chapter;
List<ChapterSnapshot> chapterSnapshots = project.getChapterSnapshots(chapter);
ChapterSnapshot first = chapterSnapshots.get(0);
InputStream audio = first.getAudioStream();

Both ProjectSnapshot and ChapterSnapshot are of type Snapshot.
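Because the audio arrives as a plain `InputStream`, it can be written to disk with the standard library. The following is a minimal sketch that assumes nothing about the stream beyond the `InputStream` type shown above. The `SnapshotAudioSaver` class is hypothetical, and the `ByteArrayInputStream` in `main` is only a stand-in for `first.getAudioStream()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SnapshotAudioSaver {
    /** Copies the whole stream to target, replacing any existing file, and returns the byte count. */
    public static long save(InputStream audio, Path target) throws IOException {
        try (InputStream in = audio) { // close the stream once it has been fully read
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for first.getAudioStream(); real audio would come from the API.
        InputStream audio = new ByteArrayInputStream(new byte[] {1, 2, 3, 4});
        Path target = Files.createTempFile("snapshot-", ".audio");
        long written = save(audio, target);
        System.out.println("Wrote " + written + " bytes to " + target);
    }
}
```

Closing the stream inside `save` means the caller does not have to remember to do it after the copy.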


Samples

A Sample is used as the training data for a given Voice model.

Accessing Voice Samples

To access the sample(s) for a given Voice, you can utilize Voice#getSamples().

Voice voice;
List<Sample> samples = voice.getSamples();

Downloading a Sample

You can download a Sample via the Sample#downloadAudio(File outputFile) function. The File parameter of downloadAudio() is the location of where you want to locally download the sample.

Voice voice;
File file = voice.getSamples().get(0).downloadAudio();

Deleting a Sample

Deleting a Sample is easy. This action makes a network request.

Sample sample;
sample.delete();

History

Your ElevenLabs History is a collection of all of your previous TTS generations.

Getting Generation History

To get your ElevenLabs generation History, you can utilize History#get(). (You can also retrieve your History from a User object, with User#getHistory())

History history = History.get();

Getting all History Items

The History endpoint accepts page size parameters and a start-after-history-id parameter. We can use this to fetch all of our HistoryItems.

History history = History.get(); // the latest history object
Optional<History> hist = Optional.of(history);
List<HistoryItem> items = new ArrayList<>();
do {
    items.addAll(hist.get().getHistoryItems());
    hist = hist.get().next();
} while (hist.isPresent() && hist.get().hasMore()); // unwrap the Optional before calling hasMore()
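To show how paging with a page size and a start-after ID works in general, here is a self-contained sketch that runs against an in-memory list instead of the real endpoint. Everything in it (`HistoryPagingSketch`, `Page`, `fetchPage`, the sample IDs) is a hypothetical stand-in, not the library's API.

```java
import java.util.ArrayList;
import java.util.List;

public class HistoryPagingSketch {
    /** One page of results: the item IDs plus whether more pages follow. */
    public static final class Page {
        public final List<String> itemIds;
        public final boolean hasMore;

        Page(List<String> itemIds, boolean hasMore) {
            this.itemIds = itemIds;
            this.hasMore = hasMore;
        }
    }

    // Stand-in for the remote history, newest first. A real client would call the API instead.
    static final List<String> ALL_IDS = List.of("h5", "h4", "h3", "h2", "h1");

    /** Returns up to pageSize IDs that come after startAfterId (null means start from the top). */
    public static Page fetchPage(int pageSize, String startAfterId) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int from = startAfterId == null ? 0 : ALL_IDS.indexOf(startAfterId) + 1;
        int to = Math.min(from + pageSize, ALL_IDS.size());
        return new Page(ALL_IDS.subList(from, to), to < ALL_IDS.size());
    }

    /** Walks every page, passing the last ID seen as the cursor for the next request. */
    public static List<String> fetchAll(int pageSize) {
        List<String> ids = new ArrayList<>();
        String cursor = null;
        Page page;
        do {
            page = fetchPage(pageSize, cursor);
            ids.addAll(page.itemIds);
            if (!page.itemIds.isEmpty()) {
                cursor = page.itemIds.get(page.itemIds.size() - 1);
            }
        } while (page.hasMore);
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(2)); // prints [h5, h4, h3, h2, h1]
    }
}
```

Each page's items are collected before the `hasMore` check, so the final page is included even though no further pages follow it.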

Getting a History Item

To retrieve a HistoryItem from your History, you can use History#getHistoryItem(String itemId).

History history;
HistoryItem item = history.getHistoryItem("itemId");

Downloading History

The official ElevenLabs API provides an endpoint for downloading multiple HistoryItems as a ZIP file. To download such items, you can pass either a String[] containing the HistoryItem IDs or a List<HistoryItem>.

History history;
List<HistoryItem> historyItems;
File downloadByIds = history.downloadHistory(new String[]{"item-id1", "item-id2"});
File downloadByItems = history.downloadHistory(historyItems);
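Since the download is a ZIP archive, its contents can be inspected with `java.util.zip` from the standard library. The sketch below assumes only that the returned `File` is a regular ZIP. The `HistoryZipLister` class is hypothetical, and the entry names inside a real download are not something this README specifies.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class HistoryZipLister {
    /** Returns the names of all non-directory entries in the archive. */
    public static List<String> listEntries(File zip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zipFile = new ZipFile(zip)) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: HistoryZipLister <path-to-zip>");
            return;
        }
        // Pass the path of a ZIP returned by history.downloadHistory(...)
        for (String name : listEntries(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```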

Deleting a HistoryItem

You can utilize the HistoryItem#delete() function to delete a HistoryItem from ElevenLabs.

HistoryItem item;
item.delete();

Requesting the Voice for a HistoryItem

By default, a HistoryItem contains the voiceId of the Voice used to generate it. The getVoice() function will send a request to the ElevenLabs API to retrieve the voice object. (See also Voice.get() and Voice.getVoices())

HistoryItem item;
Voice voice = item.getVoice();

Downloading a HistoryItem Audio

A HistoryItem is a previous TTS generation. You can download the generation as an MP3 file by executing the downloadAudio() function. The return value is the tmp File location of your download.

HistoryItem item;
File file = item.downloadAudio();
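Because the returned `File` lives in a temporary location, you may want to move it somewhere permanent. A minimal sketch using `java.nio.file.Files.move` follows. The `KeepDownload` class, the directory, and the file names are made up for the example, and the temp file created in `main` stands in for the result of `item.downloadAudio()`.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class KeepDownload {
    /** Moves a temporary download into targetDir under fileName and returns the new path. */
    public static Path keep(File tempFile, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir); // does nothing if the directory already exists
        return Files.move(tempFile.toPath(), targetDir.resolve(fileName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for item.downloadAudio(); a real call would return the downloaded file.
        File tempFile = File.createTempFile("history-item-", ".mp3");
        Path targetDir = Files.createTempDirectory("downloads-");
        Path kept = keep(tempFile, targetDir, "greeting.mp3");
        System.out.println("Saved to " + kept.toAbsolutePath());
    }
}
```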



User Management

Getting your Subscription

A Subscription contains all the relevant data to manage your API usage (character usage, next billing cycle, etc.)

Subscription subscription = Subscription.get();

Getting your User

This endpoint will return the User associated with a given API key.

User user = User.get();

Exceptions

ElevenLabsValidationException

This error indicates a malformed request to the ElevenLabs API. The exception should provide the location of any syntactically incorrect parameters within the request.


Misc

As specified in the official ElevenLabs API documentation, their API is experimental and all endpoints are subject to change. Depending on how they modify their API, this library may break. Should you notice any API changes or library errors, feel free to submit an issue or a PR.

If you like what you see, give it a star! :)


Todo

I will probably rework the 2 new builders I added when I added projects support. Their usage should be more clear, though the documentation covers their use cases I believe.


Unit Testing

Unit tests have been created. These endpoints are destructive and are not included in the testing:

To run the unit tests yourself, you have to clone this repo and update the ElevenLabsTest.java file with your API key and your voice to test on.

Thanks to ElevenLabs for making an awesome tool 🥂