ManimCommunity / manim-voiceover

Manim plugin for all things voiceover
https://voiceover.manim.community/en/stable
MIT License

Add Elevenlabs service #83

Closed mohit2152sharma closed 4 months ago

mohit2152sharma commented 4 months ago

Added support for Elevenlabs, with better parameter support. You can select a voice by voice_name or voice_id, and change the voice's settings with the voice_settings parameter.
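To illustrate the selection described above, here is a minimal sketch of how a voice could be resolved from either voice_name or voice_id. The `Voice` record and `select_voice` helper are hypothetical and are not the code in this PR; giving voice_id precedence over voice_name and falling back to the first available voice are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Voice:
    # Hypothetical stand-in for a voice record returned by the TTS provider.
    voice_id: str
    name: str


def select_voice(
    voices: Sequence[Voice],
    voice_name: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> Voice:
    """Pick a voice by id or by name; fall back to the first available voice."""
    if not voices:
        raise ValueError("no voices available")
    if voice_id is not None:  # assumption: an explicit id wins over a name
        for voice in voices:
            if voice.voice_id == voice_id:
                return voice
        raise ValueError(f"unknown voice_id: {voice_id!r}")
    if voice_name is not None:
        for voice in voices:
            if voice.name == voice_name:
                return voice
        raise ValueError(f"unknown voice_name: {voice_name!r}")
    return voices[0]  # assumption: default to the first listed voice
```

Raising on an unknown name or id, rather than silently using a default, keeps a typo from producing paid audio in the wrong voice.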

Remaining tasks:

mohit2152sharma commented 4 months ago

@osolmaz, have you had a chance to review the PR? If any changes are needed, let me know.

osolmaz commented 4 months ago

Thank you!

I did some minor refactors.

I noticed some issues that might be related to the library itself: for example, cached audio files are not used and are regenerated on every run. This is not good, since the API is not free. I will try to resolve those now.

In the meantime, can you add documentation? For example, I saw this line, but there is no such section on that page:

Check out https://voiceover.manim.community/en/stable/services.html#elevenlabs to learn how to create an account and get your subscription key.

osolmaz commented 4 months ago

@mohit2152sharma I improved caching behavior and enabled transcription with Whisper by default. This is necessary to use bookmarks.

Can you check whether the Elevenlabs API returns word boundaries (timestamps for the beginning of each word in the audio)? I looked briefly and couldn't find them, but I feel like they might be hidden somewhere.
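To show why word boundaries matter for bookmarks, here is a rough sketch of how they could be turned into bookmark times. It assumes bookmark tags of the form `<bookmark mark='A'/>` in the voiceover text and word boundaries given as (character offset, start time in seconds) pairs; both helper functions are hypothetical and are not the library's code.

```python
import re
from bisect import bisect_left
from typing import Dict, List, Tuple

# Assumed tag format: <bookmark mark='A'/>
BOOKMARK_RE = re.compile(r"<bookmark\s+mark\s*=\s*['\"](\w+)['\"]\s*/>")


def split_bookmarks(text: str) -> Tuple[str, Dict[str, int]]:
    """Strip bookmark tags; return the clean text and each mark's offset in it."""
    parts: List[str] = []
    offsets: Dict[str, int] = {}
    pos = 0        # read position in the tagged text
    clean_len = 0  # length of the clean text built so far
    for match in BOOKMARK_RE.finditer(text):
        chunk = text[pos:match.start()]
        parts.append(chunk)
        clean_len += len(chunk)
        offsets[match.group(1)] = clean_len
        pos = match.end()
    parts.append(text[pos:])
    return "".join(parts), offsets


def bookmark_times(
    offsets: Dict[str, int],
    boundaries: List[Tuple[int, float]],
) -> Dict[str, float]:
    """Map each mark to the start time of the first word at or after its offset.

    `boundaries` must be sorted by character offset. A mark placed after the
    last word has no following word and is left out of the result.
    """
    starts = [offset for offset, _ in boundaries]
    times: Dict[str, float] = {}
    for mark, offset in offsets.items():
        i = bisect_left(starts, offset)
        if i < len(boundaries):
            times[mark] = boundaries[i][1]
    return times
```

If the API does not return word boundaries, the same (offset, time) pairs would have to come from a transcription step instead, which is why Whisper transcription is needed for bookmarks.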

mohit2152sharma commented 4 months ago

@osolmaz, I added the documentation. For the bookmark part, it wasn't working for me until I changed "voice_settings": self.voice.model_dump(exclude_none=True). (I tested it with examples/bookmark-example.py, setting ElevenlabsService; the animation did get triggered at the mentioned bookmarks.)
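For readers unfamiliar with exclude_none: in Pydantic v2, model_dump(exclude_none=True) omits every field whose value is None from the resulting dict. The standard-library sketch below mimics only that effect with a dataclass. The `VoiceSettings` fields are illustrative rather than taken from the PR, and the sketch does not claim to explain why the change fixed bookmarks.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VoiceSettings:
    # Illustrative fields only; not taken from the PR.
    stability: Optional[float] = None
    similarity_boost: Optional[float] = None
    style: Optional[float] = None


def dump_exclude_none(settings: VoiceSettings) -> dict:
    """Return only the fields that were explicitly set (value is not None)."""
    return {key: value for key, value in asdict(settings).items() if value is not None}
```

With this, a settings object that only sets stability produces a payload with a single key, instead of sending the unset fields as nulls.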

Regarding caching, I assumed that it was a bug, as it wasn't respecting the --disable_caching flag. But after using Elevenlabs for a couple of days, I have realised that it was good behaviour. Thanks for reverting it.

osolmaz commented 4 months ago

I just ran the bookmark example using the default voice. The quality out of the box is insane.

You could get to something very reasonable with a little tweaking.

https://github.com/ManimCommunity/manim-voiceover/assets/2453968/9422daed-a3c6-4e12-a0c1-2b3dd711704f

osolmaz commented 4 months ago

Btw, --disable_caching is necessary because of a Manim bug with adding sound.