Closed mohit2152sharma closed 4 months ago
@osolmaz, have you had a chance to review the PR? If any changes are needed, let me know.
Thank you!
I did some minor refactors.
I noticed some issues that might be related to the library itself: cached audio files are not used and are regenerated on every run. This is not good, since the API is not free. I will try to resolve those now.
In the meantime, can you add documentation? For example, I saw this, but there is no such section on that page:
Check out https://voiceover.manim.community/en/stable/services.html#elevenlabs to learn how to create an account and get your subscription key.
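For context on the caching issue mentioned above, here is a minimal sketch of content-addressed caching for generated audio. It is not the code from this PR or from manim-voiceover; `cache_key`, `get_or_generate`, and the `synthesize` callback are hypothetical names used only for illustration. The idea is that the same text and settings always map to the same file, so the paid API is only called on a cache miss.

```python
import hashlib
import json
from pathlib import Path


def cache_key(text: str, settings: dict) -> str:
    """Derive a stable key from the input text and the voice settings."""
    payload = json.dumps({"text": text, "settings": settings}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_generate(text: str, settings: dict, cache_dir: Path, synthesize) -> Path:
    """Return the cached audio path, calling `synthesize` only on a cache miss.

    `synthesize` is a hypothetical callback that returns raw audio bytes.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    audio_path = cache_dir / f"{cache_key(text, settings)}.mp3"
    if not audio_path.exists():
        # Only this branch would cost an API call.
        audio_path.write_bytes(synthesize(text, settings))
    return audio_path
```

Changing either the text or any setting produces a different key, so stale audio is never reused for a different request.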
@mohit2152sharma I improved caching behavior and enabled transcription with Whisper by default. This is necessary to use bookmarks.
Can you check whether the ElevenLabs API returns word boundaries (timestamps for the beginning of each word in the audio)? I looked briefly and couldn't see it, but I feel like it might be hidden somewhere.
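To illustrate why word boundaries matter for bookmarks, here is a rough sketch of mapping a bookmark position in the text to a time in the audio. The boundary format (character offset, start time in seconds) and the `bookmark_time` helper are assumptions made for this example, not the format returned by any particular service or used by manim-voiceover.

```python
from bisect import bisect_left


def bookmark_time(boundaries, char_offset):
    """Return the start time of the first word at or after `char_offset`.

    `boundaries` is a hypothetical list of (character offset, start time in
    seconds) pairs, sorted by character offset.
    """
    offsets = [offset for offset, _ in boundaries]
    index = bisect_left(offsets, char_offset)
    if index == len(boundaries):
        raise ValueError("bookmark lies after the last word")
    return boundaries[index][1]


# Illustrative values for the text "Hello world again".
boundaries = [(0, 0.0), (6, 0.5), (12, 1.1)]
print(bookmark_time(boundaries, 6))
```

Without such timestamps from the service, they have to be recovered some other way, which is where transcription comes in.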
@osolmaz, I added the documentation. For the bookmark part, it wasn't working for me until I changed "voice_settings": self.voice.model_dump(exclude_none=True). (I tested it with examples/bookmark-example.py, setting ElevenlabsService; the animation did get triggered at the mentioned bookmarks.)
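For readers unfamiliar with `model_dump(exclude_none=True)`: on a pydantic model it serializes the fields to a dict and omits any field whose value is None. Assuming the voice settings object is such a model, the effect can be sketched with the standard library alone. `VoiceSettings`, its field names, and `dump_without_none` below are hypothetical stand-ins, not the classes used in the PR.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class VoiceSettings:
    """Hypothetical stand-in for the settings object; field names are illustrative."""
    stability: Optional[float] = None
    similarity_boost: Optional[float] = None


def dump_without_none(settings: VoiceSettings) -> dict:
    """Serialize to a dict, dropping unset (None) fields from the payload."""
    return {key: value for key, value in asdict(settings).items() if value is not None}


print(dump_without_none(VoiceSettings(stability=0.5)))
```

Dropping unset fields keeps explicit nulls out of the request payload.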
Regarding caching, I assumed that it was a bug, as it wasn't respecting the --disable_caching flag. But after using ElevenLabs for a couple of days, I have realised that it was good behaviour. Thanks for reverting it.
I just ran the bookmark example using the default voice, the quality out of the box is insane.
You could get to something very reasonable with a little tweaking.
Btw, --disable_caching is necessary because of a Manim bug with adding sound.
Added support for ElevenLabs, with better parameter support. You have the option to select a voice based on voice_name or voice_id, and to change the voice settings using the voice_settings parameter.

Remaining tasks:
- Add ElevenLabsService to https://github.com/ManimCommunity/manim-voiceover/blob/main/docs/source/services.rst
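As a rough illustration of selecting a voice by voice_name or voice_id, here is a small sketch. It is not the implementation from this PR: `resolve_voice`, the dict keys, and the fallback to the first available voice are all assumptions made for the example.

```python
def resolve_voice(voices, voice_name=None, voice_id=None):
    """Pick a voice from `voices`, preferring `voice_id` over `voice_name`.

    `voices` is a hypothetical list of dicts with "voice_id" and "name" keys.
    """
    if not voices:
        raise ValueError("no voices available")
    if voice_id is not None:
        matches = [v for v in voices if v["voice_id"] == voice_id]
    elif voice_name is not None:
        matches = [v for v in voices if v["name"] == voice_name]
    else:
        # Assumed default: fall back to the first available voice.
        return voices[0]
    if not matches:
        raise ValueError("no voice matches the given name or id")
    return matches[0]


# Illustrative data only; these are not real voice identifiers.
voices = [
    {"voice_id": "id-1", "name": "Alice"},
    {"voice_id": "id-2", "name": "Bob"},
]
print(resolve_voice(voices, voice_name="Bob")["voice_id"])
```

Giving the id precedence avoids ambiguity if two voices happen to share a name.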