myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.
https://research.myshell.ai/open-voice
MIT License
29.89k stars 2.94k forks

Can I use VAD for OpenVoice2? #325

Open qzyyyy opened 1 month ago

qzyyyy commented 1 month ago

I want to clone a voice from audio segments that are shorter than 1.5 seconds after splitting, and I'm wondering if I can use VAD instead of Whisper in OpenVoice2.

dosubot[bot] commented 1 month ago

@qzyyyy Hello there! I'm here to assist you with any bugs, questions, or contributions while you wait for a human maintainer. Let's tackle this together!

Yes, you can use VAD (Voice Activity Detection) in OpenVoice2. The get_se function lets you choose between VAD and Whisper for segmenting the audio: setting the vad parameter to True makes it split the audio with VAD instead of Whisper. Keep in mind, however, that OpenVoice2 filters out audio segments shorter than 1.5 seconds, so those segments will not be saved or used for voice cloning [1].
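
To make the duration filter concrete, here is a minimal sketch in plain Python. It is not OpenVoice code: the helper name keep_usable_segments, the (start, end) tuple layout, and the sample timestamps are assumptions made for illustration. Only the 1.5-second threshold comes from the answer above.

```python
# Hypothetical helper, NOT part of OpenVoice: it only illustrates how a
# minimum-duration filter drops short segments before voice cloning.
MIN_DURATION_S = 1.5  # threshold quoted in the answer above


def keep_usable_segments(segments, min_duration=MIN_DURATION_S):
    """Return the (start, end) pairs, in seconds, that last at least min_duration."""
    return [(start, end) for start, end in segments if end - start >= min_duration]


# Made-up segment boundaries, as a VAD or Whisper splitter might produce them.
segments = [(0.0, 1.2), (1.5, 4.0), (4.2, 5.0), (6.0, 9.5)]
usable = keep_usable_segments(segments)
print(usable)
```

In this sketch the 1.2 s and 0.8 s segments are dropped. If every segment of your reference audio is shorter than the threshold, nothing would survive the filter, whichever splitter produced the segments.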


qzyyyy commented 1 month ago

Will VAD affect the performance of OpenVoice2?