v2 does not work well with cosyvoice TTS

@xipingL Hello there! I'm a bot designed to help you with bugs, questions, and contributions. I'm here to assist you while waiting for a human maintainer.

I found a similar open issue, "Converter doesn't work for Chinese?", which was last updated on August 03, 2024. The suggestions there include:

Accent and Emotion: OpenVoice only clones the tone color of the reference speaker and does not clone the accent or emotion. The base speaker TTS model controls these aspects.
Quality of Reference Audio: Ensure the reference audio is clean, without background noise, and does not contain speech from more than one person.
Base Speaker Model: For multi-lingual and cross-lingual usage, ensure you have a base speaker model in the desired language. If not, consider using the OpenAI TTS model as the base speaker.

Additionally, ensure you are using the correct base speaker model for Chinese (zh_base_speaker_tts) and that the style parameter is set to 'default' for Chinese text inputs ^[1].

The problem reported here, where audio generated with CosyVoice's Chinese female and male voices is then passed through OpenVoice's ToneColorConverter, is likely because OpenVoice clones only the tone color, not the accent or emotion. The MeloTTS base speaker models that OpenVoice relies on may also not align well with CosyVoice TTS output ^[2]^[3]^[4].
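One thing that may be worth trying, in line with the suggestion above about using another TTS as the base speaker, is to extract the source tone color embedding from the CosyVoice audio itself instead of loading a precomputed MeloTTS base speaker embedding. The sketch below is a minimal illustration under that assumption, not a confirmed fix. `convert_external_tts` and `load_openvoice` are hypothetical helper names, and the checkpoint directory is assumed to follow the OpenVoice v2 demo layout. The OpenVoice imports are kept inside `load_openvoice` so the sketch can be loaded without the package installed.

```python
from typing import Any, Callable, Tuple


def convert_external_tts(
    converter: Any,
    get_se: Callable[..., Tuple[Any, str]],
    src_audio: str,
    ref_audio: str,
    out_audio: str,
) -> str:
    """Convert src_audio toward the tone color of ref_audio.

    Hypothetical helper: both embeddings are extracted from audio files,
    so no precomputed MeloTTS base speaker embedding is involved.
    """
    # Source embedding comes from the external TTS output (e.g. CosyVoice).
    src_se, _ = get_se(src_audio, converter, vad=True)
    # Target embedding comes from the reference speaker recording.
    tgt_se, _ = get_se(ref_audio, converter, vad=True)
    converter.convert(
        audio_src_path=src_audio,
        src_se=src_se,
        tgt_se=tgt_se,
        output_path=out_audio,
    )
    return out_audio


def load_openvoice(ckpt_dir: str = "checkpoints_v2/converter", device: str = "cpu"):
    """Build the OpenVoice converter and return it with se_extractor.get_se.

    Assumes the checkpoint layout used by the OpenVoice v2 demo.
    """
    from openvoice import se_extractor
    from openvoice.api import ToneColorConverter

    converter = ToneColorConverter(f"{ckpt_dir}/config.json", device=device)
    converter.load_ckpt(f"{ckpt_dir}/checkpoint.pth")
    return converter, se_extractor.get_se
```

With OpenVoice installed, usage could look like `converter, get_se = load_openvoice()` followed by `convert_external_tts(converter, get_se, "cosyvoice_output.wav", "reference.wav", "converted.wav")`, where the three file names are placeholders. Since only the tone color is converted, the accent and emotion of the result would still come from the CosyVoice audio.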


myshell-ai / OpenVoice

v2 does not work well with cosyvoice TTS #311