myshell-ai / OpenVoice

Instant voice cloning by MIT and MyShell.
https://research.myshell.ai/open-voice
MIT License
28.95k stars 2.82k forks

How do I use Voice Style Control in V2? #286

Open hwfzllhyq opened 2 months ago

hwfzllhyq commented 2 months ago

V1 lets you pick a style. Available choices: friendly, cheerful, excited, sad, angry, terrified, shouting, whispering. But V2 no longer seems to have this?

dosubot[bot] commented 2 months ago

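@hwfzllhyq Hello! I'm Dosu, a bot that can help you solve problems, answer your questions, and become a contributor. I'll do my best to help while you wait for a human maintainer.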

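In V2, you can control the voice style by passing the desired style as a parameter when generating audio. Here are some examples that use different styles, such as 'default', 'whispering', and 'sad':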

examples = [
    [
        "今天天气真好,我们一起出去吃饭吧。",
        'default',
        "resources/demo_speaker1.mp3",
        True,
    ],
    [
        "This audio is generated by open voice with a half-performance model.",
        'whispering',
        "resources/demo_speaker2.mp3",
        True,
    ],
    [
        "He hoped there would be stew for dinner, turnips and carrots and bruised potatoes and fat mutton pieces to be ladled out in thick, peppered, flour-fattened sauce.",
        'sad',
        "resources/demo_speaker0.mp3",
        True,
    ],
]

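In these examples, the second element of each list is the voice style. You control the voice style by setting this parameter when generating audio.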
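To make the row layout concrete, here is a minimal sketch that unpacks each row into named fields and reads the style from the second position. The field names (`text`, `style`, `reference_audio`, `flag`) and the `DemoRow` type are assumptions for illustration only; the thread states only that the second element is the voice style, and the meaning of the final boolean is not given.

```python
from typing import NamedTuple


class DemoRow(NamedTuple):
    """Hypothetical named view of one example row (field names are assumed)."""
    text: str
    style: str            # second element: the voice style
    reference_audio: str
    flag: bool            # meaning not stated in the thread


# Shortened copies of the rows shown above
examples = [
    ["今天天气真好,我们一起出去吃饭吧。", "default", "resources/demo_speaker1.mp3", True],
    ["This audio is generated by open voice.", "whispering", "resources/demo_speaker2.mp3", True],
    ["He hoped there would be stew for dinner.", "sad", "resources/demo_speaker0.mp3", True],
]

rows = [DemoRow(*row) for row in examples]
styles = [row.style for row in rows]
print(styles)
```

Naming the fields keeps positional mistakes (for example, swapping the style and the reference path) easy to spot.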

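In addition, you can use the BaseSpeakerTTS class and control the voice style by setting the speaker parameter of its tts method. The available choices are friendly, cheerful, excited, sad, angry, terrified, shouting, and whispering. Here is an example: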

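import os
import torch
from openvoice import se_extractor
from openvoice.api import BaseSpeakerTTS, ToneColorConverter

# Paths and device are assumptions based on the V1 demo layout; adjust to your setup
ckpt_base = 'checkpoints/base_speakers/EN'
ckpt_converter = 'checkpoints/converter'
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
output_dir = 'outputs'
os.makedirs(output_dir, exist_ok=True)

# Initialize the TTS model and the tone color converter
base_speaker_tts = BaseSpeakerTTS(f'{ckpt_base}/config.json', device=device)
base_speaker_tts.load_ckpt(f'{ckpt_base}/checkpoint.pth')
tone_color_converter = ToneColorConverter(f'{ckpt_converter}/config.json', device=device)
tone_color_converter.load_ckpt(f'{ckpt_converter}/checkpoint.pth')

# Load the source tone color embedding of the styled base speaker
source_se = torch.load(f'{ckpt_base}/en_style_se.pth').to(device)

# Extract the target tone color embedding from a reference recording
# (the reference path is an assumption; point it at your own audio)
reference_speaker = 'resources/example_reference.mp3'
target_se, audio_name = se_extractor.get_se(
    reference_speaker, tone_color_converter, target_dir='processed', vad=True)

# Generate audio in the desired style
save_path = f'{output_dir}/output_whispering.wav'
text = "This audio is generated by OpenVoice."
src_path = f'{output_dir}/tmp.wav'
base_speaker_tts.tts(text, src_path, speaker='whispering', language='English', speed=0.9)

# Convert the tone color to match the reference speaker
encode_message = "@MyShell"
tone_color_converter.convert(
    audio_src_path=src_path,
    src_se=source_se,
    tgt_se=target_se,
    output_path=save_path,
    message=encode_message)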

By adjusting the speaker parameter, you can control the voice style to match the output you want [1][2][3].


wanglongpeng1 commented 1 month ago

File "/datahdd/wyz/wlp/OpenVoice/part1.py", line 75, in <module>
    base_speaker_tts.tts(text, src_path, speaker='whispering', language='Chinese', speed=1.0)
File "/datahdd/wyz/wlp/OpenVoice/openvoice/api.py", line 85, in tts
    speaker_id = self.hps.speakers[speaker]
File "/datahdd/wyz/wlp/OpenVoice/openvoice/utils.py", line 34, in __getitem__
    return getattr(self, key)
AttributeError: 'HParams' object has no attribute 'whispering'

I got this error when I tried to control the style through the parameter.
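The traceback shows the style name being looked up in `self.hps.speakers` through `getattr`, so the error means the loaded config has no entry named 'whispering'. Below is a minimal sketch of that lookup and of a guard that lists the available styles before calling `tts`. `MiniHParams` is a simplified stand-in written for this illustration, not the real `HParams` class, and both `resolve_speaker` and the single-entry speaker table are hypothetical; the actual contents depend on the `config.json` that was loaded.

```python
class MiniHParams:
    """Simplified stand-in for the HParams class seen in the traceback."""

    def __init__(self, **kwargs):
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __getitem__(self, key):
        # Same lookup pattern as the traceback: item access delegates to getattr
        return getattr(self, key)

    def keys(self):
        return self.__dict__.keys()


def resolve_speaker(speakers, style):
    """Hypothetical guard: return the speaker id or fail with the available styles."""
    available = sorted(speakers.keys())
    if style not in available:
        raise ValueError(
            f"style {style!r} is not in the loaded config; available: {available}")
    return speakers[style]


# Hypothetical speaker table for a checkpoint that defines only one style
speakers = MiniHParams(default=0)

try:
    speakers["whispering"]
except AttributeError as err:
    lookup_error = str(err)
    print(lookup_error)

print(resolve_speaker(speakers, "default"))
```

Against a real model, the same check could be run on the `hps.speakers` object from the traceback before passing `speaker=` to `tts`, assuming it exposes `keys()` as the stand-in does.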