-
I just tried downloading the speaker embedding model from the link in the README: https://drive.google.com/file/d/1ORAeb4DlS_65WDkQN6LHx5dPyCM5PAVV/view?usp=sharing but the file does not seem to be av…
-
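When running inference with the Instruct model, I want the output audio to keep a fixed voice, but in practice the voice drifts, and sometimes the audio mixes male and female voices.
1. The Instruct model runs inference without an embedding, so how do I specify the voice?
2. If the voice is described through prompt_text, how should I describe a custom voice?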
-
The pronunciation of Mandarin Chinese and Korean is not accurate, and the British accent is quite strong. How can the pronunciation of these two languages be improved?
-
type IsSpeaking
bool
type WhoIsSpeaking
uuid
known speakers
[chat on diarization embeddings](https://chatgpt.com/share/6704175b-9184-800f-bc01-2076a8af85bf)
[chat on running models locall…
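
The type notes above could be written down roughly as follows. This is only a sketch: `IsSpeaking` and `WhoIsSpeaking` follow the notes, while the `KnownSpeakers` class, its `names` field, and its `add`/`lookup` methods are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional
from uuid import UUID, uuid4

# Aliases mirroring the notes: IsSpeaking is a bool, WhoIsSpeaking is a uuid.
IsSpeaking = bool
WhoIsSpeaking = UUID


@dataclass
class KnownSpeakers:
    """Hypothetical registry of known speakers, keyed by UUID."""

    names: Dict[UUID, str] = field(default_factory=dict)

    def add(self, name: str) -> WhoIsSpeaking:
        """Register a speaker under a fresh random UUID and return it."""
        speaker_id = uuid4()
        self.names[speaker_id] = name
        return speaker_id

    def lookup(self, who: WhoIsSpeaking) -> Optional[str]:
        """Return the speaker's name, or None if the UUID is unknown."""
        return self.names.get(who)
```

A diarization step would then report an `IsSpeaking` flag per frame and, when it is true, a `WhoIsSpeaking` value that can be resolved through `lookup`.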
-
Hi, thank you for the nice code. I wonder why the ignore layers in hparams.py contain speaker embedding.weight. Don't we restore the speaker embedding?
-
Hi @keithito!
In the multispeaker branch, I wonder why the speaker_ids embedding size is set to 377? Is this the result of a particular tuning or is it related to the number of speakers in the corp…
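
For context on the question, a speaker embedding table needs at least one row per distinct speaker id, so its minimum size can be checked against the corpus. The sketch below does not explain why 377 was chosen in that branch; the helper names `build_speaker_index` and `min_embedding_rows` and the sample labels are hypothetical.

```python
from typing import Dict, Iterable


def build_speaker_index(speaker_labels: Iterable[str]) -> Dict[str, int]:
    """Map each distinct speaker label to a contiguous id, in first-seen order."""
    index: Dict[str, int] = {}
    for label in speaker_labels:
        if label not in index:
            index[label] = len(index)
    return index


def min_embedding_rows(speaker_ids: Iterable[int]) -> int:
    """Smallest embedding table size that can hold every id (max id + 1)."""
    ids = list(speaker_ids)
    if not ids:
        return 0
    if min(ids) < 0:
        raise ValueError("speaker ids must be non-negative")
    return max(ids) + 1


# Hypothetical per-utterance speaker labels from a corpus.
labels = ["spk_a", "spk_b", "spk_a", "spk_c"]
index = build_speaker_index(labels)
rows = min_embedding_rows(index.values())
```

Any configured embedding size at or above `rows` is large enough for lookups; a larger value only leaves unused rows.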
-
Hi, I have the same question as https://github.com/microsoft/SpeechT5/issues/16#issuecomment-1516257038. My training dataset is Chinese, so can I use speechbrain/spkrec-xvect-voxceleb to extract speak…
-
I've added Toucan to the TTS Arena fork by using the MassivelyMultilingualTTS space.
Arena: https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena
TTS Space: https://huggingface.co/spaces/Flux9665…
-
Most target speaker extraction work is for single-channel audio; multi-channel target speaker extraction is less researched. Also, many target speaker extraction networks operate in the time domain, and their performance is poor …
-
### Describe the bug
The example code gives an error when saving.
### To Reproduce
```
import os
import time
import torch
import torchaudio
from TTS.tts.configs.xtts_config import XttsConfig
…