-
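**If fine-tuning on 12 sentences gives poor results, the usual cause is that the dataset is too small. We recommend adding more data, typically 300 to 600 utterances; the larger and cleaner the dataset, the better the synthesis quality.**
For data quality, the recordings should have no reverberation and no background noise, and the speaker should be at a moderate distance from the microphone. The data quality of the Biaobei (标贝) dataset is a good reference.
The fine-tuned timbre depends on how similar the target speaker is to the original speaker: the more similar the two are, the closer the fine-tuned voice is to the target speaker.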
f…
-
Hello Mr. Kumar,
I noticed that you set
```
self.inputFeatDim = 429 ## IMPORTANT: HARDCODED. Change if necessary.
```
I am wondering how I can check the inputFeatDim of my dataset.
Thank…
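A minimal sketch of how such a value can be checked, assuming the network input is built by splicing each frame with its neighbouring frames: the input dimension is then the per-frame feature dimension times the window length. The 39-dimensional features and the 5-frame left/right context below are assumptions chosen for illustration (39 × 11 = 429); they are not confirmed by the code quoted above, and `splice_frames` is a hypothetical helper.

```python
def spliced_dim(base_dim, left, right):
    """Input dimension after splicing `left` past and `right` future frames."""
    return base_dim * (left + 1 + right)


def splice_frames(frames, left, right):
    """Concatenate each frame with its context, repeating edge frames as padding."""
    num_frames = len(frames)
    spliced = []
    for t in range(num_frames):
        row = []
        for offset in range(-left, right + 1):
            # Clamp the index so the first and last frames are repeated at the edges.
            idx = min(max(t + offset, 0), num_frames - 1)
            row.extend(frames[idx])
        spliced.append(row)
    return spliced


# Hypothetical utterance: 100 frames of 39-dimensional features.
frames = [[0.0] * 39 for _ in range(100)]
spliced = splice_frames(frames, left=5, right=5)

# The width of one spliced frame is the dimension the network would see.
input_feat_dim = len(spliced[0])
print(input_feat_dim)
```

For a real dataset, the same check amounts to loading one utterance's feature matrix and reading off its number of columns after any splicing has been applied.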
-
**Debugging checklist**
[ ] Have you updated to latest MFA version? Yes
[ ] Have you tried rerunning the command with the `--clean` flag? Yes
**Describe the issue**
A clear and concise descrip…
-
What would be the main steps for building a real-time decoder on top of EESEN?
I read in the EESEN paper that composing the tokens, lexicon and grammar speeds up decoding a great deal, and I'd li…
-
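When fine-tuning for the voice cloning task, the official test example runs end to end and produces the final result. When I upload my own recorded data, however, it fails with the error: This dataset has no examples.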
(paddlespeech-gpu) [root@int-gpu-001 tts3]$./run_mix.sh --stage 0 --stop-stage 3
check oov
get mfa result
align.…
-
Hi @radekosmulski, I figured I'd open a new issue to discuss the paper itself so we can keep using #6 for your updates only.
Forgive me in advance if some of my questions have already been discus…
-
Any idea?
[0409 18:16:28 @parallel.py:193] [MultiProcessPrefetchData] Will fork a dataflow more than one times. This assumes the datapoints are i.i.d.
[0409 18:16:28 @argtools.py:146] WRN "import …
-
For Reproducing your issue
Please fill out the following:
Corpus structure
What language is the corpus in? Mandarin
How many files/speakers? 4
Are you using lab files or TextGrid files for inpu…
-
Hi Cristinae,
(For speaker M01 from Torgo database)
I am actually running the DNN scripts, so I compute MFCCs -> LDA -> MLLT -> SAT, which gives me the tri3b model in the exp folder. Final three scripts befo…
-
**Describe the bug**
When using the Python bindings for Precise, I've noticed that the model predictions can vary substantially depending on where in the input audio the wake word is located. For exa…