-
Is it possible to add VALL-E X support? It allows you to clone English, Japanese, and Chinese voices (unlike Bark, it is supported by the developers) and doesn't add background music. The model is sma…
-
https://github.com/p0p4k/vits2_pytorch/blob/c4fb23c06fadf8a8fc49b57a0aa7ebdfe744bb0f/models.py#L847
Hi, I'm just wondering about this line. It seems that though the TextEncoder is conditioned on the speaker embedding on …
-
Hi.
First of all, thanks for your contributions on VITS2!
I was wondering if you'd have any tips on how to train a different (English) voice than the ljspeech one. I don't have access to very power…
-
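When I first used a large dataset, I hit the same division-by-zero error as the previous poster. After resolving it by switching to a smaller dataset, as suggested in that issue, a new error appeared:
**OSError: [WinError 1455] The paging file is too small for this operation to complete. Error loading "D:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\lib\cublas64_11.dll" or…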
-
First of all, thank you for sharing such wonderful code.
I trained using the KSS dataset for 490 epochs, but the quality is not as good as I expected.
It seems that the TTS speaks a bit fast.
[wa…
-
```
➜ Style-Bert-VITS2 on master python resample.py --in_dir Data\test\raw --out_dir Data\test\wavs --num_processes 12 --sr 44100
0%| …
-
INFO:OUTPUT_MODEL:====> Epoch: 1
G:\Bert-VITS2-Integration-Package\venv\lib\site-packages\torch\optim\lr_scheduler.py:138: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()…
-
Hello, I'm currently training a Korean multi-speaker model with 5 speakers.
The batch size is 50, with 8 hours of data that has a sampling rate of 22050.
However, I have no idea how to interpret the…
-
First, thanks for this great repo.
I have a question.
Are you using this [viet-tts-dataset](https://huggingface.co/datasets/ntt123/viet-tts-dataset)? If so, do you have the preprocessing code before…
-
Thank you for your hard work. I have a question while attempting to train with your code.
During the training of the duration predictor, I noticed that the "**loss_dur**" fluctuates significantly c…