Open csukuangfj opened 1 year ago
Hi @csukuangfj, you can download the pre-trained VITS model for Japanese from https://www.dropbox.com/scl/fi/l79c7eqb1mcz40dcv1p38/G_1128000.pth?rlkey=atb1aceydcp959z6ajrhcp8gt&dl=0
@QuyAnh2005
Thank you for your quick response.
Is the given pre-trained model compatible with the following config? https://github.com/QuyAnh2005/vits-japanese/blob/main/configs/jp_base.json
Yes, it is compatible @csukuangfj
Thanks a lot!
This repo is using https://pypi.org/project/unidic-lite/#files, which is 248 MB after installation. The dict size is too large.
Is there a plan to use https://github.com/espeak-ng/espeak-ng instead?
@csukuangfj Maybe, but I think we would need to train the model again to get new pre-trained weights.
That would be great!
espeak-ng is used in piper and we have converted all VITS models from piper to sherpa-onnx. The models are available at https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
If the dict size is 248 MB, I think it is too large for embedded devices and mobile devices.
If you can provide a Japanese VITS model using espeak-ng, I can provide a runtime for it that supports Android, iOS, Raspberry Pi, etc.
Hi @csukuangfj, the new requirements for the inference phase only include:
torch==2.0.0
scipy==1.10.1
mecab-python3
unidic-lite
pykakasi
librosa==0.8.0
monotonic-align==1.0.0
and unidic-lite takes about 48 MB. Can you convert this model to sherpa-onnx?
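Since the thread quotes two different sizes for the dictionary, one way to check the installed size on a given machine is a short script like the one below. This is only a minimal sketch: it assumes the package is importable as `unidic_lite`, and the helper names `dir_size_mb` and `installed_package_size_mb` are made up for illustration, not part of any library.

```python
import importlib.util
import os


def dir_size_mb(path):
    """Return the total size of all regular files under `path`, in MB (10**6 bytes)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so that linked files are not counted twice.
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total / 1e6


def installed_package_size_mb(module_name):
    """Return the on-disk size of an installed package, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return sum(dir_size_mb(p) for p in spec.submodule_search_locations)


if __name__ == "__main__":
    # Assumption: unidic-lite is imported under the name `unidic_lite`.
    size = installed_package_size_mb("unidic_lite")
    if size is None:
        print("unidic_lite is not installed")
    else:
        print(f"unidic_lite: {size:.1f} MB")
```

The result depends on the installed version and platform, so it may not match either number quoted in this thread exactly.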
Thank you for open-sourcing the VITS code for Japanese.
Could you also release the pre-trained models? I would like to provide a C++ runtime based on onnxruntime for it.
We already support all VITS models from piper (https://huggingface.co/spaces/k2-fsa/text-to-speech). However, there are no Japanese models from piper. It would be great if you could provide a pre-trained model for Japanese.