Open anirpipi opened 1 year ago
Hi @anirpipi, sorry for the late reply, and thank you for reporting the issue. It may be a bug, so I would like to check this problem. It seems you are using your own trained model, can you confirm that this issue still happens with the published models? If it's reproducible, I will download the model and investigate this.
Hi, thanks for the response. It's the same with the pre-trained models as well. VITS works fine, but the problem occurs with FastSpeech2 + PWG. Could you please look into it? Thanks in advance.
Hi, I am trying to convert a pretrained LJSpeech TTS model based on kan-bayashi/ljspeech_fastspeech2 and parallel_wavegan/ljspeech_parallel_wavegan.v1 using the code below:
########################### ONNX Conversion ############################
from espnet2.bin.tts_inference import Text2Speech
from espnet_onnx.export import TTSModelExport

m = TTSModelExport()
tag_exp = "exp/tts_train_fastspeech2_raw_phn_tacotron_g2p_en_no_space/train.loss.ave_5best.pth"
train_config = "exp/tts_train_fastspeech2_raw_phn_tacotron_g2p_en_no_space/config.yaml"

vocoder_tag = 'parallel_wavegan.v1/checkpoint-400000steps.pkl'
vocoder_config = 'parallel_wavegan.v1/config.yml'

text2speech = Text2Speech.from_pretrained(
    train_config=train_config,
    model_file=tag_exp,
    vocoder_file=vocoder_tag,
    vocoder_config=vocoder_config,
    speed_control_alpha=1.0,
    always_fix_seed=False,
)

tag_name = 'ljspeech_pretrained'
m.export(text2speech, tag_name, quantize=True)
########################### Inference ############################
from espnet_onnx import Text2Speech
import soundfile
import numpy as np
import time

text2speech = Text2Speech(tag_name)

text = 'hello world!'
wav = text2speech(text)
wav = wav['wav']
soundfile.write("ljspeech_pretrained_test.wav", wav, 22050, "PCM_16")
######################################################################
When synthesizing, the audio quality is very low. I noticed that the converted ONNX folder does not contain the stats.h5 file from the PWG vocoder folder. Contents of ~/.cache/espnet_onnx/ljspeech_pretrained/: config.yaml feats_stats.npz full quantize
Can anyone please help me include stats.h5 during inference with espnet_onnx?
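For context, here is a rough sketch of what stats.h5 is normally used for on the vocoder side. It assumes the file stores per-bin "mean" and "scale" datasets, and that the vocoder expects its mel input normalized as (mel - mean) / scale. The helper names load_pwg_stats and normalize_mel are hypothetical, not part of espnet_onnx or parallel_wavegan. Where such a step would hook into the espnet_onnx pipeline is not shown here, and this may not be the fix the maintainers choose.

```python
import numpy as np


def load_pwg_stats(stats_path):
    """Read the mean and scale vectors from a vocoder stats.h5 file.

    Assumes the file holds datasets named "mean" and "scale".
    """
    import h5py  # imported lazily; only needed when reading the file

    with h5py.File(stats_path, "r") as f:
        mean = f["mean"][()].reshape(-1)
        scale = f["scale"][()].reshape(-1)
    return mean, scale


def normalize_mel(mel, mean, scale):
    """Apply (mel - mean) / scale to a mel spectrogram of shape (frames, n_mels)."""
    mel = np.asarray(mel, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32).reshape(-1)
    scale = np.asarray(scale, dtype=np.float32).reshape(-1)
    if mel.ndim != 2 or mean.shape != scale.shape or mel.shape[1] != mean.shape[0]:
        raise ValueError(
            f"shape mismatch: mel {mel.shape}, mean {mean.shape}, scale {scale.shape}"
        )
    # Broadcasting applies the per-bin statistics to every frame.
    return (mel - mean) / scale
```

If the exported vocoder skips this step, it would receive features on a different scale from the one it was trained on, which would be consistent with the degraded audio described above.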