espnet / espnet_onnx

ONNX wrapper for ESPnet inference model
MIT License

wav quality drop #68

Open 1nlplearner opened 1 year ago

1nlplearner commented 1 year ago

Hi, I initialized Text2Speech with my own acoustic model (am_model) and vocoder and exported them to ONNX, but the sound quality drops significantly. The only change I made was to the HiFi-GAN inference code in https://github.com/Masao-Someki/espnet_onnx/blob/feature/add_PWGVocoder/espnet_onnx/export/tts/models/vocoders/parallel_wavegan.py, because the HiFi-GAN code in the ParallelWaveGAN repo does not support the parameter x. I compared the ESPnet AM and vocoder with the ONNX AM and vocoder, and they look the same. Could you please offer some advice?

1nlplearner commented 1 year ago

When I delete the postprocess code in https://github.com/Masao-Someki/espnet_onnx/blob/master/espnet_onnx/tts/tts_model.py, the model synthesizes speech that matches PyTorch inference.

Masao-Someki commented 1 year ago

@1nlplearner Thank you for reporting this issue.

> When I delete the postprocess code in https://github.com/Masao-Someki/espnet_onnx/blob/master/espnet_onnx/tts/tts_model.py, the model synthesizes speech that matches PyTorch inference.

It seems that the normalization step causes this issue. Could you open your config file at ~/.cache/espnet_onnx/<tag_name>/config.yml and check whether use_normalize is set to False? I think setting use_normalize: false will fix this problem.
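
If it helps, here is a minimal sketch for checking that setting without opening the file by hand. The thread does not show the layout of config.yml, so this does not assume where use_normalize sits: it is a plain line scan, not a YAML parser, and it reports every line that sets the key. The helper names find_use_normalize and disable_use_normalize are hypothetical and are not part of espnet_onnx.

```python
import re

# Matches a line such as "  use_normalize: true" and captures:
#   1) the indentation and key, 2) the current value, 3) any trailing text.
# Assumption: the key appears on its own line in block-style YAML.
_USE_NORMALIZE = re.compile(r"^(\s*use_normalize\s*:\s*)(\S+)(.*)$")


def find_use_normalize(text):
    """Return (line_number, value) for every use_normalize entry in the text."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = _USE_NORMALIZE.match(line)
        if match:
            hits.append((number, match.group(2)))
    return hits


def disable_use_normalize(text):
    """Return a copy of the text with every use_normalize value set to false."""
    lines = [
        _USE_NORMALIZE.sub(r"\g<1>false\g<3>", line)
        for line in text.splitlines()
    ]
    return "\n".join(lines) + "\n"
```

To use it, read ~/.cache/espnet_onnx/<tag_name>/config.yml into a string, pass it to find_use_normalize to see the current values, and write the result of disable_use_normalize back only after keeping a copy of the original file.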