myshell-ai / MeloTTS

High-quality multi-lingual text-to-speech library. Supports English, Spanish, French, Chinese, Japanese and Korean.
MIT License
4.37k stars 549 forks

ONNX infer #164

Open pengpengtao opened 1 month ago

pengpengtao commented 1 month ago

Does this model support ONNX inference?

csukuangfj commented 1 month ago


I suggest that you have a look at

We support exporting MeloTTS models to onnx.

You can use to test your exported onnx model.

Furthermore, we also provide a C++ runtime for it and support 10 programming languages.

You can try the exported MeloTTS Chinese+English ONNX model on your Android phone by downloading the APK from


eehoeskrap commented 1 month ago

@csukuangfj Good news! Are there any plans for a Korean ONNX version?

csukuangfj commented 1 month ago

You can have a look at how we convert the Chinese+English model. The process for converting the Korean model should be similar.

csukuangfj commented 1 month ago

We already have a Korean tts model at So we don't plan to convert the Korean model from MeloTTS soon.

eehoeskrap commented 1 month ago

@csukuangfj I will analyze the Chinese + English version and attempt to perform inference on the Korean ONNX version. Thank you!

m-bain commented 1 month ago

@eehoeskrap any success?

khacpv commented 2 weeks ago

@csukuangfj I tried to convert the Japanese (JP) model by changing the language from ZH to JP, but it was not successful.

Could you help me check? Thank you.

The output file, test.wav, does not seem to work.

The build logs are as follows:

~/user/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/nn/utils/ UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  t_s == t_t
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  pad_length = max(length - (self.window_size + 1), 0)
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  slice_start_position = max((self.window_size + 1) - length, 0)
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if pad_length > 0:
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if torch.min(inputs) < left or torch.max(inputs) > right:
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if min_bin_width * num_bins > 1.0:
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if min_bin_height * num_bins > 1.0:
/tmp/MeloTTS/melo/ TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert (discriminant >= 0).all()
~/user/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/onnx/_internal/ UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/jit/passes/onnx/constant_fold.cpp:181.)
  _C._jit_pass_onnx_node_shape_type_inference(node, params_dict, opset_version)
~/user/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/onnx/ UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  return g.op("Constant", value_t=torch.tensor(list_or_value))
~/user/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/onnx/ UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/jit/passes/onnx/constant_fold.cpp:181.)
~/user/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/onnx/ UserWarning: Constant folding - Only steps=1 can be constant folded for opset >= 10 onnx::Slice op. Constant folding not applied. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/torch/csrc/jit/passes/onnx/constant_fold.cpp:181.)

csukuangfj commented 2 weeks ago

Sorry, the information you gave is too limited.
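
A report such as "test.wav does not seem to work" is easier to act on when it includes basic facts about the file. The following is a minimal sketch, assuming the output is a 16-bit PCM WAV file that Python's standard `wave` module can read; the helper names `describe_wav` and `write_test_tone` are hypothetical and are not part of MeloTTS or any tool mentioned in this thread.

```python
import math
import struct
import wave


def describe_wav(path):
    """Return basic header facts and the peak amplitude of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        sample_width = wav.getsampwidth()  # bytes per sample
        sample_rate = wav.getframerate()
        num_frames = wav.getnframes()
        raw = wav.readframes(num_frames)

    peak = None
    if sample_width == 2:  # this sketch only decodes 16-bit PCM samples
        samples = struct.unpack(f"<{len(raw) // 2}h", raw)
        peak = max((abs(s) for s in samples), default=0)

    return {
        "channels": channels,
        "sample_width_bytes": sample_width,
        "sample_rate": sample_rate,
        "num_frames": num_frames,
        "duration_seconds": num_frames / sample_rate if sample_rate else 0.0,
        "peak_amplitude": peak,  # 0 would mean the audio is pure silence
    }


def write_test_tone(path, sample_rate=16000, seconds=0.5, freq=440.0):
    """Write a short 16-bit mono sine tone so the checker can be tried out."""
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * freq * i / sample_rate)))
        for i in range(n)
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
```

Calling `describe_wav("test.wav")` and posting the result (duration, sample rate, and whether the peak amplitude is zero) helps distinguish a silent or empty file from one that merely sounds wrong. If `wave.open` raises an error, that is also worth reporting, since it suggests the file is not plain PCM.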

csukuangfj commented 2 weeks ago

Please see

-rw-r--r-- 1 runner docker 163M Aug 24 09:13 model.onnx

You can see that after changing the language to JP, the model converts to ONNX successfully.

(You need to handle tokens.txt and lexicon.txt for Japanese).

khacpv commented 2 weeks ago

@csukuangfj I can run the export and get model.onnx, as follows:

➜  melo-tts git:(main) ✗ ls -la 
total 347920
drwxr-xr-x  12 phamkhac  staff        384 Aug 24 14:59 .
drwxr-xr-x  27 phamkhac  staff        864 Aug 22 09:02 ..
-rw-r--r--@  1 phamkhac  staff       6148 Aug 24 14:59 .DS_Store
-rw-r--r--   1 phamkhac  staff        156 Aug 21 09:34
-rwxr-xr-x   1 phamkhac  staff       8731 Aug 24 15:13
-rw-r--r--@  1 phamkhac  staff    6837671 Aug 24 15:20 lexicon.txt
-rw-r--r--   1 phamkhac  staff  170604200 Aug 24 15:21 model.onnx
-rwxr-xr-x   1 phamkhac  staff        614 Aug 24 14:50
-rwxr-xr-x   1 phamkhac  staff       1637 Aug 21 09:34
-rwxr-xr-x   1 phamkhac  staff       5196 Aug 24 15:18
-rw-r--r--   1 phamkhac  staff      70700 Aug 24 15:28 test.wav
-rw-r--r--@  1 phamkhac  staff       1440 Aug 24 15:20 tokens.txt

You need to handle tokens.txt and lexicon.txt for Japanese

Do you have any instructions for handling these files?

csukuangfj commented 2 weeks ago

You have to figure them out by yourself. We have already provided you with an example for Chinese+English.
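
For readers attempting this, here is a rough sketch of what generating the two files might look like. It assumes a layout of one `symbol id` pair per line in tokens.txt and one `word phone ... tone ...` line per entry in lexicon.txt; that layout is an assumption and should be checked against the Chinese+English example before relying on it. The function names, phonemes, and tone values below are made up for illustration and do not come from MeloTTS.

```python
def write_tokens(symbols, path):
    """Write one 'symbol id' pair per line; ids follow the list order."""
    with open(path, "w", encoding="utf-8") as f:
        for idx, sym in enumerate(symbols):
            f.write(f"{sym} {idx}\n")


def write_lexicon(entries, path):
    """Write 'word phone... tone...' lines from {word: (phones, tones)}."""
    with open(path, "w", encoding="utf-8") as f:
        for word, (phones, tones) in entries.items():
            if len(phones) != len(tones):
                raise ValueError(f"phones/tones length mismatch for {word!r}")
            tone_strs = [str(t) for t in tones]
            f.write(" ".join([word, *phones, *tone_strs]) + "\n")


def check_lexicon(lexicon_path, tokens_path):
    """Return the lexicon phones that do not appear in tokens.txt."""
    with open(tokens_path, encoding="utf-8") as f:
        known = {line.rsplit(" ", 1)[0] for line in f if line.strip()}

    missing = set()
    with open(lexicon_path, encoding="utf-8") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            body = fields[1:]                # drop the word itself
            phones = body[: len(body) // 2]  # first half phones, second half tones
            missing.update(p for p in phones if p not in known)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical symbol table and entries, for illustration only.
    write_tokens(["_", "n", "e", "k", "o"], "tokens.txt")
    write_lexicon({"猫": (["n", "e", "k", "o"], [0, 0, 0, 0])}, "lexicon.txt")
    print(check_lexicon("lexicon.txt", "tokens.txt"))
```

The consistency check is the useful part: every phone used in the lexicon needs an id in the token table, so a non-empty result from `check_lexicon` points at entries that the runtime could not map.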