xiaomingnio / kantts

TTS application based on modelscope KAN-TTS

I trained my own speaker locally with the official code. Is acceleration supported for this? #5

Closed bobo-paopao closed 10 months ago

bobo-paopao commented 10 months ago

I trained my own speaker locally with the official code. Is acceleration supported for a model like this?

xiaomingnio commented 10 months ago

The model structure is the same, so it should be supported. Just reorganize the fine-tuned model's paths to follow the damo layout.

bobo-paopao commented 10 months ago

Yes, I have tested it on my side and it works. Thanks for sharing!

xinzhuang commented 10 months ago

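Hello, I am also trying to use the author's code to convert and run inference with the personalized fine-tuned model speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k, but I hit these errors:
[TRT] [E] 3: (Unnamed Layer* 10) [Fully Connected]:kernel weights has count 57344 but 36864 was expected
[TRT] [E] 3: (Unnamed Layer* 10) [Fully Connected]:kernel weights has count 57344 but 36864 was expected
[TRT] [E] 3: (Unnamed Layer* 83) [Constant]:constant weights has count 81920 but 40960 was expected
This looks like a size mismatch. For personalized fine-tuning, the MelPNCADecoder input size should be 82 rather than the default 80, and the NSF parameters also need handling. Have you run into these problems, and do you have any suggestions on what to adjust? Thanks.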

xinzhuang commented 10 months ago

Solved. Following the config file, I changed parameters such as d_mem=320 and in_units=82, and it works now. Thanks!
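
The fix above amounts to letting the values in the fine-tuned model's config file take precedence over hard-coded defaults. A minimal sketch of that idea follows, assuming the parameters are available as plain dictionaries; `resolve_params` and `changed_params` are hypothetical helper names, not functions from the kantts repository. The only default known from this thread is in_units=80; the default for d_mem is not stated, so it is left out.

```python
# Default named in the thread; other defaults are not stated there.
DEFAULT_PARAMS = {"in_units": 80}


def resolve_params(defaults, from_config):
    """Merge two dicts so that values from the config win over defaults."""
    merged = dict(defaults)
    merged.update(from_config)
    return merged


def changed_params(defaults, from_config):
    """Return {name: (default, configured)} for defaults the config overrides."""
    return {
        name: (defaults[name], value)
        for name, value in from_config.items()
        if name in defaults and defaults[name] != value
    }


# Values reported in the thread for the personalized model.
config_params = {"d_mem": 320, "in_units": 82}

params = resolve_params(DEFAULT_PARAMS, config_params)
overrides = changed_params(DEFAULT_PARAMS, config_params)
print(params)
print(overrides)
```

Listing the overridden values before building the engine makes it easier to spot a stale default that would otherwise surface only as a weight-count error.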

xinzhuang commented 10 months ago

@bobo-paopao Hello. For the personalized speech synthesis model, converting hifigan to ONNX works, but building the TensorRT engine fails while parsing the ONNX file:

```
[E] ModelImporter.cpp:727: --- Begin node ---
[E] ModelImporter.cpp:728: input: "/source_module/ConstantOfShape_25_output_0"
    output: "/source_module/RandomUniformLike_output_0"
    name: "/source_module/RandomUniformLike"
    op_type: "RandomUniformLike"
    attribute {
      name: "dtype"
      i: 1
      type: INT
    }
[E] ModelImporter.cpp:729: --- End node ---
[E] ModelImporter.cpp:732: ERROR: builtin_op_importers.cpp:3462 In function importRandomUniformLike:
    [8] Assertion failed: (inputs.at(0).is_tensor()) && "The input tensor cannot be an initializer."
[E] In node 246 (importRandomUniformLike): UNSUPPORTED_NODE: Assertion failed: (inputs.at(0).is_tensor()) && "The input tensor cannot be an initializer."
[!] Could not parse ONNX correctly
```
After tracing it, the cause is that the personalized model has NSF enabled and therefore loads source_module. In the author's code this part is unused and commented out, so the problem does not arise there. [[code](https://github.com/xiaomingnio/kantts/blob/744a783036912132c9624fd4e123cd97d1fccd10/mykantts/models/hifigan/hifigan.py#L119C24-L119C24)]
Have you run into this? Thanks.

Environment: TensorRT=8.5.3.1, onnx=1.15.0

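
To see which nodes the TensorRT parser will reject before attempting a build, one option is to scan the exported graph for the op type named in the error. The sketch below is only an illustration: `find_nodes_by_op` and `scan_onnx_file` are hypothetical helper names, not part of the kantts repository, and the sketch relies only on `onnx.load` and the `name` and `op_type` fields of the entries in `graph.node`. It looks at top-level nodes only, not nested subgraphs.

```python
from types import SimpleNamespace


def find_nodes_by_op(nodes, op_type):
    """Return the names of graph nodes whose op_type matches."""
    return [node.name for node in nodes if node.op_type == op_type]


def scan_onnx_file(path, op_type="RandomUniformLike"):
    """Load an ONNX file and list its top-level nodes of the given op type."""
    import onnx  # imported lazily so find_nodes_by_op works without onnx installed

    model = onnx.load(path)
    return find_nodes_by_op(model.graph.node, op_type)


# Stand-in nodes for demonstration; a real run would call scan_onnx_file(path).
demo_nodes = [
    SimpleNamespace(name="/source_module/RandomUniformLike", op_type="RandomUniformLike"),
    SimpleNamespace(name="/conv_pre/Conv", op_type="Conv"),
]
print(find_nodes_by_op(demo_nodes, "RandomUniformLike"))
```

If the scan returns only nodes under `/source_module/`, that is consistent with the diagnosis above that the failure comes from the NSF source module.
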
bobo-paopao commented 10 months ago

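On my side I fine-tuned from the official model speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k and then converted it with the author's code. I have not tried the personalized model, but I believe the model structure is the same, so the conversion should carry over. PS: my local environment is tensorrt 8.5.1.7, onnx 1.14.1.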

xinzhuang commented 10 months ago

Right. I am using speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k. The overall structure is shared; the only difference is whether NSF is used. The config file for speech_personal_sambert-hifigan_nsf_tts_zh-cn_pretrain_16k includes:

nsf_params: {nb_harmonics: 7, nsf_f0_global_maximum: 730.0, nsf_f0_global_minimum: 30.0,
        nsf_norm_type: none, sampling_rate: 16000}

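The config file for speech_sambert-hifigan_tts_zh-cn_multisp_pretrain_16k does not include NSF parameters.

The TensorRT engine conversion fails precisely in the code path that handles this. I will look into it further. Thank you.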
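
The config difference described above can be checked programmatically before conversion. The following is a minimal sketch, assuming the config has already been parsed into nested Python dictionaries; `find_key` and `uses_nsf` are hypothetical helper names, and the nesting shown (`Model` / `hifigan` / `params`) is an assumption for illustration. Only the `nsf_params` values come from this thread.

```python
def find_key(config, key):
    """Depth-first search for `key` in nested dicts/lists; return its value or None."""
    if isinstance(config, dict):
        if key in config:
            return config[key]
        children = config.values()
    elif isinstance(config, list):
        children = config
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def uses_nsf(config):
    """True when the config carries a non-null nsf_params entry at any depth."""
    return find_key(config, "nsf_params") is not None


# Hypothetical nesting; the nsf_params values are the ones quoted in the thread.
personal_cfg = {"Model": {"hifigan": {"params": {"nsf_params": {
    "nb_harmonics": 7,
    "nsf_f0_global_maximum": 730.0,
    "nsf_f0_global_minimum": 30.0,
    "nsf_norm_type": "none",
    "sampling_rate": 16000,
}}}}}
multisp_cfg = {"Model": {"hifigan": {"params": {}}}}

print(uses_nsf(personal_cfg), uses_nsf(multisp_cfg))
```

A check like this lets a conversion script decide up front whether the source_module path will be exercised, instead of discovering it at engine build time.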

dengyao1 commented 1 month ago

Has anyone managed to convert a model that has NSF parameters?

Chengyang852 commented 2 weeks ago

Has this been solved?