How do I export gpt.pt to ONNX?

2noise / ChatTTS

A generative speech model for daily dialogue.

https://2noise.com

GNU Affero General Public License v3.0


How do I export gpt.pt to ONNX? #618

Closed Baiyuetribe closed 11 hours ago

Baiyuetribe commented 1 month ago

The other models are easy to export, but I can't figure out this one. It would be great if ONNX export could be added.

ZaymeShaw commented 1 month ago

Also looking for a solution.

ZillaRU commented 1 month ago

You can export it in pieces: each decoder layer, the LM head, the embedding, and the sample head are exported separately.
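A minimal sketch of the piecewise idea, assuming a toy model rather than the real ChatTTS GPT. `ToyGPT`, `ToyDecoderLayer`, `split_parts`, and `export_parts` are hypothetical names for illustration. The real decoder layers also take an attention mask, position ids, and a KV cache, and the sample head is left out here. The script splits the model into parts and checks that chaining the parts reproduces the full model's output; `export_parts` shows how each part could then be passed to `torch.onnx.export`.

```python
import torch
import torch.nn as nn


class ToyDecoderLayer(nn.Module):
    """Stand-in for one decoder layer (hypothetical, not the ChatTTS class)."""

    def __init__(self, hidden: int):
        super().__init__()
        self.norm = nn.LayerNorm(hidden)
        self.mlp = nn.Sequential(
            nn.Linear(hidden, 4 * hidden), nn.SiLU(), nn.Linear(4 * hidden, hidden)
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.mlp(self.norm(hidden_states))


class ToyGPT(nn.Module):
    """Embedding -> identical decoder layers -> LM head."""

    def __init__(self, vocab: int = 32, hidden: int = 16, num_layers: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.layers = nn.ModuleList([ToyDecoderLayer(hidden) for _ in range(num_layers)])
        self.lm_head = nn.Linear(hidden, vocab, bias=False)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        x = self.embed(input_ids)
        for layer in self.layers:
            x = layer(x)
        return self.lm_head(x)


def split_parts(model: ToyGPT) -> dict:
    """Return the sub-modules in execution order, one entry per export unit."""
    parts = {"embedding": model.embed}
    for i, layer in enumerate(model.layers):
        parts[f"block_{i}"] = layer
    parts["lm_head"] = model.lm_head
    return parts


def export_parts(parts: dict, out_dir: str, hidden: int = 16) -> None:
    """Export each part to its own ONNX file with a dynamic sequence axis."""
    ids = torch.zeros(1, 4, dtype=torch.long)
    states = torch.zeros(1, 4, hidden)
    for name, module in parts.items():
        example = ids if name == "embedding" else states
        torch.onnx.export(
            module, (example,), f"{out_dir}/{name}.onnx",
            input_names=["x"], output_names=["y"],
            dynamic_axes={"x": {1: "seq"}, "y": {1: "seq"}},
        )


torch.manual_seed(0)
model = ToyGPT().eval()
parts = split_parts(model)

# Chaining the separate parts must give the same result as the full model.
input_ids = torch.randint(0, 32, (1, 5))
with torch.no_grad():
    full = model(input_ids)
    x = input_ids
    for module in parts.values():
        x = module(x)
chained_matches = torch.allclose(full, x)
print(list(parts), chained_matches)
```

Calling `export_parts(parts, "some_dir")` on an existing directory would write one `.onnx` file per part. Depending on the installed PyTorch version, the exporter may need extra packages such as `onnx` or `onnxscript`.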

ZillaRU commented 1 month ago

An export script I wrote earlier: https://github.com/2noise/ChatTTS/pull/622. See also https://zhuanlan.zhihu.com/p/703240560

Baiyuetribe commented 1 month ago

@ZillaRU Wow, so the GPT ends up split into more than 10 ONNX files. I'm a little confused. Is there really no way to merge them a bit more?

ZaymeShaw commented 1 month ago

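@ZillaRU Thanks for sharing the approach. Is the split this fine-grained because each piece has to match an operator implementation on the cpp side? If the goal is only TensorRT acceleration, could some of the pieces be fused?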

ZillaRU commented 1 month ago

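You can follow https://github.com/tpoisonooo/llama.onnx for the export; ChatTTS's GPT is essentially a small LLaMA. The split is fine-grained because every decoder layer has the same structure, so exporting them separately makes it easy to optimize, test, and verify a single block. Optimizing one block is effectively optimizing all of them. It also makes it easier to see where the error comes from when quantizing.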
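The point about tracing quantization error per block can be illustrated with a small NumPy sketch. It assumes a toy residual block and a simple symmetric 8-bit fake quantization, neither of which is taken from ChatTTS; `fake_quantize` and `block` are hypothetical helpers. Each block is measured in isolation by feeding it the float reference input, so a large error points at that block rather than at accumulated drift.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 16
num_blocks = 4


def fake_quantize(w: np.ndarray, bits: int = 8) -> np.ndarray:
    """Symmetric per-tensor quantization: round to integers, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale


def block(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Toy residual block standing in for one decoder layer."""
    return x + np.tanh(x @ w)


weights = [rng.normal(0.0, 0.1, (hidden, hidden)) for _ in range(num_blocks)]
x = rng.normal(0.0, 1.0, (5, hidden))

errors = []
for w in weights:
    ref = block(x, w)                   # float reference output
    quant = block(x, fake_quantize(w))  # same input, quantized weights
    errors.append(float(np.abs(ref - quant).max()))
    x = ref  # pass the float output on so every block sees a clean input

for i, err in enumerate(errors):
    print(f"block_{i}: max abs error = {err:.6f}")
```

With separately exported blocks, the same comparison could be run on the real float and quantized ONNX files, one block at a time.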

github-actions[bot] commented 11 hours ago

This issue was closed because it has been inactive for 15 days since being marked as stale.