DakeQQ / F5-TTS-ONNX

Running F5-TTS with ONNX Runtime

TRT-LLM #2

Closed: Bigfishering closed this issue 1 week ago

Bigfishering commented 3 weeks ago

Have you tried to accelerate inference with TensorRT-LLM, or with another LLM inference engine?

DakeQQ commented 3 weeks ago

We haven't conducted acceleration tests on desktop computers, as our focus is on acceleration for Android mobile devices. However, some users have reported positive acceleration results using an AMD GPU with ONNX Runtime and DmlExecutionProvider. I believe similarly good results can be achieved with ONNX Runtime using CUDAExecutionProvider, TensorRTExecutionProvider, or other inference engines, since we favored GPU-friendly operations when optimizing the F5-TTS source code.
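For reference, switching execution providers in ONNX Runtime is just a session-creation option, so trying CUDA, TensorRT, or DirectML against the exported model is cheap to test. Below is a minimal sketch; the model filename is a placeholder, and the actual exported F5-TTS ONNX files may be named differently:

```python
import onnxruntime as ort

# Placeholder path; substitute the actual exported F5-TTS ONNX model file.
model_path = "f5_tts.onnx"

# Providers are tried in order; ONNX Runtime falls back to the next entry
# if a provider is not available in the installed build.
providers = [
    "TensorrtExecutionProvider",  # needs onnxruntime-gpu built with TensorRT
    "CUDAExecutionProvider",      # needs onnxruntime-gpu
    "DmlExecutionProvider",       # needs onnxruntime-directml (e.g. AMD GPUs on Windows)
    "CPUExecutionProvider",       # always-available fallback
]

session = ort.InferenceSession(model_path, providers=providers)

# Shows which providers were actually selected for this session.
print("Active providers:", session.get_providers())
```

Checking `session.get_providers()` confirms whether the GPU provider was actually picked up; a silent fallback to CPU is a common reason for seeing no speedup.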