DakeQQ / F5-TTS-ONNX

Running F5-TTS with ONNX Runtime

TRT-LLM #2

Open Bigfishering opened 1 day ago

Bigfishering commented 1 day ago

Have you tried accelerating inference with TensorRT-LLM, or with another LLM inference engine?

DakeQQ commented 12 hours ago

We haven't run acceleration tests on desktop computers, since our focus is on inference acceleration for Android mobile devices. However, some users have reported good speedups on AMD GPUs using ONNX Runtime with the DmlExecutionProvider. I expect similarly good results with ORT's CUDAExecutionProvider or TensorRTExecutionProvider, or with other inference engines, because when we optimized the F5-TTS source code we favored GPU-friendly operations.
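
For anyone who wants to try this, here is a minimal sketch of selecting GPU execution providers through ONNX Runtime's Python API. The model filename is hypothetical; substitute the actual exported F5-TTS ONNX file. Each provider requires the matching ORT package (onnxruntime-gpu for CUDA/TensorRT, onnxruntime-directml for DML):

```python
import onnxruntime as ort

# Hypothetical path; use the actual exported F5-TTS ONNX model file.
model_path = "f5_tts_transformer.onnx"

# Providers are tried in order; ORT falls back to the next one
# if a provider is not available in the installed build.
providers = [
    "TensorrtExecutionProvider",  # needs onnxruntime-gpu built with TensorRT
    "CUDAExecutionProvider",      # needs onnxruntime-gpu
    "DmlExecutionProvider",       # needs onnxruntime-directml (Windows, incl. AMD GPUs)
    "CPUExecutionProvider",       # always-available fallback
]

session = ort.InferenceSession(model_path, providers=providers)
print("Active providers:", session.get_providers())
```

Checking `session.get_providers()` confirms which provider was actually selected, since ORT silently falls back to CPU when a GPU provider fails to load.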