Open Bigfishering opened 1 day ago

Have you tried accelerating inference with TensorRT-LLM, or with another LLM inference engine?

We haven't conducted acceleration tests on desktop computers, since our focus is on inference acceleration for Android mobile devices. However, some users have reported good acceleration results on AMD GPUs using ONNX Runtime with the DmlExecutionProvider. I believe similarly good results can be achieved with ORT using the CUDAExecutionProvider, the TensorRTExecutionProvider, or other inference engines, because when we optimized the F5-TTS source code we favored GPU-friendly operations.
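As a minimal sketch of what was described above, one could try the exported ONNX model with ONNX Runtime and pass a prioritized list of execution providers; ORT will use the first one available in the installed build and fall back to the next otherwise. The model filename below is a placeholder, not the repo's actual export name.

```python
# Hypothetical example: load an exported F5-TTS ONNX model with ONNX Runtime
# and let ORT choose the first available execution provider from this list.
import onnxruntime as ort

providers = [
    "TensorrtExecutionProvider",  # NVIDIA TensorRT (requires an ORT build with TensorRT support)
    "CUDAExecutionProvider",      # NVIDIA CUDA (onnxruntime-gpu)
    "DmlExecutionProvider",       # DirectML, e.g. AMD GPUs on Windows (onnxruntime-directml)
    "CPUExecutionProvider",       # fallback
]

# "f5_tts_model.onnx" is a placeholder path for one of the exported models.
session = ort.InferenceSession("f5_tts_model.onnx", providers=providers)

# Shows which providers were actually registered for this session.
print("Active providers:", session.get_providers())
```

If a requested provider is not available in the installed ONNX Runtime package, ORT logs a warning and continues with the next provider in the list, so the same script can be used to compare CUDA, TensorRT, and DirectML results on different machines.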