Hello! On NVIDIA-based workstations, is it possible to use the Optimum-NVIDIA pipelines instead of the default Hugging Face ones to speed up Whisper token generation? I have not tested this myself. For reference, here are the blog post describing the gains on LLaMA-based models and the project repository: https://huggingface.co/blog/optimum-nvidia and https://github.com/huggingface/optimum-nvidia