This allows to use Transformers Whisper checkpoints with TRT-LLM inference.
Left to do:
[x] Punctuation sometimes does not match, see https://huggingface.slack.com/archives/C0435BHN1L6/p1712160819124509, have to see if this is a bug or just a numerical artifact (does not impact WER calculated without punctuation), e.g. test_generation[None-openai/whisper-large-v3]
[ ] quality workflow somehow only fails on GA but not locally
As per title - tested in fp16 for now.
This allows to use Transformers Whisper checkpoints with TRT-LLM inference.
Left to do:
test_generation[None-openai/whisper-large-v3]