Open wincing2 opened 2 months ago
The Python inference code provided looks the same as for "normal" Whisper. So where is the speedup coming from? Flash attention?