Open GalenReich opened 1 month ago
Maybe also check https://github.com/Vaibhavs10/insanely-fast-whisper and its --timestamp {chunk,word} parameter.
Here is a notebook using it: https://github.com/Vaibhavs10/insanely-fast-whisper/blob/main/notebooks/infer_faster_whisper_large_v2.ipynb
That's very cool @belisards - thank you! Word-level timestamps probably aren't important for most research, so those sentence-level ones look really good!
Yep, I agree but look: even in its vanilla version, Whisper generates timestamps. Here it is in a notebook I created for a workshop last year: https://github.com/belisards/nlp_intro/blob/main/whisper.ipynb
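For anyone skimming the thread, here is a minimal sketch of what that looks like with the openai-whisper package. It assumes `model.transcribe()` returns a dict whose "segments" entries carry "start", "end" and "text" keys; the helper names (format_timestamp, format_segments, transcribe_with_timestamps) and the "base" model choice are my own, not taken from the linked notebook.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(segments):
    """Turn Whisper-style segment dicts into one timestamped line per segment."""
    return [
        f"[{format_timestamp(s['start'])} --> {format_timestamp(s['end'])}] {s['text'].strip()}"
        for s in segments
    ]


def transcribe_with_timestamps(audio_path: str, model_name: str = "base"):
    """Hypothetical wrapper: transcribe a file and return timestamped lines."""
    # Imported lazily so the formatting helpers work without whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return format_segments(result["segments"])
```

Calling transcribe_with_timestamps("interview.mp3") (the filename is a placeholder) should give one line per segment, which is roughly the sentence-level granularity discussed above.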
Another feature that might be useful for open-source research is speaker diarization. There is a great video covering many Whisper variants and features like this: https://www.youtube.com/watch?v=Thc0vtnWYOo
A useful notebook (or addition to the existing whisper transcription notebook) would be one that enabled users to run whisper over audio for transcription, and then access or produce word- or sentence-level timestamps (not part of whisper's functionality).
A package like https://github.com/linto-ai/whisper-timestamped might make this easy.
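As a rough sketch of the word-level case: as far as I know, both whisper-timestamped and openai-whisper (when called with word_timestamps=True) attach a "words" list to each segment, with the token stored under "text" in the former and "word" in the latter. That key layout is an assumption worth checking against whichever version gets installed, and flatten_words is a hypothetical helper name.

```python
def flatten_words(segments):
    """Collect (start, end, word) tuples from segments that carry a "words" list.

    Assumes each word dict has "start" and "end", plus the token under
    either "word" (openai-whisper) or "text" (whisper-timestamped).
    Segments without word-level data are skipped.
    """
    words = []
    for segment in segments:
        for w in segment.get("words", []):
            token = w.get("word", w.get("text", "")).strip()
            words.append((w["start"], w["end"], token))
    return words
```

The notebook could then pass result["segments"] from either package into flatten_words and write the tuples out as CSV or subtitles.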