bellingcat / open-source-research-notebooks

Jupyter notebooks helping open source researchers, journalists, and fact-checkers use command line tools and code projects for digital investigations.
MIT License

Add notebook for audio/video transcription with timestamps #19

Open GalenReich opened 1 month ago

GalenReich commented 1 month ago

A useful notebook (or an addition to the existing Whisper transcription notebook) would be one that lets users run Whisper over audio for transcription and then produce word- or sentence-level timestamps (not part of Whisper's functionality).

A package like https://github.com/linto-ai/whisper-timestamped might make this easy.

belisards commented 1 month ago

Maybe also check https://github.com/Vaibhavs10/insanely-fast-whisper and its --timestamp {chunk,word} parameter.

Here is a notebook using it: https://github.com/Vaibhavs10/insanely-fast-whisper/blob/main/notebooks/infer_faster_whisper_large_v2.ipynb

GalenReich commented 1 month ago

That's very cool @belisards - thank you! Word-level timestamps probably aren't important for most research, so those sentence-level ones look really good!

belisards commented 1 month ago

Yep, I agree but look: even in its vanilla version, Whisper generates timestamps. Here it is in a notebook I created for a workshop last year: https://github.com/belisards/nlp_intro/blob/main/whisper.ipynb
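
For anyone skimming the thread, here is a minimal sketch of what that can look like. It assumes the `openai-whisper` package and ffmpeg are installed, and it relies on `transcribe()` returning a dict whose `"segments"` entries carry `start`, `end`, and `text`. The helper names (`format_timestamp`, `segments_to_lines`, `transcribe_with_timestamps`) and the `"base"` model choice are only illustrative and are not taken from the linked notebook.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_lines(segments):
    """Turn Whisper-style segments into '[start --> end] text' lines."""
    return [
        f"[{format_timestamp(s['start'])} --> {format_timestamp(s['end'])}] "
        f"{s['text'].strip()}"
        for s in segments
    ]


def transcribe_with_timestamps(audio_path, model_name="base"):
    """Hypothetical wrapper: run vanilla Whisper and return timestamped lines."""
    # Imported lazily so the helpers above work without Whisper installed.
    import whisper  # assumption: `pip install openai-whisper`

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    # Each segment is roughly a phrase or sentence with start/end in seconds.
    return segments_to_lines(result["segments"])
```

Calling `transcribe_with_timestamps("interview.mp3")` (the file name is a placeholder) should return one line per segment, which is roughly the sentence-level granularity discussed above.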

Another feature that might be useful for open-source research is speech diarization. There is a great video covering many Whisper variants and features like this: https://www.youtube.com/watch?v=Thc0vtnWYOo