m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License
11.36k stars 1.19k forks

Turning off timestamps? #857

Open IndolentKheper opened 1 month ago

IndolentKheper commented 1 month ago

I'm no good with Python, so I'm struggling to figure things out. I have no idea how to get "without_timestamps True" to work: I tried adding it to the arguments, but it isn't recognized. I searched through everything and saw it referenced in asr.py, but I have no idea what to do with it. I just want the text diarized, so that the output differentiates the speakers without timestamps cluttering the transcript.

Mapik0 commented 2 weeks ago

I was struggling with a similar problem, but with --initial_prompt: I couldn't figure out how to add it in the Python script and kept getting an error that it isn't recognized. In one of the issues here, https://github.com/m-bain/whisperX/issues/645#issuecomment-2294863365, I found how someone added it, and since without_timestamps is handled in asr.py the same way, it should work for you as well.

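import whisperx

# Options passed through to the ASR model (the ones referenced in asr.py)
asr_options = {
    "without_timestamps": True,
}

# "large-v2" and "en" are placeholders; use your own model name and language
model = whisperx.load_model("large-v2", device="cuda", language="en", asr_options=asr_options)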
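
If the goal is just a transcript with speaker labels and no timestamps, another route is to format the diarized segments yourself instead of relying on the output files. I'm not certain that `without_timestamps` changes what the writers put in those files, so treat this as a rough sketch. It assumes each segment is a dict with a "text" key and usually a "speaker" key, which is the shape I'd expect in `result["segments"]` after `whisperx.assign_word_speakers`. The function name `format_transcript` and the sample data are made up for illustration.

```python
from typing import Iterable, Mapping


def format_transcript(segments: Iterable[Mapping], unknown: str = "UNKNOWN") -> str:
    """Render diarized segments as 'SPEAKER: text' lines, with no timestamps."""
    lines = []  # list of [speaker, text] pairs
    for seg in segments:
        # Assumption: a segment may lack a speaker label, so fall back to a default.
        speaker = seg.get("speaker", unknown)
        text = seg.get("text", "").strip()
        if not text:
            continue
        if lines and lines[-1][0] == speaker:
            # Same speaker as the previous segment: extend that line.
            lines[-1][1] += " " + text
        else:
            lines.append([speaker, text])
    return "\n".join(f"{speaker}: {text}" for speaker, text in lines)


# Hypothetical sample standing in for result["segments"].
sample = [
    {"start": 0.0, "end": 1.2, "text": " Hello there.", "speaker": "SPEAKER_00"},
    {"start": 1.3, "end": 2.0, "text": " How are you?", "speaker": "SPEAKER_00"},
    {"start": 2.1, "end": 3.0, "text": " Fine, thanks.", "speaker": "SPEAKER_01"},
]
print(format_transcript(sample))
```

With real output you would call `format_transcript(result["segments"])` and write the returned string to a text file. The "start" and "end" values are simply ignored, so this should work whether or not the model produced them.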