m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License
11.92k stars, 1.26k forks

Is batch transcribe multiple audio files supported? #589

Open salahzoubi opened 11 months ago

salahzoubi commented 11 months ago

Whisper large-v3 recently seems to support batch transcription (i.e. transcribing multiple audio files at once). Is this feature available in whisperX with large-v3? If so, could someone share a small code snippet showing how it works?

If not, are there plans to add it? It would be very useful!

Khaztaroth commented 11 months ago

The way I've been doing it is by taking advantage of PowerShell and using:

Get-ChildItem -filter "*.mp4" | ForEach-Object { whisperx $_.name SETTINGS }

PowerShell gets every file with that extension (it could be any format that whisperx supports). For each file it finds, it runs whisperx, waits for the process to finish, and moves on to the next one.

$_.name fills in the file name; you could use $_.FullName if you want the full path.

A native option would be neat but this works just as well.
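For anyone not on PowerShell, the same one-file-at-a-time loop can be sketched in Python using only the standard library. This is a minimal sketch, not a whisperX feature: it assumes the `whisperx` command is on the PATH, and `build_commands`, `transcribe_folder`, and `extra_args` (standing in for the SETTINGS placeholder above) are hypothetical names chosen for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(folder, pattern="*.mp4", extra_args=()):
    """Return one `whisperx <file> ...` command per matching file, sorted by name."""
    files = sorted(Path(folder).glob(pattern))
    # Each path keeps the `folder` prefix, so pass an absolute folder
    # if you want full paths (similar to $_.FullName).
    return [["whisperx", str(path), *extra_args] for path in files]


def transcribe_folder(folder, pattern="*.mp4", extra_args=(), run=subprocess.run):
    """Run the commands one after another, as the ForEach-Object loop does."""
    for cmd in build_commands(folder, pattern, extra_args):
        # check=True raises CalledProcessError if whisperx exits with an error,
        # which stops the loop at the failing file.
        run(cmd, check=True)
```

Calling `transcribe_folder(".")` would process every `.mp4` in the current directory in sequence. The `run` parameter exists only so the loop can be dry-run or tested without launching whisperx.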

salahzoubi commented 11 months ago

@Khaztaroth unfortunately, this does not make use of the available GPU memory and does not run in parallel. This feature is already supported in the HF pipeline, so I was wondering whether the authors could support it here...
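As a stopgap for the "not in parallel" part, several whisperx processes can be kept in flight at once from Python. This is only a sketch under stated assumptions: it is not batched inference across files as described for the HF pipeline, each process loads its own copy of the model, so `max_workers` is limited by GPU memory, and `transcribe_concurrently` and its parameters are hypothetical names, not part of whisperX.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def transcribe_concurrently(files, extra_args=(), max_workers=2, run=subprocess.run):
    """Run up to `max_workers` whisperx processes at the same time.

    Returns a dict mapping each file path to its process exit code.
    """
    def one(path):
        # Each call launches a separate whisperx process for one file.
        result = run(["whisperx", str(path), *extra_args])
        return str(path), result.returncode

    # Threads are enough here: they only wait on the child processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(one, files))
```

A non-zero exit code in the returned dict marks a file that failed, so it can be retried on its own. Whether this is faster than the sequential loop depends on how much GPU memory is left after one model is loaded.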

Khaztaroth commented 11 months ago

I see, I missed the part about "at once"

Kuiriel commented 3 months ago

Thank you so much. This saves me a bunch of time when transcribing many audio files for my wife. I can now walk away and parent while the menial secretarial work takes care of itself.

To get it running in PowerShell, all I had to do was launch an Anaconda Powershell Prompt, but there are other solutions here: https://stackoverflow.com/questions/64149680/how-can-i-activate-a-conda-environment-from-powershell