This PR integrates WhisperX into our current pipeline. WhisperX enables batched computation, which leads to faster inference. The current status is 25x realtime speed.
Major changes:
- WhisperX integration
- GPU-based sbatch job submission
- Optimized sbatch time limit instead of the default 24 h (see the time-limit sketch below)
- `batch_size` chosen based on available GPU memory to decrease inference time and avoid CUDA OutOfMemory errors (see the batch-size sketch below)
- `ffmpeg` run in a subprocess for faster input file conversion (see the ffmpeg sketch below)
- Pyannote diarization and `WhisperX` transcription parallelized for a 2x pipeline speedup (see the parallelization sketch below)
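
A minimal sketch of how a per-job time limit could be derived from the input length instead of the default 24 h, assuming the 25x realtime figure above plus a hypothetical safety factor. The helper name and both factors are assumptions for illustration, not values taken from this PR:

```python
import math

def sbatch_time_limit(audio_seconds: float,
                      realtime_factor: float = 25.0,
                      safety_factor: float = 4.0,
                      min_minutes: int = 10) -> str:
    """Return an sbatch --time value (HH:MM:SS) sized to the input audio."""
    # Expected runtime at 25x realtime, padded by a safety factor and floored
    # at a minimum so very short inputs still get a workable allocation.
    minutes = math.ceil(audio_seconds / realtime_factor / 60 * safety_factor)
    minutes = max(minutes, min_minutes)
    return f"{minutes // 60:02d}:{minutes % 60:02d}:00"
```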
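
A minimal sketch of picking `batch_size` from free GPU memory; the function name and the memory thresholds are illustrative assumptions, not the values used in this PR:

```python
import torch

def pick_batch_size(default: int = 16) -> int:
    # Fall back to a conservative default when no GPU is visible.
    if not torch.cuda.is_available():
        return default
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    free_gib = free_bytes / 1024**3
    # Illustrative thresholds: more free memory allows larger batches.
    if free_gib >= 32:
        return 64
    if free_gib >= 16:
        return 32
    if free_gib >= 8:
        return 16
    return 8
```

The resulting value would be passed as the `batch_size` argument to the WhisperX transcribe call; capping it to free memory is what avoids the CUDA OutOfMemory errors mentioned above.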
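
A minimal sketch of the ffmpeg conversion step, assuming the target format is 16 kHz mono WAV (the sample rate Whisper-family models consume); the function name is hypothetical:

```python
import subprocess

def convert_to_wav(src: str, dst: str) -> None:
    # -y: overwrite output; -ac 1: downmix to mono; -ar 16000: resample to 16 kHz.
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst],
        check=True,
        capture_output=True,
    )
```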
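
A minimal sketch of running the two stages concurrently with a thread pool; `transcribe` and `diarize` are hypothetical stand-ins for the WhisperX and Pyannote calls. Threads suffice here because both stages spend their time in GPU kernels and native code rather than holding the Python GIL:

```python
from concurrent.futures import ThreadPoolExecutor

def run_stages(audio_path: str, transcribe, diarize):
    # Submit both stages at once; result() blocks until each finishes.
    with ThreadPoolExecutor(max_workers=2) as pool:
        transcription = pool.submit(transcribe, audio_path)
        diarization = pool.submit(diarize, audio_path)
        return transcription.result(), diarization.result()
```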
Required Files
WhisperX requires this VAD model to be available in the `TORCH_HOME` directory. `TORCH_HOME` can be queried via `torch.hub._get_torch_home()`; in the `speech2text` module it points to `/scratch/shareddata/speech2text`.
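
A quick way to check where torch will look, assuming the environment variable is set before the model is loaded:

```python
import os
import torch.hub

# Point the torch cache at the shared directory used by speech2text.
os.environ["TORCH_HOME"] = "/scratch/shareddata/speech2text"

# torch resolves its cache dir from TORCH_HOME (default: ~/.cache/torch);
# the VAD model file must be present under this directory.
print(torch.hub._get_torch_home())
```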
Environment Variables
For `SPEECH2TEXT_CPUS_PER_TASK`, 6 is enough as we are using