-
Hey guys,
Does anyone have a guide on how I could use a finetuned whisper model with whisperX?
I have already finetuned a base whisper model from Huggingface. I understand the next step is to co…
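A common route (an assumption based on whisperX using the faster-whisper backend, which loads CTranslate2 weights rather than raw Hugging Face checkpoints) is to convert the finetuned checkpoint with CTranslate2's converter; paths below are placeholders:

```bash
# Convert a finetuned Hugging Face Whisper checkpoint to CTranslate2 format.
# whisperX's faster-whisper backend loads CT2 directories, not HF checkpoints.
pip install ctranslate2 transformers

ct2-transformers-converter \
  --model ./my-finetuned-whisper-base \
  --output_dir ./my-finetuned-whisper-base-ct2 \
  --copy_files tokenizer.json preprocessor_config.json \
  --quantization float16
```

Pointing `whisperx --model` (or `whisperx.load_model`) at the converted directory should then work, though whether every whisperX version accepts a local path there is untested here.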
-
Running the starter program, it detects the speakers. How do I get the speech text?
My use case is that I'd like to output a transcript in the following form:
Start_time, End_Time, SpeechText, Speaker_id
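As a minimal sketch of producing that format: the segment-dict layout below (`start`/`end`/`text`/`speaker` keys) mirrors what whisperX's diarization step attaches to segments, but treat the exact field names as an assumption.

```python
# Sketch: format diarized segments as "start_time, end_time, speech_text, speaker_id".
# The segment layout (start/end/text/speaker keys) is an assumption based on
# whisperX's diarized output; adapt the keys to your actual result dict.
import csv
import io

def segments_to_csv(segments):
    """Return CSV text with one row per segment:
    start_time, end_time, speech_text, speaker_id."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["start_time", "end_time", "speech_text", "speaker_id"])
    for seg in segments:
        writer.writerow([
            f"{seg['start']:.2f}",
            f"{seg['end']:.2f}",
            seg["text"].strip(),
            seg.get("speaker", "UNKNOWN"),
        ])
    return buf.getvalue()

segments = [
    {"start": 0.0, "end": 2.5, "text": " hello there", "speaker": "SPEAKER_00"},
    {"start": 2.5, "end": 4.0, "text": " hi", "speaker": "SPEAKER_01"},
]
print(segments_to_csv(segments))
```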
-
I have been using WhisperX for transcribing multi-speaker audio files and I enabled diarization to distinguish between different speakers. However, I noticed that the TXT format output does not includ…
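Since the built-in TXT writer drops the speaker labels, one workaround is a small post-processing step that writes them yourself; the segment keys below are assumptions matching whisperX's diarized output.

```python
# Sketch: write a TXT transcript that keeps the speaker label on each line.
# Assumes segments shaped like whisperX's diarized output (an assumption).
def segments_to_speaker_txt(segments):
    lines = []
    for seg in segments:
        speaker = seg.get("speaker", "UNKNOWN")
        lines.append(f"[{speaker}] {seg['text'].strip()}")
    return "\n".join(lines)

segments = [
    {"text": " How are you?", "speaker": "SPEAKER_00"},
    {"text": " Fine, thanks.", "speaker": "SPEAKER_01"},
]
print(segments_to_speaker_txt(segments))
# [SPEAKER_00] How are you?
# [SPEAKER_01] Fine, thanks.
```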
-
I've been using WhisperLive with great success recently in multiple languages. Seriously amazing. I recently noticed the support for `initial_prompt` which was added in January, and tried applying i…
-
Hi, I have a text where the audio includes numbers (e.g. 16, 29, 32). `whisperx` loads the audio and transcribes it perfectly, but when I try to run the word alignment, I stumble upon an issue -…
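A likely cause is that the phoneme alignment model's character dictionary contains no digits, so numeric tokens fail to align. One workaround is spelling out numbers before alignment; the helper below is a hand-rolled sketch covering only 0–99 (a library such as `num2words` would be more general).

```python
import re

# Minimal digit-to-words helper for 0-99, enough to illustrate the workaround.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n):
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")

def spell_out_numbers(text):
    """Replace bare integers 0-99 with words so the alignment model's
    character dictionary can handle them."""
    return re.sub(r"\b\d{1,2}\b",
                  lambda m: number_to_words(int(m.group())), text)

print(spell_out_numbers("items 16, 29 and 32"))
# items sixteen, twenty-nine and thirty-two
```

Run the normalized text through alignment instead of the raw transcript; the original numerals can be kept in a parallel copy if you need them in the final output.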
-
I have updated the package to the latest version with the merged pyannote.audio 3.0.1.
However, I am still experiencing slow diarization processing times.
After checking the Task Manag…
-
Is there a way to run WhisperX locally with no internet once the specified model has been downloaded?
Once in a while I get an error running the inference due to Hugging Face being down (error msg be…
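Once the models are cached locally, the Hugging Face Hub client can be told never to touch the network via environment variables; this is a documented `huggingface_hub`/`transformers` mechanism, though whether every whisperX code path respects it is an assumption.

```python
import os

# Serve everything from the local cache and never hit the network.
# The models must already have been downloaded once while online.
os.environ["HF_HUB_OFFLINE"] = "1"
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # same idea for transformers

# import whisperx  # import only after setting the env vars
```

The same variables can also be exported in the shell before invoking the `whisperx` CLI.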
-
**Update** -- actually after the following fix, it works and generates the diarization.
**After** installing whisperX:
```bash
!pip install light-the-torch
!ltt install torch==1.13.1 torchvision==…
-
Hi there, somehow I couldn't export an SRT file with word timestamps enabled; no word-level SRT file is generated.
e.g.
`whisperx --verbose True --model large-v2 --language en --hf_token XXXXX…
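One possible explanation (an assumption based on whisperX inheriting OpenAI Whisper's subtitle-writer options): word-level timing appears as per-word highlighting inside the regular SRT rather than as a separate word-level file, and needs the highlighting flag enabled:

```bash
# Assumption: word-level timing is exposed via the subtitle writer's
# highlighting option, not a separate word-level SRT file.
whisperx audio.wav --model large-v2 --language en \
  --highlight_words True --output_format srt
```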
-
Hello! I'm trying to understand the releases on PyPI. PyPI lists two releases: 3.1.2 and 3.1.1 (https://pypi.org/project/whisperx/), both published on February 6th this year. But here on GitHub, the l…
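When the PyPI releases lag behind (or don't match) the repository, installing straight from GitHub is a common fallback; the commands below are standard `pip` usage, not something specific to this project's release process:

```bash
# Install the latest code straight from the GitHub repository:
pip install git+https://github.com/m-bain/whisperX.git

# Or pin one of the PyPI releases explicitly:
pip install whisperx==3.1.1
```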