-
Hello! I would like to use WhisperX and pyannote to combine automatic transcription and diarization. I can do it on Colab using the Hugging Face (HF) token, but I would like to avoid entering the HF to…
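One common way to avoid retyping the token each session is to set it once in the environment. This is a sketch under the assumption that the downstream stack (`huggingface_hub`, which pyannote uses for gated models) reads the standard `HF_TOKEN` environment variable; the token value shown is a hypothetical placeholder:

```python
import os

# Assumption: huggingface_hub picks up HF_TOKEN from the environment,
# so setting it once in the notebook avoids pasting it interactively
# every time the pipeline is loaded.
os.environ["HF_TOKEN"] = "hf_xxx"  # hypothetical placeholder token
```

Alternatively, `huggingface-cli login` stores the token on disk so it never needs to be entered in the notebook at all.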
-
I use this line of code to transcribe and diarize at the same time:
```python
!pipx run insanely-fast-whisper --file-name "/content/drive/MyDrive/aurore.wav" --hf_token
```
but I get more s…
-
Hello developers! Thank you so much for developing Resemblyzer; it is an amazing tool for me.
I have actually encountered a problem while developing: when my input audio contains 3 spea…
-
Logic will be to combine Whisper + pyannote.audio based on timestamps to output something along the lines of:
```
Person A: Hi
Person B: Hello, how are you
Person A: I'm good, and you?
....
```
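A minimal sketch of that timestamp-merging logic, using plain dicts for the Whisper segments and diarization turns (the data shapes here are assumptions for illustration, not the libraries' actual APIs):

```python
# Sketch: assign each transcript segment the speaker whose diarization
# turn overlaps it the most, then print speaker-labeled lines.

def assign_speakers(transcript_segments, diarization_turns):
    labeled = []
    for seg in transcript_segments:
        best_speaker, best_overlap = "UNKNOWN", 0.0
        for turn in diarization_turns:
            # Overlap of [seg.start, seg.end] with [turn.start, turn.end]
            overlap = min(seg["end"], turn["end"]) - max(seg["start"], turn["start"])
            if overlap > best_overlap:
                best_overlap, best_speaker = overlap, turn["speaker"]
        labeled.append({"speaker": best_speaker, "text": seg["text"]})
    return labeled

# Hypothetical Whisper output (start/end in seconds) ...
segments = [
    {"start": 0.0, "end": 1.0, "text": "Hi"},
    {"start": 1.2, "end": 3.0, "text": "Hello, how are you"},
    {"start": 3.2, "end": 4.5, "text": "I'm good, and you?"},
]
# ... and hypothetical pyannote-style speaker turns.
turns = [
    {"start": 0.0, "end": 1.1, "speaker": "Person A"},
    {"start": 1.1, "end": 3.1, "speaker": "Person B"},
    {"start": 3.1, "end": 4.6, "speaker": "Person A"},
]

for line in assign_speakers(segments, turns):
    print(f'{line["speaker"]}: {line["text"]}')
# Person A: Hi
# Person B: Hello, how are you
# Person A: I'm good, and you?
```

A real pipeline would fill `segments` from Whisper's result and `turns` by iterating the pyannote diarization output, but the overlap rule stays the same.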
-
### Tested versions
Tested on 3.1 vs 3.0
### System information
Debian GNU/Linux, torch 2.1.2
### Issue description
When running diarization pipeline on CPU, v3.1 is more than 2x slower…
-
I keep getting this error whether I try diarization 3.0 or some other version, despite accepting the user agreements on HF - are there any fixes here:
```
torchaudio.set_audio_backend("soundfile")
Could…
```
-
```
hyperpyyaml
>>Performing transcription...
>>Performing alignment...
>>Performing diarization...
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrad…
```
-
### Describe the feature
If I provide an audio file with multiple channels - e.g. an m4a recorded with multiple microphones - vibe currently only transcribes the first channel :(
good = …
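A rough sketch of two ways such a tool could handle the extra channels before transcription - downmix to mono, or transcribe each channel separately. NumPy arrays stand in for decoded audio here; none of these names are vibe's actual API:

```python
import numpy as np

def downmix_to_mono(audio: np.ndarray) -> np.ndarray:
    """Average all channels into one; audio has shape (channels, samples)."""
    return audio.mean(axis=0)

def split_channels(audio: np.ndarray) -> list[np.ndarray]:
    """Return each channel as its own mono signal, to transcribe separately."""
    return [audio[c] for c in range(audio.shape[0])]

# Tiny two-channel example (2 channels x 3 samples).
stereo = np.array([[0.0, 2.0, 4.0],
                   [2.0, 4.0, 6.0]])
print(downmix_to_mono(stereo))   # [1. 3. 5.]
print(len(split_channels(stereo)))  # 2
```

Splitting per channel is the more useful option when each microphone corresponds to one speaker, since the channel index then doubles as a speaker label.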
-
Hi, thanks for your code.
I am trying to use the model from "TOLD: A novel two-stage overlap-aware framework for Speaker Diarization", but cannot find the model (I found only the eend-ola code).
How can I expe…
-
Podcasts are usually conversations, so voice recognition is needed to identify the *author* and extract question-and-answer pairs from the transcript. Similar to video ingestion.
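A naive sketch of how question/answer pairs could be pulled from a speaker-labeled transcript once diarization has run. The heuristic (a line ending in "?" followed by a different speaker) is illustrative only:

```python
# Sketch: pair each question with the next utterance by a different speaker.
# Input is a list of (speaker, text) tuples, as produced by a
# diarization-plus-transcription step.

def extract_qa_pairs(lines):
    pairs = []
    for i, (speaker, text) in enumerate(lines[:-1]):
        next_speaker, next_text = lines[i + 1]
        # Heuristic: a question ends with "?" and is answered by someone else.
        if text.rstrip().endswith("?") and next_speaker != speaker:
            pairs.append((text, next_text))
    return pairs

transcript = [
    ("Host", "What got you into audio ML?"),
    ("Guest", "I started with music transcription."),
    ("Guest", "Then moved to speech."),
    ("Host", "Interesting."),
]
print(extract_qa_pairs(transcript))
# [('What got you into audio ML?', 'I started with music transcription.')]
```

A production version would need a real question classifier and handling for multi-turn answers, but the pairing step reduces to this kind of adjacency scan over speaker turns.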