m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License
12.61k stars, 1.33k forks
issues
Is the torch version or pyannote version warning important?
#933
ywangwxd
opened
23 minutes ago
0
Default config of `without_timestamps=True` affects whisper transcript quality.
#932
Artaches
opened
8 hours ago
0
WhisperX missing pieces of transcript compared to Whisper API
#931
tomhayw
opened
14 hours ago
1
Update MANIFEST.in to include necessary files
#930
frostming
opened
17 hours ago
0
Solution for Timestamps Not Appearing When Using Other Languages Like English in Korean Language Models
#929
THePhanT00M
opened
23 hours ago
0
crisper whisper just pluggable?
#928
kunibald413
opened
1 day ago
0
A module that was compiled using NumPy 1.x cannot be run in NumPy 2.0.1 as it may crash
#927
2424004764
opened
4 days ago
11
Chunking with stride
#926
pramadikaegamo
opened
5 days ago
0
FileNotFoundError
#925
diaverso
opened
5 days ago
0
whisperx.DiarizationPipeline takes a long time to load
#924
smallpize
opened
5 days ago
0
Support Arabic Language
#923
abdelkrimkr
opened
1 week ago
0
Feat: add new align models - SHORT
#922
Equipo45
opened
1 week ago
0
WhisperX can Generate the N-best (top few) hypotheses?
#921
hpjang
opened
1 week ago
0
Fail to generate segment
#920
leinace1001
closed
1 week ago
1
Any ways to reduce or calibrate the offset of word timeline?
#919
leinace1001
opened
2 weeks ago
0
TranscriptionOptions.__new__() missing 1 required positional argument: 'hotwords'
#918
Tejes
opened
2 weeks ago
2
supress_numerals is eliminating numbers from transcription, not considering them words.
#917
juangea
opened
2 weeks ago
0
Regarding the issue of sentence length
#916
heartInsert
opened
3 weeks ago
2
API server hangs after a certain period
#915
dineshveguru
closed
2 weeks ago
1
I need advice on the Wav2Vec2 English model.
#914
sulutian
opened
3 weeks ago
0
feat: Enable optional dynamic prompting for the FasterWhisperPipeline
#913
jameshu88
opened
3 weeks ago
0
Allows n_samples to be passed in detect_language
#912
marcelovjunior
opened
3 weeks ago
0
Why do the numbers in the ASR results not have a start and end timestamp?
#911
hpjang
opened
3 weeks ago
4
Word-level timestamps not working with python implementation
#910
rkulyassa
opened
3 weeks ago
0
Dockerfile for transcription and Speaker Diarization
#909
kowshik24
opened
3 weeks ago
3
Finetuned large-v3 inference problem.
#908
sinisha
opened
4 weeks ago
1
Phoneme-Based ASR For Arabic
#907
MustaphaLargou25
opened
4 weeks ago
0
Bad things Error!
#906
Chiyan200
opened
1 month ago
1
whisperX not working with Google Colab?
#905
m01ali
opened
1 month ago
2
Compatible with latest faster-whisper
#904
latent-variable
opened
1 month ago
0
libcudnn_cnn.so.9.1.0 issue
#903
kowshik24
opened
1 month ago
2
Unable to load any of {libcudnn_cnn.so.9.1.0, libcudnn_cnn.so.9.1, libcudnn_cnn.so.9, libcudnn_cnn.so}
#902
Leandrocnf
closed
1 month ago
7
WhisperX in Google colab Unable to load any of {libcudnn_ops.so.9.1.0, libcudnn_ops.so.9.1, libcudnn_ops.so.9, libcudnn_ops.so}
#901
sijitang
closed
1 month ago
11
Multiple improvements: language detection per segment, VAD min duration on/off, unique speakers, pyproject.toml and more.
#900
cvl01
opened
1 month ago
2
Could not locate `cudnn_ops_infer64_8.dll`. Please make sure it is in your library path!
#899
YoungPhlo
opened
1 month ago
3
Whisper large V3 turbo support?
#898
utility-aagrawal
opened
1 month ago
4
Model is Downloaded but not loaded jonatasgrosman--wav2vec2-large-xlsr-53-japanese
#897
andriken
opened
1 month ago
0
whisperX removing silence / pauses
#896
tsmdt
opened
1 month ago
0
Turbo-V3
#894
brainer3220
opened
1 month ago
13
Why can't we do multilanguage forced alignment without loading a language-specific alignment model?
#893
empz
opened
1 month ago
2
Return top-k detected languages with probabilities
#892
danielmunioz
opened
1 month ago
0
whisper based simple cross-lingual speech recognition demo
#891
pika-online
opened
1 month ago
0
cpu utilisation maxes at 50% (conda?)
#890
chboishabba
opened
1 month ago
0
[Feature] Silero VAD support
#889
3manifold
opened
2 months ago
0
Silero VAD support
#888
3manifold
opened
2 months ago
8
RuntimeError: No position encodings are defined for positions >= 448, but got position 448
#887
RichardQin1
opened
2 months ago
0
How to load model?
#886
salekeennayeem
closed
1 month ago
1
main branch code is not consistent with 3.1.1 release
#885
sabn0
opened
2 months ago
0
compute_type whisperX transcription - option to use float32?
#884
valericac
closed
2 months ago
1
Just use this script to make the srt more readable for the end results. almost perfect, try it and share your thoughts.
#883
search620
opened
2 months ago
1