linto-ai whisper-timestamped issues

linto-ai / whisper-timestamped

Multilingual Automatic Speech Recognition with word-level timestamps and confidence

GNU Affero General Public License v3.0

1.87k stars 149 forks

issues

How to add new _ALIGNMENT_HEADS?

#139 JC1DA closed 9 months ago
4
Please consider creating a node version

#138 55Cancri closed 6 months ago
1
Language detection for large-v3

#136 andruxa-smirnov closed 10 months ago
11
How to get the progress bar when transcribing through a Celery task?

#135 dchapelet closed 10 months ago
2
Large-v3

#134 takefy-dev closed 10 months ago
11
German Audios translated instead of transcribed

#133 palibvb closed 10 months ago
2
Issue with large-v3 model testing and channel mismatch error during audio processing

#132 Nondzu closed 10 months ago
3
Unable to transcribe audio when using a fine-tuned whisper medium model

#130 yilmazay74 closed 10 months ago
2
Max length using python

#127 jeff11-1-1 closed 10 months ago
1
Key words in segment is missing

#126 mperetto closed 11 months ago
4
whisper-timestamped is not thread-safe (AssertionError when multi-threading)

#124 Rtut654 opened 11 months ago
2
Timestamps always start at 0.

#122 Thomcle closed 10 months ago
1
Question: is DTW also possible for SeamlessM4T?

#121 doublex closed 1 year ago
0
Sentences that definitely should be getting picked up are not

#118 Thomasssb1 closed 1 year ago
1
Option to change unit time (seconds -> milliseconds)

#116 Thomasssb1 opened 1 year ago
0
Entire sentences are not transcribed

#115 mirix closed 10 months ago
2
NumbaDeprecationWarning

#114 xjq284 closed 10 months ago
1
Can I get the phonemes of a word and a confidence score for each?

#113 tiennguyen12g closed 10 months ago
1
Add support for Intel Arc GPUs

#112 furkanguzel161 closed 10 months ago
0
simple split audio file example using whisper-timestamped

#111 silvacarl2 closed 10 months ago
3
Why does it take so long to process a 1 minute video?

#110 Root-FTW closed 10 months ago
3
Improve readability and grammar in README

#109 SimonBaars closed 1 year ago
1
End time too early, cuts off the last word

#108 saltarob opened 1 year ago
18
Punctuation and capitalisation

#106 mirix closed 1 year ago
3
option `--suppress_token` to reduce hallucinations / output special noise descriptions

#105 misutoneko closed 1 year ago
5
Feature Request: option to align words with the vowel of the first syllable rather than the first consonant.

#103 JeromeNelsonC opened 1 year ago
0
What are the best settings for most accurate/perfect time stamps?

#101 silvacarl2 closed 1 year ago
2
Split audio into pieces directly from output of timestamps?

#100 silvacarl2 closed 1 year ago
2
ok, this is amazing but how can we transform it into an API?

#98 silvacarl2 closed 10 months ago
4
Timing off by multiple seconds

#97 tjthejuggler closed 1 year ago
5
CUDA or is it me? Windows.

#96 pinballelectronica closed 10 months ago
2
Update transcribe.py

#95 anita-arch closed 10 months ago
3
Generates duplicated phrases

#94 x180380 opened 1 year ago
8
Suggestion to change the way output-files are saved

#93 oep42 closed 1 year ago
3
Fixes typo in README.md

#92 NatanFreeman closed 1 year ago
2
Trouble with timings

#91 boxabirds closed 1 year ago
4
I want to get high performance - azure

#90 rairavi closed 1 year ago
1
Deprecation warning

#89 aeschylus closed 1 year ago
1
Error for long (1 hr) Hindi video - used large-v2 whisper model

#87 rairavi closed 1 year ago
3
Words are correct but regular subtitles appear too early and linger?

#85 2600box closed 1 year ago
3
Can't install on M2 Mac

#83 boxabirds closed 1 year ago
3
[Idea] Basic timestamp validation

#82 misutoneko opened 1 year ago
8
Improve Whisper transcription using transcript

#81 lenaten closed 1 year ago
14
AssertionError with --vad, only with medium model

#80 misutoneko closed 1 year ago
2
Try VAD with auditok

#78 Jeronymous closed 9 months ago
1
High Memory Usage and "Killed: 9" Error.

#75 mcgreenwood closed 1 year ago
5
VAD does not handle almost complete silence

#74 freddyertl closed 9 months ago
23
--vad impacts recognition accuracy

#72 freddyertl closed 1 year ago
6
End time of one word is the start time of the next one

#71 konradipipan closed 1 year ago
1
Regular whisper model is still downloaded when using Hugging Face models

#70 blueskyleaf closed 1 year ago
3