m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Using large-v3 returns some segments in all uppercase #880

Open caryknoop opened 2 months ago

caryknoop commented 2 months ago

Using large-v3 returns some segments in all uppercase

RaulKite commented 1 month ago

You can improve that by using a well-formed initial prompt.
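
For anyone landing here, a minimal sketch of what passing an initial prompt might look like from Python. This assumes your installed whisperX version accepts an `initial_prompt` key in the `asr_options` argument of `whisperx.load_model`; check that against your version. The prompt text, the `build_asr_options` and `transcribe` helper names, and the `compute_type` and `batch_size` values are illustrative only, not something from this thread.

```python
# Hypothetical prompt: mixed-case, punctuated sentences that show the
# decoder the output style we want (not taken from the thread).
INITIAL_PROMPT = (
    "Hello, welcome to the show. Today we will talk about speech recognition."
)


def build_asr_options(initial_prompt: str) -> dict:
    """Build the dict passed as `asr_options`.

    Assumption: the installed whisperX version reads an `initial_prompt` key.
    """
    return {"initial_prompt": initial_prompt}


def transcribe(audio_path: str, device: str = "cuda") -> dict:
    """Transcribe one file with large-v3 and the prompt above."""
    # Imported lazily so build_asr_options() works without whisperX installed.
    import whisperx

    model = whisperx.load_model(
        "large-v3",
        device,
        compute_type="float16",  # illustrative; pick what your hardware supports
        asr_options=build_asr_options(INITIAL_PROMPT),
    )
    audio = whisperx.load_audio(audio_path)
    return model.transcribe(audio, batch_size=16)  # batch size is illustrative
```

The idea is that a prompt written in normal sentence case with punctuation nudges the model toward the same style, which may reduce the all-uppercase segments. It is a hint to the decoder, not a guarantee.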