Closed: agershun closed this issue 4 months ago
Please add the line:
language=language
to the whisperx.load_model() call in the transcribe_batched() function in transcription_helpers.py:
https://github.com/MahmoudAshraf97/whisper-diarization/blob/39572386eb4170fc16440b770666f23ccf9bdc80/transcription_helpers.py#L64
    # Faster Whisper batched
    whisper_model = whisperx.load_model(
        model_name,
        device,
        # <--- PLEASE ADD language=language HERE
        compute_type=compute_dtype,
        asr_options={"suppress_numerals": suppress_numerals},
    )
The problem: if the audio has a very long stretch of ringing at the very beginning of the file, the language parameter is missed and the model tries to detect the language automatically.
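A minimal sketch of the idea, assuming whisperx.load_model() accepts a language keyword. The helper build_load_model_kwargs() is hypothetical and not part of the repository; it only shows how the language could be forwarded when the caller supplies one, and the whisperx call itself is left as a comment so the snippet runs without the library installed.

```python
from typing import Any, Dict, Optional


def build_load_model_kwargs(
    compute_dtype: str,
    suppress_numerals: bool,
    language: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for whisperx.load_model (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "compute_type": compute_dtype,
        "asr_options": {"suppress_numerals": suppress_numerals},
    }
    # Forward the language only when the caller supplied one; leaving it out
    # lets the model fall back to automatic language detection.
    if language is not None:
        kwargs["language"] = language
    return kwargs


kwargs = build_load_model_kwargs("float16", True, language="en")
# With whisperx installed, the call would then look like:
# whisper_model = whisperx.load_model(model_name, device, **kwargs)
```

With language set, the model should not need to guess the language from the opening seconds of the audio.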
P.S. Sorry that I did not make a PR.
Just created the pull request:
https://github.com/MahmoudAshraf97/whisper-diarization/pull/149