Closed agershun closed 5 months ago
Hello, and thanks for the contribution. The language parameter is already used here: https://github.com/MahmoudAshraf97/whisper-diarization/blob/39572386eb4170fc16440b770666f23ccf9bdc80/transcription_helpers.py#L70. Passing it there has the same effect as passing it when loading the model: in both cases it only affects the tokenizer, so this PR has no effect on the final result. Please correct me if I'm missing something.
I tried the same file with and without this language parameter in whisperx.load_model().
You can see the difference in the logs:
Without the parameter, it does not take the language into account and tries to detect the language itself.
Or... do you mean that this message is only a warning shown at model loading time?
PS. Thank you very much for the program!
Exactly. Adding the language parameter while loading the model only hides this warning. It is useful only when running inference on multiple audio files, which isn't the case here.
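To illustrate the point above, here is a toy sketch of the behaviour described in this thread. It is not whisperx code: `ToyPipeline`, `load_model`, the warning text, and the placeholder detector are all hypothetical stand-ins, written under the assumption that a language passed at transcription time is used regardless of what was given at load time.

```python
from typing import List, Optional


class ToyPipeline:
    """Hypothetical stand-in for an ASR pipeline (NOT the whisperx implementation)."""

    def __init__(self, language: Optional[str] = None) -> None:
        self.preset_language = language
        self.warnings: List[str] = []
        if language is None:
            # Placeholder message; the real library's wording may differ.
            self.warnings.append("no language given at load time")

    def _detect_language(self, audio: List[float]) -> str:
        # Placeholder detector, assumed for illustration only.
        return "en"

    def transcribe(self, audio: List[float], language: Optional[str] = None) -> dict:
        # The language passed here wins; the preset and detection are fallbacks.
        effective = language or self.preset_language or self._detect_language(audio)
        return {"language": effective}


def load_model(language: Optional[str] = None) -> ToyPipeline:
    """Hypothetical loader mirroring the shape of the call discussed above."""
    return ToyPipeline(language)


audio = [0.0] * 16000  # one second of silence at an assumed 16 kHz

# Language given only at transcription time (the situation before the PR).
without_preset = load_model().transcribe(audio, language="ru")

# Language given at load time as well (what the PR adds).
with_preset = load_model(language="ru").transcribe(audio, language="ru")
```

In this sketch both calls resolve to the same language; the only difference is whether a warning is recorded at load time. A load-time preset would matter only if `transcribe` were called without a language, for example when reusing one model across several files.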
The problem: if the audio has very long rings at the very beginning of the file, it misses the language parameter and tries to detect the language automatically.
After analyzing the source code, I found that the language parameter is missing from the whisper.load_model() call.