This PR unifies language detection across WhisperModel and BatchedInferencePipeline.
Summary:
Added support for new options in batched transcription:
language_detection_threshold
language_detection_segments
Updated WhisperModel.detect_language to include the improved language detection from #732 and added docstrings. It is now used inside the transcribe function.
Removed the following functions, as they are no longer needed:
WhisperModel.detect_language_multi_segment and its test
BatchedInferencePipeline.get_language_and_tokenizer
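
For reviewers who have not looked at #732, here is a minimal, library-free sketch of how the two options are meant to interact, as I understand it: segments are checked in order, the first one whose top language exceeds the threshold wins, and if none does, the language that came out on top in the most segments is used. The function name `pick_language`, the `segment_probs` input, and the default values are assumptions made for illustration; this is not the code in this PR, and the real implementation works on encoder output rather than ready-made probability dicts.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def pick_language(
    segment_probs: List[Dict[str, float]],
    language_detection_threshold: float = 0.5,  # assumed default, for illustration
    language_detection_segments: int = 1,  # assumed default, for illustration
) -> Tuple[str, float]:
    """Pick a language from per-segment language probabilities.

    `segment_probs` is a hypothetical input: one dict per audio segment,
    mapping a language code to its detected probability.
    """
    if not segment_probs:
        raise ValueError("segment_probs must contain at least one segment")
    if language_detection_segments < 1:
        raise ValueError("language_detection_segments must be >= 1")

    # For each language, the probabilities of the segments where it ranked first.
    wins: Dict[str, List[float]] = defaultdict(list)

    for probs in segment_probs[:language_detection_segments]:
        language, probability = max(probs.items(), key=lambda item: item[1])
        if probability > language_detection_threshold:
            # Confident enough: stop early and use this segment's result.
            return language, probability
        wins[language].append(probability)

    # No segment passed the threshold: fall back to a majority vote over
    # the top-ranked language of each inspected segment.
    language = max(wins, key=lambda lang: len(wins[lang]))
    return language, max(wins[language])


if __name__ == "__main__":
    segments = [
        {"en": 0.40, "de": 0.30},
        {"de": 0.45, "en": 0.40},
        {"de": 0.48, "en": 0.30},
    ]
    print(pick_language(segments, language_detection_segments=3))
```

In this sketch, raising `language_detection_segments` only matters when the early segments are ambiguous; a confident first segment returns immediately regardless of how many segments are allowed.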