Closed shirayu closed 1 year ago
Hi @shirayu ,
I see you have reverted my PR to fix #23. I understand that the tokenizer does not support a multilanguage mode; however, multilanguage transcription works fine on my end. I think there is an error in the code you edited: language should be None, not opts.language. Thanks for looking into this :)
I read the whisper code and noticed that a multilingual tokenizer is not supported in Whisper.
When language is None, the tokenizer is not for all languages but for English (en) for "multilingual whisper models" (tiny, base, small, medium, large). https://github.com/openai/whisper/blob/9e653bd0ea0f1e9493cb4939733e9de249493cfb/whisper/tokenizer.py#L295-L316
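To make the fallback concrete, here is a minimal sketch of the defaulting behaviour as I read it in the linked lines. It is a simplified paraphrase, not the actual whisper code: the function name resolve_tokenizer_settings is hypothetical, and the real function goes on to build a tokenizer object rather than return a tuple.

```python
from typing import Optional, Tuple


def resolve_tokenizer_settings(
    multilingual: bool,
    task: Optional[str] = None,
    language: Optional[str] = None,
) -> Tuple[str, Optional[str], Optional[str]]:
    """Hypothetical helper mirroring the defaults described above.

    Returns (tokenizer_name, task, language) after defaults are applied.
    """
    if multilingual:
        # For multilingual models, a missing language falls back to "en";
        # it does not mean "all languages".
        return "multilingual", task or "transcribe", language or "en"
    # English-only models ignore task and language entirely.
    return "gpt2", None, None


# Passing language=None to a multilingual model resolves to English.
print(resolve_tokenizer_settings(multilingual=True, language=None))
# An explicit language is kept as-is.
print(resolve_tokenizer_settings(multilingual=True, language="ja"))
```

Under this reading, language=None and language="en" give the same tokenizer for the multilingual models.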
Revert #20. Related to #21.