MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Other languages #145

Open peregilk opened 11 months ago

peregilk commented 11 months ago

I noticed that passing a language on to WhisperX does not work if that language is not in WhisperX's list of accepted languages. However, WhisperX does support loading your own model and specifying the Wav2Vec endpoint. Is this possible to do with whisper-diarization?

I want to load a Norwegian model: NbAiLabBeta/nb-whisper-small

Then I want WhisperX to accept the Wav2Vec model: NbAiLab/nb-wav2vec2-300m-bokmaal

With WhisperX, you can then do: whisperx examples/sample01.wav --model NbAiLabBeta/nb-whisper-small --align_model NbAiLab/nb-wav2vec2-300m-bokmaal --batch_size 4

This is an alternative to passing --language, which only works for {en, fr, de, es, it, ja, zh, nl, uk, pt}; for those languages WhisperX simply picks its preferred alignment model.
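To make the request concrete, here is a rough sketch of the selection logic I have in mind: an explicitly passed alignment model wins, otherwise fall back to a per-language default. The function name `resolve_align_model` and the `DEFAULT_ALIGN_MODELS` table are hypothetical, and the table values are placeholders rather than the model names WhisperX actually uses. This is not the WhisperX or whisper-diarization implementation, just an illustration of the behaviour.

```python
from typing import Optional

# Languages listed above as having a built-in default alignment model.
# The values are placeholders (assumption), not real model names.
DEFAULT_ALIGN_MODELS = {
    lang: f"<default-align-model-{lang}>"
    for lang in ("en", "fr", "de", "es", "it", "ja", "zh", "nl", "uk", "pt")
}


def resolve_align_model(language: Optional[str],
                        align_model: Optional[str] = None) -> str:
    """Pick the alignment model to load (hypothetical helper).

    An explicit align_model always takes precedence; otherwise the
    language code is looked up in the table of defaults.
    """
    if align_model:
        return align_model
    if language in DEFAULT_ALIGN_MODELS:
        return DEFAULT_ALIGN_MODELS[language]
    raise ValueError(
        f"No default alignment model for language {language!r}; "
        "pass an alignment model explicitly."
    )


# Norwegian has no default in the table, so the model is given explicitly.
print(resolve_align_model("no", "NbAiLab/nb-wav2vec2-300m-bokmaal"))
```

With something like this, an unsupported language would only fail when no alignment model is supplied, instead of failing as soon as the language code is not in the list.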

MahmoudAshraf97 commented 11 months ago

Thanks for the suggestion, I'll work on it.

peregilk commented 11 months ago

Great. FYI, I submitted a pull request to WhisperX to add support for Norwegian (Bokmål and Nynorsk), and it was accepted. In the next official release, Norwegian Bokmål (no) and Norwegian Nynorsk (nn) should be accepted.