m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Option to download to a directory not available for DiarizationPipeline? #674

Open utility-aagrawal opened 9 months ago

utility-aagrawal commented 9 months ago

Hi,

I need to have all my models available locally for my use case. I see that transcription has a download_root parameter and the alignment model has a model_dir parameter for this purpose. Is there no such parameter for DiarizationPipeline?

I can submit a PR for this change if you think the feature would be helpful. Let me know. I believe it would be really useful for people who want to run everything locally. I understand that once the model is downloaded from Hugging Face it runs locally, but I cannot download anything in my target environment, so I need the models to be available locally beforehand. Thanks!
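
For context, here is a minimal sketch of how the two existing parameters can be pointed at local directories. The helper names (`local_model_dirs`, `load_local_models`), the `asr`/`align` sub-directory layout, and the `large-v2`/`int8` choices are assumptions made for illustration. Only `download_root` on `whisperx.load_model` and `model_dir` on `whisperx.load_align_model` come from the library.

```python
from pathlib import Path


def local_model_dirs(root):
    """Create and return one sub-directory per model type under root.

    The "asr"/"align" layout is a hypothetical convention, not something
    whisperX requires.
    """
    dirs = {name: Path(root) / name for name in ("asr", "align")}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs


def load_local_models(root, device="cpu", language_code="en"):
    """Load the ASR and alignment models, caching them under root."""
    import whisperx  # imported lazily so the helper above works without it

    dirs = local_model_dirs(root)
    # download_root: directory used for the transcription model files
    asr_model = whisperx.load_model(
        "large-v2", device, compute_type="int8",
        download_root=str(dirs["asr"]),
    )
    # model_dir: directory used for the alignment model files
    align_model, metadata = whisperx.load_align_model(
        language_code=language_code, device=device,
        model_dir=str(dirs["align"]),
    )
    return asr_model, align_model, metadata
```

Running `load_local_models("/path/to/models")` once on a machine with network access should populate the directories, which can then be copied to the offline environment.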

utility-aagrawal commented 9 months ago

@m-bain, thanks for sharing your impressive work! Could you comment on this request, please?

utility-aagrawal commented 8 months ago

@m-bain, could you comment on this, please?

metheofanis commented 8 months ago

Diarization runs locally anyway. The model is only "gated" on Hugging Face. If you want to run it "ungated", see the FAQ section in the pyannote GitHub repository.
The second option reads: "Can I use gated models (and pipelines) offline?"

This helped me do what you have described.
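
A rough sketch of that offline approach, assuming the pipeline's `config.yaml` and the model checkpoints it references have already been copied into one local directory, and that the paths inside `config.yaml` have been edited to point at those local files. The function names and the directory layout are hypothetical. `Pipeline.from_pretrained` accepting a local config path is pyannote.audio behaviour, and `HF_HUB_OFFLINE` is a `huggingface_hub` environment variable.

```python
import os
from pathlib import Path


def resolve_local_config(model_dir):
    """Return the path to config.yaml inside model_dir, or raise if missing."""
    config = Path(model_dir) / "config.yaml"
    if not config.is_file():
        raise FileNotFoundError(f"no config.yaml found in {model_dir}")
    return config


def load_offline_diarization(model_dir, device="cpu"):
    """Load a pyannote pipeline from local files only (hypothetical helper)."""
    config = resolve_local_config(model_dir)
    # Ask huggingface_hub not to touch the network. This only takes effect
    # if huggingface_hub has not been imported yet in this process.
    os.environ.setdefault("HF_HUB_OFFLINE", "1")

    import torch
    from pyannote.audio import Pipeline  # imported lazily

    # A local config file is used directly instead of a Hub model id.
    pipeline = Pipeline.from_pretrained(config)
    pipeline.to(torch.device(device))
    return pipeline
```

The resulting pipeline can be called directly on an audio file. Wiring it into whisperX's `DiarizationPipeline` would still need a change like the one proposed in this issue.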

utility-aagrawal commented 8 months ago

Thanks for your suggestion, @metheofanis! I'll check it out.