m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

Huggingface Authentication Issues probably related to pyannote #498

Open Dannypeja opened 9 months ago

Dannypeja commented 9 months ago

Are we having this again? Please confirm that I am not stupid. I have been desperate for 6 hours. I thought it was my fault somehow.

#490: pyannote related issue

I am getting this error:

Could not download 'pyannote/speaker-diarization-3.0' pipeline.
It might be because the pipeline is private or gated so make
sure to authenticate. Visit https://hf.co/settings/tokens to
create your access token and retry with:

   >>> Pipeline.from_pretrained('pyannote/speaker-diarization-3.0',
   ...                          use_auth_token=YOUR_AUTH_TOKEN)
Dannypeja commented 9 months ago

Weird. I am only having issues when I use it inside a Docker container. Natively it is fine. Any ideas?
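
One difference between a host and a container is that a token saved on the host (for example by `huggingface-cli login`) is usually not present inside the image. Whether that is the cause here is an assumption, not something confirmed in this thread. The sketch below reports where a token could be picked up from in the current environment. The variable names `HF_TOKEN` and `HUGGING_FACE_HUB_TOKEN`, `HF_HOME`, and the default token file `~/.cache/huggingface/token` follow huggingface_hub conventions; the helper name `find_hf_token` is hypothetical, and other overrides the library may honor are ignored.

```python
import os
from pathlib import Path


def find_hf_token(env=None, home=None):
    """Return (source, token) for the first token found, else (None, None).

    Hypothetical helper: checks environment variables first, then the
    token file that `huggingface-cli login` writes.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Environment variables take priority over the token file.
    for name in ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return name, value

    # HF_HOME overrides the default cache location (~/.cache/huggingface).
    hf_home = Path(env.get("HF_HOME") or home / ".cache" / "huggingface")
    token_file = hf_home / "token"
    if token_file.is_file():
        value = token_file.read_text().strip()
        if value:
            return str(token_file), value

    return None, None
```

Running `print(find_hf_token()[0])` on the host and again inside the container shows whether the container sees any token at all. If it prints `None` there, passing the token explicitly via `use_auth_token` or an environment variable is one thing to try.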

kaihe-stori commented 9 months ago

Did you accept terms for the new 3.0 model cards?

If 3.0 isn't working for you (it is slower than 2.1, as discussed in #499), then try going back to 2.1:

import whisperx
model = whisperx.DiarizationPipeline(model_name='pyannote/speaker-diarization@2.1', use_auth_token=YOUR_AUTH_TOKEN, device='cuda')
Dannypeja commented 9 months ago

I did not. Now I did. Still the same error. Interestingly, it works on the host, and it worked even before I accepted the terms. But inside a Docker container it does not work.

How can I go back to 2.1 when using the whisperx CLI?

Dannypeja commented 9 months ago

Okay, so I need to accept the terms not only for the 3.0 diarization model but also for segmentation: https://hf.co/pyannote/segmentation-3.0

Wow, a full day just for this.

Yes, it seems to be slower. I would still be interested to learn how I can revert to 2.1 using the whisperx CLI.

kaihe-stori commented 9 months ago

Yes, both segmentation and diarization. You can try my hack in #499 to make 3.0 work fast.

Dannypeja commented 9 months ago

I'll try it, thanks! Is 3.0 faster than 2.1? Otherwise I think I'll stay at 2.1 because it works fine.

kaihe-stori commented 9 months ago

Slightly faster from the limited samples I saw. But I am sticking with 2.1 until no hack is needed for 3.0.

7k50 commented 9 months ago

Currently readme.md appears to link to the previous segmentation model on Huggingface:

https://huggingface.co/pyannote/segmentation

instead of:

https://huggingface.co/pyannote/segmentation-3.0

Uzzije commented 9 months ago

I believe you have to accept the conditions for both of these models: https://huggingface.co/pyannote/segmentation-3.0 and https://huggingface.co/pyannote/speaker-diarization-3.0

freshpearYoon commented 6 months ago

Hi, I accepted both, but I am having the same problem. Is there anything more I can do?

kaihe-stori commented 6 months ago

whisperx now uses a newer diarization model (3.1). Accept the terms for this one instead: https://huggingface.co/pyannote/speaker-diarization-3.1
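
To see which gated repository is still blocking access, each one can be probed separately. This is a minimal sketch, assuming `huggingface_hub` is installed and that each repository has a `config.yaml` at its root (an assumption about how the pyannote repos are laid out). `check_gated_access` and `_default_probe` are hypothetical helper names, and the repo list simply collects models mentioned in this thread.

```python
REPOS = (
    "pyannote/segmentation-3.0",
    "pyannote/speaker-diarization-3.0",
    "pyannote/speaker-diarization-3.1",
)


def _default_probe(repo_id, token):
    # Imported lazily so check_gated_access can be exercised without the library.
    from huggingface_hub import hf_hub_download

    # Assumption: a small config.yaml exists at the repo root.
    hf_hub_download(repo_id=repo_id, filename="config.yaml", token=token)


def check_gated_access(repo_ids, token, probe=_default_probe):
    """Map each repo id to "ok" or to the name of the exception the probe raised."""
    results = {}
    for repo_id in repo_ids:
        try:
            probe(repo_id, token)
            results[repo_id] = "ok"
        except Exception as exc:  # e.g. a gated-repo error when terms are not accepted
            results[repo_id] = type(exc).__name__
    return results
```

Calling `check_gated_access(REPOS, YOUR_AUTH_TOKEN)` with the same token, inside the same environment that whisperx runs in (for example the Docker container), would show which of the repositories still rejects the token.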

ZizhuangCui commented 2 months ago

1. visit hf.co/pyannote/speaker-diarization and accept user conditions

2. visit hf.co/pyannote/segmentation and accept user conditions

I have now accepted the terms for segmentation, segmentation-3.0, speaker-diarization-3.0, and speaker-diarization-3.1, and I still have no permission.