m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

OSError: [WinError 1314] #283

Open VimWei opened 1 year ago

VimWei commented 1 year ago

When I run the Python usage example:

# 3. Assign speaker labels
diarize_model = whisperx.DiarizationPipeline(use_auth_token="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", device=device)

# add min/max number of speakers if known
diarize_segments = diarize_model(audio_file)
# diarize_model(audio_file, min_speakers=min_speakers, max_speakers=max_speakers)

The download succeeds:

Downloading pytorch_model.bin: 100%|█| 17.7M/17.7M [00:00<00:00, 28.1
Downloading (…)/2022.07/config.yaml: 100%|██| 318/318 [00:00<?, ?B/s]
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.2. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\chenw\.cache\torch\pyannote\models--pyannote--segmentation\snapshots\c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b\pytorch_model.bin`
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.0+cpu. Bad things might happen unless you revert torch to 1.x.
Downloading (…)ain/hyperparams.yaml: 100%|█| 1.92k/1.92k [00:00<00:00

But there is an OSError. What should I do?

OSError                              Traceback (most recent call last)
Cell In[7], line 2
      1 # 3. Assign speaker labels
----> 2 diarize_model = whisperx.DiarizationPipeline(use_auth_token="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", device=device)
      4 # add min/max number of speakers if known
      5 diarize_segments = diarize_model(audio_file)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\whisperx\diarize.py:16, in DiarizationPipeline.__init__(self, model_name, use_auth_token, device)
     14 if isinstance(device, str):
     15     device = torch.device(device)
---> 16 self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyannote\audio\core\pipeline.py:135, in Pipeline.from_pretrained(cls, checkpoint_path, hparams_file, use_auth_token, cache_dir)
    133 params = config["pipeline"].get("params", {})
    134 params.setdefault("use_auth_token", use_auth_token)
--> 135 pipeline = Klass(**params)
    137 # freeze  parameters
    138 if "freeze" in config:

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyannote\audio\pipelines\speaker_diarization.py:165, in SpeakerDiarization.__init__(self, segmentation, segmentation_duration, segmentation_step, embedding, embedding_exclude_overlap, clustering, embedding_batch_size, segmentation_batch_size, der_variant, use_auth_token)
    162     metric = "not_applicable"
    164 else:
--> 165     self._embedding = PretrainedSpeakerEmbedding(
    166         self.embedding, use_auth_token=use_auth_token
    167     )
    168     self._audio = Audio(sample_rate=self._embedding.sample_rate, mono="downmix")
    169     metric = self._embedding.metric

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyannote\audio\pipelines\speaker_verification.py:490, in PretrainedSpeakerEmbedding(embedding, device, use_auth_token)
    458 """Pretrained speaker embedding
    459 
    460 Parameters
   (...)
    486 >>> embeddings = get_embedding(waveforms, masks=masks)
    487 """
    489 if isinstance(embedding, str) and "speechbrain" in embedding:
--> 490     return SpeechBrainPretrainedSpeakerEmbedding(
    491         embedding, device=device, use_auth_token=use_auth_token
    492     )
    494 elif isinstance(embedding, str) and "nvidia" in embedding:
    495     return NeMoPretrainedSpeakerEmbedding(embedding, device=device)

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\pyannote\audio\pipelines\speaker_verification.py:249, in SpeechBrainPretrainedSpeakerEmbedding.__init__(self, embedding, device, use_auth_token)
    246 self.device = device or torch.device("cpu")
    247 self.use_auth_token = use_auth_token
--> 249 self.classifier_ = SpeechBrain_EncoderClassifier.from_hparams(
    250     source=self.embedding,
    251     savedir=f"{CACHE_DIR}/speechbrain",
    252     run_opts={"device": self.device},
    253     use_auth_token=self.use_auth_token,
    254     revision=self.revision,
    255 )

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\speechbrain\pretrained\interfaces.py:367, in Pretrained.from_hparams(cls, source, hparams_file, pymodule_file, overrides, savedir, use_auth_token, revision, download_only, **kwargs)
    365     clsname = cls.__name__
    366     savedir = f"./pretrained_models/{clsname}-{hashlib.md5(source.encode('UTF-8', errors='replace')).hexdigest()}"
--> 367 hparams_local_path = fetch(
    368     filename=hparams_file,
    369     source=source,
    370     savedir=savedir,
    371     overwrite=False,
    372     save_filename=None,
    373     use_auth_token=use_auth_token,
    374     revision=revision,
    375 )
    376 try:
    377     pymodule_local_path = fetch(
    378         filename=pymodule_file,
    379         source=source,
   (...)
    384         revision=revision,
    385     )

File ~\AppData\Local\Programs\Python\Python311\Lib\site-packages\speechbrain\pretrained\fetching.py:135, in fetch(filename, source, savedir, overwrite, save_filename, use_auth_token, revision)
    133     sourcepath = pathlib.Path(fetched_file).absolute()
    134     _missing_ok_unlink(destination)
--> 135     destination.symlink_to(sourcepath)
    136 return destination

File ~\AppData\Local\Programs\Python\Python311\Lib\pathlib.py:1198, in Path.symlink_to(self, target, target_is_directory)
   1196 if not hasattr(os, "symlink"):
   1197     raise NotImplementedError("os.symlink() not available on this system")
-> 1198 os.symlink(target, self, target_is_directory)

OSError: [WinError 1314] A required privilege is not held by the client: 'C:\\Users\\chenw\\.cache\\huggingface\\hub\\models--speechbrain--spkrec-ecapa-voxceleb\\snapshots\\5c0be3875fda05e81f3c004ed8c7c06be308de1e\\hyperparams.yaml' -> 'C:\\Users\\chenw\\.cache\\torch\\pyannote\\speechbrain\\hyperparams.yaml'

sauces88 commented 1 year ago

+1

jonaOp commented 1 year ago

But there is an OSError. What should I do?

Try running with administrator privileges. That fixed the issue for me.
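
For context on why elevation helps: the traceback ends in `os.symlink`, and on Windows creating a symbolic link normally requires a privilege that standard user sessions do not hold (elevated sessions do, and enabling Developer Mode on recent Windows versions is generally reported to lift the restriction as well). A minimal sketch for checking this before loading the pipeline is below. The function name `can_create_symlinks` is made up for illustration and is not part of whisperX, pyannote, or speechbrain.

```python
import tempfile
from pathlib import Path


def can_create_symlinks() -> bool:
    """Hypothetical probe: return True if this process may create symlinks.

    It attempts to create a throwaway symlink inside a temporary directory
    and reports whether the attempt succeeded.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "target.txt"
        link = Path(tmp) / "link.txt"
        target.write_text("probe")
        try:
            link.symlink_to(target)
        except (OSError, NotImplementedError):
            # On Windows without the symlink privilege this is WinError 1314.
            return False
        return link.is_symlink()


if not can_create_symlinks():
    print("Symlinks are not permitted here; run the session elevated.")
```

If the probe returns False, starting the Python or Jupyter session from an elevated (administrator) prompt is the fix reported in this thread.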

pwrtux commented 1 year ago

Got the same error; @jonaOp's solution works.
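
For anyone who cannot run elevated: the failing step in the traceback is `destination.symlink_to(sourcepath)`, which links the cached file into the save directory. The general workaround pattern is to fall back to copying the file when linking is not permitted. The sketch below only illustrates that pattern; `link_or_copy` is a hypothetical helper, not a speechbrain function, and whether a given speechbrain version offers an equivalent option is not confirmed in this thread.

```python
import shutil
from pathlib import Path


def link_or_copy(source: Path, destination: Path) -> str:
    """Hypothetical helper: symlink destination to source, else copy.

    Returns "symlink" or "copy" to indicate which strategy was used.
    """
    # Remove a stale file or (possibly broken) link at the destination.
    if destination.is_symlink() or destination.exists():
        destination.unlink()
    try:
        destination.symlink_to(source.absolute())
        return "symlink"
    except (OSError, NotImplementedError):
        # Typical on Windows without the symlink privilege (WinError 1314).
        shutil.copy2(source, destination)
        return "copy"
```

A copy costs extra disk space compared with a link, but the files involved here (such as `hyperparams.yaml`) are small.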