'SpeakerDiarization' object has no attribute 'to'

m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

BSD 2-Clause "Simplified" License

Thanks for the great work here, m-bain and contributors. I have previous versions running successfully, but I cannot get the latest version to run. When I run the example code, I get the error: 'SpeakerDiarization' object has no attribute 'to'. The offending line is:

# 3. Assign speaker labels
diarize_model = whisperx.DiarizationPipeline(use_auth_token=YOUR_HF_TOKEN, device=device)

And the full error is:

Traceback (most recent call last):
  File "/home/.../.../py/whisperX_example.py", line 33, in <module>
    diarize_model = whisperx.DiarizationPipeline(use_auth_token=MY_HF_TOKEN, device=device)
  File "/home/.../.local/lib/python3.10/site-packages/whisperx/diarize.py", line 16, in __init__
    self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)
  File "/home/.../.local/lib/python3.10/site-packages/pyannote/pipeline/pipeline.py", line 100, in __getattr__
    raise AttributeError(msg)
AttributeError: 'SpeakerDiarization' object has no attribute 'to'

...whisperx/diarize.py#L16 ...pyannote/pipeline/pipeline.py#L100

I am using an existing HF_TOKEN that works when used in previous versions.

If relevant, here is part of my pip list output:

pyannote.audio            2.1.1
pyannote.core             4.5
pyannote.database         4.1.3
pyannote.metrics          3.2.1
pyannote.pipeline         2.3
torch                     2.0.0+cu117
torch-audiomentations     0.11.0
torch-pitch-shift         1.2.2
torchaudio                2.0.1+cu117
torchmetrics              0.11.0
torchvision               0.15.1+cu117
whisperx                  3.1.1
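For anyone comparing environments, the same version information can be collected from Python rather than copied out of pip list. This is a minimal sketch using only the standard library; the package names are taken from the list above, and `installed_versions` is a hypothetical helper name, not part of whisperX or pyannote.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the pip list above (an illustrative subset).
PACKAGES = ["pyannote.audio", "pyannote.pipeline", "torch", "torchaudio", "whisperx"]


def installed_versions(names):
    """Return a dict mapping each distribution name to its version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            result[name] = None
    return result


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name:<20} {ver or 'not installed'}")
```

Pasting this output into a report makes it easier to see which pyannote.audio and whisperx versions were paired when the error occurred.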

Any guidance would be most appreciated. Has anyone else experienced this?

Update:

If I update my local ...whisperx/diarize.py#L16 from:

self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)

to:

self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token)

the script and diarization run all the way through.
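A less invasive variant of this workaround would be to call `.to(device)` only when the loaded pipeline actually provides it. The sketch below illustrates the idea with stand-in classes so it runs without pyannote installed. `move_to_device`, `LegacyPipeline`, and `MovablePipeline` are hypothetical names, and the assumption that some pyannote.audio versions expose a `to` method while 2.1.1 does not is inferred from the traceback above, not confirmed here.

```python
def move_to_device(pipeline, device):
    """Call pipeline.to(device) only if the pipeline exposes a callable `to`.

    getattr with a default swallows the AttributeError raised by a custom
    __getattr__, so pipelines without `to` are returned unchanged.
    """
    to = getattr(pipeline, "to", None)
    if callable(to):
        moved = to(device)
        # Some `to` implementations return self, others return None.
        return moved if moved is not None else pipeline
    return pipeline


class LegacyPipeline:
    """Stand-in for a pipeline whose __getattr__ rejects unknown attributes."""

    def __getattr__(self, name):
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )


class MovablePipeline:
    """Stand-in for a pipeline that supports .to(device)."""

    def __init__(self):
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self


if __name__ == "__main__":
    legacy = move_to_device(LegacyPipeline(), "cuda")    # no error, unchanged
    movable = move_to_device(MovablePipeline(), "cuda")  # moved
    print(type(legacy).__name__, movable.device)
```

In diarize.py this would amount to wrapping the result of `Pipeline.from_pretrained(...)` in such a helper instead of chaining `.to(device)` directly. The sketch does not address which device diarization ends up running on when `to` is unavailable.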

The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Ali\.cache\torch\whisperx-vad-segmentation.bin`
>>Performing transcription...
Warning: audio is shorter than 30s, language detection may be inaccurate.
Detected language: en (1.00) in first 30s of audio...
>>Performing alignment...
>>Performing diarization...
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Ali\.cache\torch\pyannote\models--pyannote--segmentation\snapshots\c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b\pytorch_model.bin`
Traceback (most recent call last):
  File "C:\Users\Ali\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Ali\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "H:\ai-voice-cloning\venv\Scripts\whisperx.exe\__main__.py", line 7, in <module>
  File "H:\ai-voice-cloning\venv\lib\site-packages\whisperx\transcribe.py", line 203, in cli
    diarize_model = DiarizationPipeline(use_auth_token=hf_token, device=device)
  File "H:\ai-voice-cloning\venv\lib\site-packages\whisperx\diarize.py", line 16, in __init__
    self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)
  File "H:\ai-voice-cloning\venv\lib\site-packages\pyannote\pipeline\pipeline.py", line 100, in __getattr__
    raise AttributeError(msg)
AttributeError: 'SpeakerDiarization' object has no attribute 'to'

'SpeakerDiarization' object has no attribute 'to' #312