m-bain / whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
BSD 2-Clause "Simplified" License

'SpeakerDiarization' object has no attribute 'to' #312

Open patchworkfish opened 1 year ago

patchworkfish commented 1 year ago

Thanks for the great work here, m-bain and contributors. I have previous versions running successfully, but I cannot get the latest version to run. When I run the example code, I get the error: 'SpeakerDiarization' object has no attribute 'to'. The offending line is:

# 3. Assign speaker labels
diarize_model = whisperx.DiarizationPipeline(use_auth_token=YOUR_HF_TOKEN, device=device)

And the full error is:

Traceback (most recent call last):
  File "/home/.../.../py/whisperX_example.py", line 33, in <module>
    diarize_model = whisperx.DiarizationPipeline(use_auth_token=MY_HF_TOKEN, device=device)
  File "/home/.../.local/lib/python3.10/site-packages/whisperx/diarize.py", line 16, in __init__
    self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)
  File "/home/.../.local/lib/python3.10/site-packages/pyannote/pipeline/pipeline.py", line 100, in __getattr__
    raise AttributeError(msg)
AttributeError: 'SpeakerDiarization' object has no attribute 'to'

...whisperx/diarize.py#L16 ...pyannote/pipeline/pipeline.py#L100

I am using an existing HF_TOKEN that works when used in previous versions.

In case it is relevant, here is part of my pip list output:

pyannote.audio            2.1.1
pyannote.core             4.5
pyannote.database         4.1.3
pyannote.metrics          3.2.1
pyannote.pipeline         2.3
torch                     2.0.0+cu117
torch-audiomentations     0.11.0
torch-pitch-shift         1.2.2
torchaudio                2.0.1+cu117
torchmetrics              0.11.0
torchvision               0.15.1+cu117
whisperx                  3.1.1
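For anyone comparing environments, a list like the one above can also be collected from Python. This is only a rough sketch using the standard library's importlib.metadata; the helper name installed_versions is made up for illustration, and the package names are the ones from the list above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is not installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not present in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from the pip list excerpt in this issue.
    packages = ["whisperx", "pyannote.audio", "pyannote.pipeline", "torch", "torchaudio"]
    for name, ver in installed_versions(packages).items():
        print(f"{name:<20} {ver or 'not installed'}")
```

Running this inside the same virtual environment as whisperx should show whether the pyannote.audio version matches the one listed in this report.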

Any guidance would be most appreciated. Has anyone else experienced this?

Update:

If I change my local ...whisperx/diarize.py#L16 from:

self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)

to:

self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token)

the script and diarization run all the way through.

rikabi89 commented 1 year ago

Unfortunately, your fix didn't make a difference for me.


The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.
No language specified, language will be first be detected for each audio file (increases inference time).
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Ali\.cache\torch\whisperx-vad-segmentation.bin`
>>Performing transcription...
Warning: audio is shorter than 30s, language detection may be inaccurate.
Detected language: en (1.00) in first 30s of audio...
>>Performing alignment...
>>Performing diarization...
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v1.9.5. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file C:\Users\Ali\.cache\torch\pyannote\models--pyannote--segmentation\snapshots\c4c8ceafcbb3a7a280c2d357aee9fbc9b0be7f9b\pytorch_model.bin`
Traceback (most recent call last):
  File "C:\Users\Ali\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Ali\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "H:\ai-voice-cloning\venv\Scripts\whisperx.exe\__main__.py", line 7, in <module>
  File "H:\ai-voice-cloning\venv\lib\site-packages\whisperx\transcribe.py", line 203, in cli
    diarize_model = DiarizationPipeline(use_auth_token=hf_token, device=device)
  File "H:\ai-voice-cloning\venv\lib\site-packages\whisperx\diarize.py", line 16, in __init__
    self.model = Pipeline.from_pretrained(model_name, use_auth_token=use_auth_token).to(device)
  File "H:\ai-voice-cloning\venv\lib\site-packages\pyannote\pipeline\pipeline.py", line 100, in __getattr__
    raise AttributeError(msg)
AttributeError: 'SpeakerDiarization' object has no attribute 'to'