pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

AttributeError: 'PyanNet' object has no attribute 'example_output' when loading speaker diarization pipeline #1620

Open pweglik opened 5 months ago

pweglik commented 5 months ago

Tested versions

pyannote.audio==3.1.1

System information

Ubuntu 20.04

Issue description

My code:

from pyannote.audio import Pipeline
from pyannote.audio.pipelines import SpeakerDiarization
from pyannote.core import Annotation
import torch

pipeline: SpeakerDiarization = Pipeline.from_pretrained(
  "pyannote/speaker-diarization-3.1",
  use_auth_token="TOKEN").to(torch.device("cuda"))

results in:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[1], line 6
      3 from pyannote.core import Annotation
      4 import torch
----> 6 pipeline: SpeakerDiarization = Pipeline.from_pretrained(
      7   "pyannote/speaker-diarization-3.1",
      8   use_auth_token="TOKEN").to(torch.device("cuda"))

File ~/przemek/miniconda3/envs/meeting_summarizer/lib/python3.11/site-packages/pyannote/audio/core/pipeline.py:137, in Pipeline.from_pretrained(cls, checkpoint_path, hparams_file, use_auth_token, cache_dir)
    135 params.setdefault("use_auth_token", use_auth_token)
    136 print(config)
--> 137 pipeline = Klass(**params)
    139 # freeze  parameters
    140 if "freeze" in config:

File ~/przemek/miniconda3/envs/meeting_summarizer/lib/python3.11/site-packages/pyannote/audio/pipelines/speaker_diarization.py:152, in SpeakerDiarization.__init__(self, segmentation, segmentation_step, embedding, embedding_exclude_overlap, clustering, embedding_batch_size, segmentation_batch_size, der_variant, use_auth_token)
    144 self._segmentation = Inference(
    145     model,
    146     duration=segmentation_duration,
   (...)
    149     batch_size=segmentation_batch_size,
    150 )
    151 print(model)
--> 152 self._frames: SlidingWindow = self._segmentation.model.example_output.frames
    154 if self._segmentation.model.specifications.powerset:
    155     self.segmentation = ParamDict(
    156         min_duration_off=Uniform(0.0, 1.0),
    157     )

File ~/przemek/miniconda3/envs/meeting_summarizer/lib/python3.11/site-packages/torch/nn/modules/module.py:1614, in Module.__getattr__(self, name)
   1612     if name in modules:
   1613         return modules[name]
-> 1614 raise AttributeError("'{}' object has no attribute '{}'".format(
   1615     type(self).__name__, name))

AttributeError: 'PyanNet' object has no attribute 'example_output'

Minimal reproduction example (MRE)

-

pweglik commented 5 months ago

I dug into the implementation and found that the problematic line was the rearrange call in the forward method of PyanNet. It crashed without leaving any trace. When I swapped:

rearrange(outputs, "batch feature frame -> batch frame feature")

to

torch.permute(outputs, (0, 2, 1))

the model loaded correctly. I'm not sure what caused it, but it might be worth looking at. I found a similar issue here: https://github.com/pytorch/pytorch/issues/94598
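The comment above notes that the failure left no trace. One possible explanation, offered as an assumption rather than something confirmed in this thread: if example_output is computed lazily by a property or cached property, and the code it runs raises an AttributeError internally, Python discards that error and falls back to __getattr__, which then reports the attribute itself as missing. The sketch below reproduces that effect with stand-in classes. Module and ToyModel are hypothetical names for illustration and are not pyannote or torch code.

```python
from functools import cached_property


class Module:
    """Hypothetical stand-in for a base class that defines __getattr__.

    Python calls __getattr__ only after normal attribute lookup has
    failed with an AttributeError.
    """

    def __getattr__(self, name):
        raise AttributeError(
            f"'{type(self).__name__}' object has no attribute '{name}'"
        )


class ToyModel(Module):
    """Hypothetical model whose lazily computed attribute fails internally."""

    @cached_property
    def example_output(self):
        # Simulates an unrelated failure somewhere inside the forward pass.
        raise AttributeError("real cause: something broke inside forward()")


def access(model):
    """Return the error message seen by the caller, or the attribute value."""
    try:
        return model.example_output
    except AttributeError as err:
        return str(err)


message = access(ToyModel())
print(message)  # 'ToyModel' object has no attribute 'example_output'
```

The caller only ever sees the generic "no attribute" message, not the error raised inside the property. If this is what happens here, calling the model's forward method directly on a dummy input should surface the underlying exception.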

hbredin commented 5 months ago

Would you mind sharing a link to a Google Colab that one can just click and run to reproduce the issue?

pweglik commented 5 months ago

Sorry, I don't have time right now, and I'm not sure whether Google Colab lets you install your own versions of everything. But for anyone looking, the setup that works for me is:

python 3.10.13
torch==2.0.0+cu117
pyannote.pipeline==3.0.1

The bug was triggered by Python 3.11 and occurred in the einops library, so the bug is on their side. This may be an incentive to use torch.permute instead of einops.rearrange (it would remove an unnecessary dependency).
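As a minimal sketch of the suggested swap, the snippet below shows torch.permute performing the axis reordering that the pattern "batch feature frame -> batch frame feature" describes. The tensor and its sizes are made up for illustration; this is not the actual PyanNet forward code.

```python
import torch

# Dummy tensor laid out as (batch, feature, frame); the sizes are arbitrary.
outputs = torch.arange(2 * 3 * 4).reshape(2, 3, 4)

# Reorder the axes: (batch, feature, frame) -> (batch, frame, feature).
permuted = torch.permute(outputs, (0, 2, 1))

print(permuted.shape)  # torch.Size([2, 4, 3])

# Element [b, t, f] of the result is element [b, f, t] of the input.
assert permuted[1, 3, 2] == outputs[1, 2, 3]

# Swapping the last two axes with transpose gives the same result.
assert torch.equal(permuted, outputs.transpose(1, 2))
```

Like transpose, permute returns a view rather than a copy, so the result is generally non-contiguous. Code downstream that needs a contiguous tensor would have to call .contiguous() on it.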

hbredin commented 5 months ago

Adding cannot_reproduce label because, well, I cannot reproduce it.

KickItLikeShika commented 3 months ago

I'm getting the same error while trying to load the model.

palvinderbhatia commented 2 months ago

I am getting the same issue.

yoesak commented 1 month ago

I also have the same issue

YasharF commented 3 days ago

@hbredin here is how I ran into it on Ubuntu 24.04; I hope it helps with reproducing the issue

from huggingface_hub import HfApi
available_pipelines = [p.modelId for p in HfApi().list_models(filter="pyannote-audio-pipeline")]
list(filter(lambda p: p.startswith("pyannote/"), available_pipelines))
from huggingface_hub import notebook_login
notebook_login()
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token=True)

YasharF commented 3 days ago

Workaround: use Python 3.10.

Adding to my prior comment, the issue is compatibility with newer versions of Python. I was able to get around it by switching to Python 3.10. After the above steps, freeze the dependencies with pip freeze > requirements.txt and then:

Install Python 3.10 in WSL/Ubuntu

 sudo add-apt-repository ppa:deadsnakes/ppa
 sudo apt-get update && sudo apt-get upgrade -y
 sudo apt install python3.10 python3.10-venv

Create a new venv with Python 3.10, install the dependencies, and reopen VS Code

python3.10 -m venv p310venv
source p310venv/bin/activate
pip install -r requirements.txt
code .

Then pick the p310venv as the kernel and rerun the blocks in the notebook.
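To confirm that the notebook is really running on the new interpreter, a quick check like the following can be run in the first cell. It is a sketch using only the standard library. The helper names installed_version and environment_report are made up for illustration, and the package list is an assumption based on the libraries mentioned in this thread.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("pyannote.audio", "torch", "einops")):
    """Collect the interpreter version and the versions of selected packages."""
    report = {"python": ".".join(map(str, sys.version_info[:3]))}
    for package in packages:
        report[package] = installed_version(package)
    return report


report = environment_report()
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

If the python line does not start with 3.10, the notebook is still attached to the old kernel.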