huggingface / speechbox

Apache License 2.0

AttributeError: 'Annotation' object has no attribute 'for_json' #36

Closed alvynabranches closed 4 months ago

alvynabranches commented 6 months ago

Tested versions

Library Version
Python 3.12.2
Pyannote.audio 3.1.1
Pyannote.core 5.0.0
speechbox 0.2.1

System information

macOS 14.1 (23B2073) - M3 Max

Issue description

Code

import os

from transformers import pipeline
from pyannote.audio import Pipeline
from speechbox import ASRDiarizationPipeline as ASRDP

# Speaker diarization model; the Hugging Face access token is read from HUGGINGFACE_TOKEN
diarization_pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", use_auth_token=os.environ["HUGGINGFACE_TOKEN"])
# Whisper speech recognition model
asr_pipeline = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")
# Combine transcription and diarization
pipe = ASRDP(asr_pipeline=asr_pipeline, diarization_pipeline=diarization_pipeline)
output = pipe("audio.mp3")  # raises the AttributeError shown under "Error"

Error

/opt/homebrew/lib/python3.12/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
/opt/homebrew/lib/python3.12/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[8], line 2
      1 with ProgressHook() as hook:
----> 2     output = pipe("audio.mp3", hook=hook)

File /opt/homebrew/lib/python3.12/site-packages/speechbox/diarize.py:90, in ASRDiarizationPipeline.__call__(self, inputs, group_by_speaker, **kwargs)
     83 inputs, diarizer_inputs = self.preprocess(inputs)
     85 diarization = self.diarization_pipeline(
     86     {"waveform": diarizer_inputs, "sample_rate": self.sampling_rate},
     87     **kwargs,
     88 )
---> 90 segments = diarization.for_json()["content"]
     92 # diarizer output may contain consecutive segments from the same speaker (e.g. {(0 -> 1, speaker_1), (1 -> 1.5, speaker_1), ...})
     93 # we combine these segments to give overall timestamps for each speaker's turn (e.g. {(0 -> 1.5, speaker_1), ...})
     94 new_segments = []

AttributeError: 'Annotation' object has no attribute 'for_json'

I tried this in a Jupyter Notebook on my local machine as well as on Google Colab. The error is the same in both.

Minimal reproduction example (MRE)

AttributeError: 'Annotation' object has no attribute 'for_json'

alvynabranches commented 6 months ago

The pyannote.audio maintainers said that this is a speechbox bug.

jsrozner commented 6 months ago

See my comment here: https://github.com/huggingface/speechbox/pull/26 @patrickvonplaten

(manually applying the patch locally from that pull request does fix the problem for me)
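
For anyone patching locally, here is a rough sketch of the kind of change that avoids the missing method. It is not copied from the pull request. It assumes the diarization result still exposes itertracks(yield_label=True), which pyannote.core's Annotation does, and it assumes the old for_json()["content"] list had the segment/track/label shape shown. annotation_to_segments, Segment and FakeAnnotation are hypothetical names; the last two are only stand-ins so the snippet runs without pyannote installed.

```python
from collections import namedtuple


def annotation_to_segments(diarization):
    """Rebuild a for_json()["content"]-style list (assumed shape) from itertracks."""
    return [
        {
            "segment": {"start": segment.start, "end": segment.end},
            "track": track,
            "label": label,
        }
        for segment, track, label in diarization.itertracks(yield_label=True)
    ]


# Hypothetical stand-ins so the sketch runs without pyannote installed.
Segment = namedtuple("Segment", ["start", "end"])


class FakeAnnotation:
    def __init__(self, tracks):
        self._tracks = tracks

    def itertracks(self, yield_label=False):
        for segment, track, label in self._tracks:
            yield (segment, track, label) if yield_label else (segment, track)


fake = FakeAnnotation([
    (Segment(0.0, 1.0), "A", "SPEAKER_00"),
    (Segment(1.0, 1.5), "B", "SPEAKER_00"),
])
segments = annotation_to_segments(fake)
print(segments)
```

In diarize.py, the idea would be to replace the diarization.for_json()["content"] call at line 90 with something like annotation_to_segments(diarization), leaving the speaker-merging code after it unchanged.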

alvynabranches commented 4 months ago

When will this issue be solved? @patrickvonplaten @jsrozner

alvynabranches commented 4 months ago

Looks like the developers for this project are dead. And updates will be only done after the funeral of the developers. 😂😂😂

jsrozner commented 4 months ago

Use this: pip install git+https://github.com/huggingface/speechbox.git (run pip uninstall speechbox first if you already have it installed).

You might be able to open a pull request for a new release so that a plain pip install speechbox works too.
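
If you are not sure which copy of speechbox ended up installed, one possible check is to look at the installed diarize.py for the old call. This is only a sketch: source_mentions is a hypothetical helper, and it assumes the package is installed as ordinary files on disk. It locates the package without importing it, so it does not load any models.

```python
from importlib.util import find_spec
from pathlib import Path


def source_mentions(package, relative_file, needle):
    """Return True/False if the installed file contains `needle`, or None if it is not found."""
    spec = find_spec(package)  # locates a top-level package without importing it
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        path = Path(location) / relative_file
        if path.is_file():
            return needle in path.read_text(encoding="utf-8")
    return None


# diarize.py is the file named in the traceback from the original report.
print(source_mentions("speechbox", "diarize.py", "for_json"))
```

True would suggest the old for_json call is still present in the installed copy, False that it is gone, and None that speechbox (or that file) could not be found.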

alvynabranches commented 4 months ago

Solved.