MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
2.44k stars 238 forks

TypeError: unsupported operand type(s) for |: 'type' and 'type' #193

Closed franchukpetro closed 1 month ago

franchukpetro commented 1 month ago

Hi,

It seems I've installed all dependencies successfully, but simply running python diarize.py -a audio_file.mp3 gives me the following error:

/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
torchvision is not available - cannot save figures
Traceback (most recent call last):
  File "/home/petro/DEV/asr_testing/whisper-diarization/diarize.py", line 3, in <module>
    from helpers import (
  File "/home/petro/DEV/asr_testing/whisper-diarization/helpers.py", line 7, in <module>
    from whisperx.alignment import DEFAULT_ALIGN_MODELS_HF, DEFAULT_ALIGN_MODELS_TORCH
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/whisperx/__init__.py", line 1, in <module>
    from .transcribe import load_model
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/whisperx/transcribe.py", line 10, in <module>
    from .asr import load_model
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/whisperx/asr.py", line 13, in <module>
    from .vad import load_vad_model, merge_chunks
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/whisperx/vad.py", line 11, in <module>
    from pyannote.audio.pipelines import VoiceActivityDetection
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/pyannote/audio/pipelines/__init__.py", line 26, in <module>
    from .speaker_diarization import SpeakerDiarization
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/pyannote/audio/pipelines/speaker_diarization.py", line 42, in <module>
    from pyannote.audio.pipelines.speaker_verification import PretrainedSpeakerEmbedding
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/pyannote/audio/pipelines/speaker_verification.py", line 56, in <module>
    from nemo.collections.asr.models import (
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/__init__.py", line 15, in <module>
    from nemo.collections.asr import data, losses, models, modules
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/losses/__init__.py", line 16, in <module>
    from nemo.collections.asr.losses.audio_losses import SDRLoss
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/losses/audio_losses.py", line 21, in <module>
    from nemo.collections.asr.parts.preprocessing.features import make_seq_mask_like
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/parts/preprocessing/__init__.py", line 16, in <module>
    from nemo.collections.asr.parts.preprocessing.features import FeaturizerFactory, FilterbankFeatures, WaveformFeaturizer
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/parts/preprocessing/features.py", line 44, in <module>
    from nemo.collections.asr.parts.preprocessing.perturb import AudioAugmentor
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/asr/parts/preprocessing/perturb.py", line 50, in <module>
    from nemo.collections.common.parts.preprocessing import collections, parsers
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/common/__init__.py", line 16, in <module>
    from nemo.collections.common import data, losses, parts, tokenizers
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/common/tokenizers/__init__.py", line 17, in <module>
    from nemo.collections.common.tokenizers.canary_tokenizer import CanaryTokenizer
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/common/tokenizers/canary_tokenizer.py", line 51, in <module>
    class CanaryTokenizer(AggregateTokenizer):
  File "/home/petro/miniforge3/envs/whisper_diarization_venv/lib/python3.9/site-packages/nemo/collections/common/tokenizers/canary_tokenizer.py", line 105, in CanaryTokenizer
    def build_special_tokenizer(output_dir: str | Path) -> SentencePieceTokenizer:
TypeError: unsupported operand type(s) for |: 'type' and 'type'

I'm not sure whether this is caused by my installation process, since I changed it slightly to work with my current CUDA setup:

conda create -n whisper_diarization_venv python=3.9
conda activate whisper_diarization_venv
conda install pytorch==2.1.2
conda install conda-forge::cython
pip install torchaudio==2.1.2
pip install setuptools==69.5.1
pip install -r requirements.txt

The audio itself is a ~1h conversation between people in German.

I'd be grateful for any help here!

MahmoudAshraf97 commented 1 month ago

First, make sure you have the most up-to-date version of the repo. If the problem still persists, upload an audio file to reproduce the problem.

franchukpetro commented 1 month ago

I cloned the repo just a few hours ago and don't see any new commits since then, so I consider it up to date.

I fixed that specific issue: the cause seems to have been Python 3.9, and switching to 3.10 resolved it. However, I'm now getting a new one: ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library.

I'm not sure why torchvision is even involved here, but it might be a bad torch installation on my side. I'm working on that and will let you know if I come up with a working installation procedure.
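
For context on the 3.9 versus 3.10 difference: the `str | Path` annotation shown in the traceback is evaluated when the function is defined, and the `|` operator between types only became available in Python 3.10 (PEP 604). The following is a minimal sketch of that difference, not NeMo's actual code; `supports_pep604` and `build_dir` are hypothetical names used only for illustration.

```python
import sys
from pathlib import Path
from typing import Union


def supports_pep604() -> bool:
    """Return True if `str | Path` can be evaluated at runtime."""
    try:
        str | Path  # raises TypeError on Python < 3.10
        return True
    except TypeError:
        return False


# Hypothetical stand-in for an annotated function.
# Union[str, Path] is accepted on Python 3.9 as well as on 3.10+.
def build_dir(output_dir: Union[str, Path]) -> Path:
    return Path(output_dir)


if __name__ == "__main__":
    print(sys.version_info[:2], "PEP 604 unions supported:", supports_pep604())
```

On an interpreter older than 3.10 the check reports False, which matches the TypeError in the original report; upgrading the interpreter, as done here, avoids the problem without touching the installed package.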

franchukpetro commented 1 month ago

Okay, I've tried several different combinations: conda plus pip, pure pip, installing a specific torch version, etc.

Right now I'm just creating a conda env and running the following:

pip install cython torch
pip install -r requirements.txt

pip install -r requirements.txt will likely end with some errors about incompatible package versions, which I more or less fix with pip install setuptools -U --force-reinstall. But setuptools 70.0.0 results in errors regarding the packaging package, so I run pip install setuptools==69.5.1, which resolves that, and the requirements finally seem to install.

However, I'm constantly getting the following error:

  File "/home/petro/.local/lib/python3.10/site-packages/inflect/__init__.py", line 80, in <module>
    from typeguard import typechecked
ImportError: cannot import name 'typechecked' from 'typeguard' (unknown location)
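
Note that the path in this traceback points at /home/petro/.local/lib/python3.10/site-packages, which is the per-user site-packages directory rather than the conda environment, so packages installed outside the environment may be getting picked up. A rough sketch for checking where a module would be imported from is shown below; `module_location` and `inside_active_env` are hypothetical helper names, and the module list in the demo is just the two packages named in the error.

```python
import importlib.util
import sys
from pathlib import Path
from typing import Optional


def module_location(name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        # Not installed, or a namespace package with no single origin file.
        return None
    return Path(spec.origin)


def inside_active_env(path: Path) -> bool:
    """Return True if `path` lies under the active environment (sys.prefix)."""
    prefix = Path(sys.prefix).resolve()
    return prefix in path.resolve().parents


if __name__ == "__main__":
    for name in ["inflect", "typeguard"]:
        location = module_location(name)
        if location is None:
            print(f"{name}: no importable file found")
        else:
            print(f"{name}: {location} (in active env: {inside_active_env(location)})")
```

If a module resolves to a location outside the active environment, that may explain version conflicts that persist after reinstalling inside the environment.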

franchukpetro commented 1 month ago

It seems I've found a setup that works for me. To install the dependencies, I run the following steps in the conda env:

pip install cython torch
pip install -r requirements.txt

If there are issues with incompatible package versions, I run pip install setuptools -U --force-reinstall. That sometimes causes issues with the packaging package, which I fix with pip install setuptools==69.5.1. Basically, I alternate between reinstalling setuptools and running pip install -r requirements.txt until there are no more errors.

Finally, to fix the typeguard issue, I downgrade it: pip install typeguard==2.13.3.
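
To see at a glance which versions ended up installed after steps like these, something along the following lines may help. It is only a sketch: the helper name `installed_versions` is made up for illustration, and the package list simply collects the names mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package names taken from this thread; adjust as needed.
    names = ["torch", "torchaudio", "torchvision", "setuptools", "typeguard", "inflect"]
    for name, ver in installed_versions(names).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output before and after each reinstall makes it easier to tell which step actually changed a version.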

@MahmoudAshraf97 thanks for responding, hopefully there won't be many more issues. Great repo, thanks!