MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
3.43k stars, 288 forks

ContextualVersionConflict / Could not find the operator torchvision::nms #198

Closed by derflotzi 3 months ago

derflotzi commented 3 months ago

I installed the project on Colab using the following commands:

!git clone https://github.com/MahmoudAshraf97/whisper-diarization.git
!sudo apt update && sudo apt install cython3
!sudo apt update && sudo apt install ffmpeg
!pip install -r whisper-diarization/requirements.txt

When I execute diarization, the following error occurs:

2024-06-25 09:58:00.645130: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-06-25 09:58:00.645539: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-06-25 09:58:00.647644: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-06-25 09:58:02.502296: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py", line 132, in _check_requirement
    pkg_resources.require(self.requirement)
  File "/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py", line 966, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py", line 827, in resolve
    dist = self._resolve_dist(
  File "/usr/local/lib/python3.10/dist-packages/pkg_resources/__init__.py", line 873, in _resolve_dist
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (torch 2.1.2 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('torch==2.3.0'), {'torchvision'})

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/content/whisper-diarization/diarize.py", line 3, in <module>
    from helpers import (
  File "/content/whisper-diarization/helpers.py", line 7, in <module>
    from whisperx.alignment import DEFAULT_ALIGN_MODELS_HF, DEFAULT_ALIGN_MODELS_TORCH
  File "/usr/local/lib/python3.10/dist-packages/whisperx/__init__.py", line 1, in <module>
    from .transcribe import load_model
  File "/usr/local/lib/python3.10/dist-packages/whisperx/transcribe.py", line 10, in <module>
    from .asr import load_model
  File "/usr/local/lib/python3.10/dist-packages/whisperx/asr.py", line 13, in <module>
    from .vad import load_vad_model, merge_chunks
  File "/usr/local/lib/python3.10/dist-packages/whisperx/vad.py", line 9, in <module>
    from pyannote.audio import Model
  File "/usr/local/lib/python3.10/dist-packages/pyannote/audio/__init__.py", line 29, in <module>
    from .core.inference import Inference
  File "/usr/local/lib/python3.10/dist-packages/pyannote/audio/core/inference.py", line 34, in <module>
    from pytorch_lightning.utilities.memory import is_oom_error
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py", line 26, in <module>
    from pytorch_lightning.callbacks import Callback  # noqa: E402
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py", line 14, in <module>
    from pytorch_lightning.callbacks.batch_size_finder import BatchSizeFinder
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/batch_size_finder.py", line 24, in <module>
    from pytorch_lightning.callbacks.callback import Callback
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/callback.py", line 22, in <module>
    from pytorch_lightning.utilities.types import STEP_OUTPUT
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/types.py", line 25, in <module>
    from torchmetrics import Metric
  File "/usr/local/lib/python3.10/dist-packages/torchmetrics/__init__.py", line 26, in <module>
    from torchmetrics import functional  # noqa: E402
  File "/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/__init__.py", line 50, in <module>
    from torchmetrics.functional.detection._deprecated import _panoptic_quality as panoptic_quality
  File "/usr/local/lib/python3.10/dist-packages/torchmetrics/functional/detection/__init__.py", line 24, in <module>
    if _TORCHVISION_AVAILABLE and _TORCHVISION_GREATER_EQUAL_0_8:
  File "/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py", line 164, in __bool__
    self._check_available()
  File "/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py", line 158, in _check_available
    self._check_requirement()
  File "/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py", line 142, in _check_requirement
    self.available = module_available(module)
  File "/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py", line 61, in module_available
    importlib.import_module(module_path)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/usr/local/lib/python3.10/dist-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/usr/local/lib/python3.10/dist-packages/torch/_custom_ops.py", line 253, in inner
    custom_op = _find_custom_op(qualname, also_check_torch_library=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/_custom_op/impl.py", line 1076, in _find_custom_op
    overload = get_op(qualname)
  File "/usr/local/lib/python3.10/dist-packages/torch/_custom_op/impl.py", line 1062, in get_op
    error_not_found()
  File "/usr/local/lib/python3.10/dist-packages/torch/_custom_op/impl.py", line 1052, in error_not_found
    raise ValueError(
ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library.

Any ideas how to fix this?

MahmoudAshraf97 commented 3 months ago

I just tried your code in a new notebook, but I couldn't reproduce the error.

derflotzi commented 3 months ago

This is how I start execution (on a T4 GPU):

!python whisper-diarization/diarize.py --whisper-model large-v3 --no-stem -a procontra.mp3

MahmoudAshraf97 commented 3 months ago

The error isn't related to this repo. Make sure that you are using compatible versions of these packages: torch, torchaudio, torchvision, and lightning.
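One way to see which versions are actually installed in the notebook is to query the package metadata. The following is a minimal sketch, not part of this repo; the helper name `installed_versions` is hypothetical, and the `pytorch-lightning` distribution name is an assumption based on the `pytorch_lightning` imports in the traceback.

```python
from importlib import metadata

# Distribution names from the list above. "pytorch-lightning" is an added
# assumption, because the traceback imports the pytorch_lightning module.
PACKAGES = ["torch", "torchaudio", "torchvision", "lightning", "pytorch-lightning"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Comparing the printed torch, torchaudio, and torchvision versions against each other should show whether one of them was built for a different torch release.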

WalidHadri-Iron commented 3 months ago

@derflotzi I had the same issue and fixed it by reinstalling torch, torchvision, and torchaudio at the end, after the other requirements. Check here for compatibility with your CUDA version. On Colab, I think this will do:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
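The ContextualVersionConflict in the original traceback reports torch 2.1.2 installed while torchvision requires torch==2.3.0. A rough way to confirm that a reinstall removed this mismatch is to compare the torch pin declared in torchvision's metadata with the installed torch version. This is only a sketch: the helper names `pinned_version` and `torch_pin_mismatch` are hypothetical, and it assumes torchvision declares an exact `==` pin on torch, as the traceback suggests.

```python
import re
from importlib import metadata


def pinned_version(requirements, name):
    """Return the version `name` is pinned to with '==', or None if not pinned."""
    pattern = re.compile(
        rf"^{re.escape(name)}\s*\(?==\s*([^\s;,)]+)", re.IGNORECASE
    )
    for req in requirements or []:
        match = pattern.match(req.strip())
        if match:
            return match.group(1)
    return None


def torch_pin_mismatch(requirements, installed_torch):
    """Return (installed, required) if the torch pin is violated, else None."""
    pin = pinned_version(requirements, "torch")
    if pin is None:
        return None
    # Ignore a local build suffix such as "+cu121" when comparing.
    if installed_torch.split("+")[0] == pin:
        return None
    return (installed_torch, pin)


if __name__ == "__main__":
    try:
        reqs = metadata.requires("torchvision")
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError as exc:
        print(f"package not installed: {exc}")
    else:
        mismatch = torch_pin_mismatch(reqs, torch_version)
        if mismatch:
            print(f"torch {mismatch[0]} installed, torchvision requires {mismatch[1]}")
        else:
            print("no torch pin mismatch found")
```

In a notebook, restarting the runtime after reinstalling may be needed before the new versions are picked up.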

derflotzi commented 3 months ago

@WalidHadri-Iron Works fine! Thank you for the help!