NavodPeiris / speechlib

speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names
MIT License

Torch version conflict. #22

Closed gety9 closed 6 months ago

gety9 commented 6 months ago

I use speechlib on Google Colab. It was working fine when I set it up on Jan 22, but now when I try to reuse the same code, it fails:

!apt install libcublas11
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libcublaslt11
The following NEW packages will be installed:
  libcublas11 libcublaslt11
0 upgraded, 2 newly installed, 0 to remove and 39 not upgraded.
Need to get 226 MB of archives.
After this operation, 498 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy/multiverse amd64 libcublaslt11 amd64 11.7.4.6~11.5.1-1ubuntu1 [148 MB]
Get:2 http://archive.ubuntu.com/ubuntu jammy/multiverse amd64 libcublas11 amd64 11.7.4.6~11.5.1-1ubuntu1 [78.2 MB]
Fetched 226 MB in 10s (22.1 MB/s)
Selecting previously unselected package libcublaslt11:amd64.
(Reading database ... 121753 files and directories currently installed.)
Preparing to unpack .../libcublaslt11_11.7.4.6~11.5.1-1ubuntu1_amd64.deb ...
Unpacking libcublaslt11:amd64 (11.7.4.6~11.5.1-1ubuntu1) ...
Selecting previously unselected package libcublas11:amd64.
Preparing to unpack .../libcublas11_11.7.4.6~11.5.1-1ubuntu1_amd64.deb ...
Unpacking libcublas11:amd64 (11.7.4.6~11.5.1-1ubuntu1) ...
Setting up libcublaslt11:amd64 (11.7.4.6~11.5.1-1ubuntu1) ...
Setting up libcublas11:amd64 (11.7.4.6~11.5.1-1ubuntu1) ...
Processing triggers for libc-bin (2.35-0ubuntu3.4) ...
/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_5.so.3 is not a symbolic link

/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc_proxy.so.2 is not a symbolic link

/sbin/ldconfig.real: /usr/local/lib/libtbbmalloc.so.2 is not a symbolic link

/sbin/ldconfig.real: /usr/local/lib/libtbbbind.so.3 is not a symbolic link

/sbin/ldconfig.real: /usr/local/lib/libtbb.so.12 is not a symbolic link

/sbin/ldconfig.real: /usr/local/lib/libtbbbind_2_0.so.3 is not a symbolic link
!pip install speechlib
!pip install pathlib
!pip install pytube
Collecting speechlib
  Downloading speechlib-1.0.13-py3-none-any.whl (13 kB)
Collecting transformers==4.36.2 (from speechlib)
  Downloading transformers-4.36.2-py3-none-any.whl (8.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.2/8.2 MB 3.8 MB/s eta 0:00:00
Collecting torch==2.1.2 (from speechlib)
  Downloading torch-2.1.2-cp310-cp310-manylinux1_x86_64.whl (670.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 670.2/670.2 MB 700.3 kB/s eta 0:00:00
Collecting torchaudio==2.1.2 (from speechlib)
  Downloading torchaudio-2.1.2-cp310-cp310-manylinux1_x86_64.whl (3.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.3/3.3 MB 1.8 MB/s eta 0:00:00
Collecting pydub==0.25.1 (from speechlib)
  Downloading pydub-0.25.1-py2.py3-none-any.whl (32 kB)
Collecting pyannote.audio==3.1.1 (from speechlib)
  Downloading pyannote.audio-3.1.1-py2.py3-none-any.whl (208 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 208.7/208.7 kB 1.8 MB/s eta 0:00:00
Collecting speechbrain==0.5.16 (from speechlib)
  Downloading speechbrain-0.5.16-py3-none-any.whl (630 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 630.6/630.6 kB 1.9 MB/s eta 0:00:00
Collecting accelerate==0.26.1 (from speechlib)
  Downloading accelerate-0.26.1-py3-none-any.whl (270 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 270.9/270.9 kB 2.1 MB/s eta 0:00:00
Collecting faster-whisper==0.10.0 (from speechlib)
  Downloading faster_whisper-0.10.0-py3-none-any.whl (1.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 2.1 MB/s eta 0:00:00
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (1.25.2)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (24.0)
Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (5.9.5)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (6.0.1)
Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (0.20.3)
Requirement already satisfied: safetensors>=0.3.1 in /usr/local/lib/python3.10/dist-packages (from accelerate==0.26.1->speechlib) (0.4.2)
Collecting av==11.* (from faster-whisper==0.10.0->speechlib)
  Downloading av-11.0.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (32.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 32.9/32.9 MB 1.6 MB/s eta 0:00:00
Collecting ctranslate2<5,>=4.0 (from faster-whisper==0.10.0->speechlib)
  Downloading ctranslate2-4.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (36.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.7/36.7 MB 2.3 MB/s eta 0:00:00
Requirement already satisfied: tokenizers<0.16,>=0.13 in /usr/local/lib/python3.10/dist-packages (from faster-whisper==0.10.0->speechlib) (0.15.2)
Collecting onnxruntime<2,>=1.14 (from faster-whisper==0.10.0->speechlib)
  Downloading onnxruntime-1.17.1-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.8/6.8 MB 2.7 MB/s eta 0:00:00
Collecting asteroid-filterbanks>=0.4 (from pyannote.audio==3.1.1->speechlib)
  Downloading asteroid_filterbanks-0.4.0-py3-none-any.whl (29 kB)
Collecting einops>=0.6.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading einops-0.7.0-py3-none-any.whl (44 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 kB 2.3 MB/s eta 0:00:00
Collecting lightning>=2.0.1 (from pyannote.audio==3.1.1->speechlib)
  Downloading lightning-2.2.1-py3-none-any.whl (2.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.1/2.1 MB 2.6 MB/s eta 0:00:00
Collecting omegaconf<3.0,>=2.1 (from pyannote.audio==3.1.1->speechlib)
  Downloading omegaconf-2.3.0-py3-none-any.whl (79 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.5/79.5 kB 1.7 MB/s eta 0:00:00
Collecting pyannote.core>=5.0.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading pyannote.core-5.0.0-py3-none-any.whl (58 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.5/58.5 kB 2.1 MB/s eta 0:00:00
Collecting pyannote.database>=5.0.1 (from pyannote.audio==3.1.1->speechlib)
  Downloading pyannote.database-5.0.1-py3-none-any.whl (48 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.1/48.1 kB 2.1 MB/s eta 0:00:00
Collecting pyannote.metrics>=3.2 (from pyannote.audio==3.1.1->speechlib)
  Downloading pyannote.metrics-3.2.1-py3-none-any.whl (51 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.4/51.4 kB 2.1 MB/s eta 0:00:00
Collecting pyannote.pipeline>=3.0.1 (from pyannote.audio==3.1.1->speechlib)
  Downloading pyannote.pipeline-3.0.1-py3-none-any.whl (31 kB)
Collecting pytorch-metric-learning>=2.1.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading pytorch_metric_learning-2.4.1-py3-none-any.whl (118 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 118.6/118.6 kB 2.1 MB/s eta 0:00:00
Requirement already satisfied: rich>=12.0.0 in /usr/local/lib/python3.10/dist-packages (from pyannote.audio==3.1.1->speechlib) (13.7.1)
Collecting semver>=3.0.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading semver-3.0.2-py3-none-any.whl (17 kB)
Requirement already satisfied: soundfile>=0.12.1 in /usr/local/lib/python3.10/dist-packages (from pyannote.audio==3.1.1->speechlib) (0.12.1)
Collecting tensorboardX>=2.6 (from pyannote.audio==3.1.1->speechlib)
  Downloading tensorboardX-2.6.2.2-py2.py3-none-any.whl (101 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 101.7/101.7 kB 2.0 MB/s eta 0:00:00
Collecting torch-audiomentations>=0.11.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading torch_audiomentations-0.11.1-py3-none-any.whl (50 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 50.1/50.1 kB 2.1 MB/s eta 0:00:00
Collecting torchmetrics>=0.11.0 (from pyannote.audio==3.1.1->speechlib)
  Downloading torchmetrics-1.3.2-py3-none-any.whl (841 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 841.5/841.5 kB 2.3 MB/s eta 0:00:00
Collecting hyperpyyaml (from speechbrain==0.5.16->speechlib)
  Downloading HyperPyYAML-1.2.2-py3-none-any.whl (16 kB)
Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from speechbrain==0.5.16->speechlib) (1.3.2)
Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from speechbrain==0.5.16->speechlib) (1.11.4)
Requirement already satisfied: sentencepiece in /usr/local/lib/python3.10/dist-packages (from speechbrain==0.5.16->speechlib) (0.1.99)
Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from speechbrain==0.5.16->speechlib) (4.66.2)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (3.13.3)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (4.10.0)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (3.2.1)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (3.1.3)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch==2.1.2->speechlib) (2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch==2.1.2->speechlib)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 2.0 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch==2.1.2->speechlib)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 kB 2.4 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch==2.1.2->speechlib)
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 2.3 MB/s eta 0:00:00
Collecting nvidia-cudnn-cu12==8.9.2.26 (from torch==2.1.2->speechlib)
  Downloading nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl (731.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 731.7/731.7 MB 889.4 kB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1 (from torch==2.1.2->speechlib)
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 1.1 MB/s eta 0:00:00
Collecting nvidia-cufft-cu12==11.0.2.54 (from torch==2.1.2->speechlib)
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 2.6 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106 (from torch==2.1.2->speechlib)
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 1.9 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107 (from torch==2.1.2->speechlib)
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 2.5 MB/s eta 0:00:00
Collecting nvidia-cusparse-cu12==12.1.0.106 (from torch==2.1.2->speechlib)
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 2.8 MB/s eta 0:00:00
Collecting nvidia-nccl-cu12==2.18.1 (from torch==2.1.2->speechlib)
  Downloading nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl (209.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.8/209.8 MB 2.9 MB/s eta 0:00:00
Collecting nvidia-nvtx-cu12==12.1.105 (from torch==2.1.2->speechlib)
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 kB 5.6 MB/s eta 0:00:00
Collecting triton==2.1.0 (from torch==2.1.2->speechlib)
  Downloading triton-2.1.0-0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (89.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 89.2/89.2 MB 3.4 MB/s eta 0:00:00
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers==4.36.2->speechlib) (2023.12.25)
Requirement already satisfied: requests in /usr/local/lib/python3.10/dist-packages (from transformers==4.36.2->speechlib) (2.31.0)
Collecting nvidia-nvjitlink-cu12 (from nvidia-cusolver-cu12==11.4.5.107->torch==2.1.2->speechlib)
  Downloading nvidia_nvjitlink_cu12-12.4.99-py3-none-manylinux2014_x86_64.whl (21.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 21.1/21.1 MB 2.9 MB/s eta 0:00:00
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from ctranslate2<5,>=4.0->faster-whisper==0.10.0->speechlib) (67.7.2)
Collecting lightning-utilities<2.0,>=0.8.0 (from lightning>=2.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading lightning_utilities-0.11.1-py3-none-any.whl (26 kB)
Collecting pytorch-lightning (from lightning>=2.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading pytorch_lightning-2.2.1-py3-none-any.whl (801 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 801.6/801.6 kB 3.3 MB/s eta 0:00:00
Collecting antlr4-python3-runtime==4.9.* (from omegaconf<3.0,>=2.1->pyannote.audio==3.1.1->speechlib)
  Downloading antlr4-python3-runtime-4.9.3.tar.gz (117 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.0/117.0 kB 3.6 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting coloredlogs (from onnxruntime<2,>=1.14->faster-whisper==0.10.0->speechlib)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 46.0/46.0 kB 2.9 MB/s eta 0:00:00
Requirement already satisfied: flatbuffers in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2,>=1.14->faster-whisper==0.10.0->speechlib) (24.3.7)
Requirement already satisfied: protobuf in /usr/local/lib/python3.10/dist-packages (from onnxruntime<2,>=1.14->faster-whisper==0.10.0->speechlib) (3.20.3)
Requirement already satisfied: sortedcontainers>=2.0.4 in /usr/local/lib/python3.10/dist-packages (from pyannote.core>=5.0.0->pyannote.audio==3.1.1->speechlib) (2.4.0)
Requirement already satisfied: pandas>=0.19 in /usr/local/lib/python3.10/dist-packages (from pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib) (1.5.3)
Requirement already satisfied: typer[all]>=0.2.1 in /usr/local/lib/python3.10/dist-packages (from pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib) (0.9.4)
Requirement already satisfied: scikit-learn>=0.17.1 in /usr/local/lib/python3.10/dist-packages (from pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (1.2.2)
Collecting docopt>=0.6.2 (from pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib)
  Downloading docopt-0.6.2.tar.gz (25 kB)
  Preparing metadata (setup.py) ... done
Requirement already satisfied: tabulate>=0.7.7 in /usr/local/lib/python3.10/dist-packages (from pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (0.9.0)
Requirement already satisfied: matplotlib>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (3.7.1)
Collecting optuna>=3.1 (from pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading optuna-3.6.0-py3-none-any.whl (379 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 379.9/379.9 kB 3.5 MB/s eta 0:00:00
Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich>=12.0.0->pyannote.audio==3.1.1->speechlib) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich>=12.0.0->pyannote.audio==3.1.1->speechlib) (2.16.1)
Requirement already satisfied: cffi>=1.0 in /usr/local/lib/python3.10/dist-packages (from soundfile>=0.12.1->pyannote.audio==3.1.1->speechlib) (1.16.0)
Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch==2.1.2->speechlib) (1.3.0)
Collecting julius<0.3,>=0.2.3 (from torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib)
  Downloading julius-0.2.7.tar.gz (59 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 59.6/59.6 kB 3.3 MB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Requirement already satisfied: librosa>=0.6.0 in /usr/local/lib/python3.10/dist-packages (from torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (0.10.1)
Collecting torch-pitch-shift>=1.2.2 (from torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib)
  Downloading torch_pitch_shift-1.2.4-py3-none-any.whl (4.9 kB)
Collecting ruamel.yaml>=0.17.28 (from hyperpyyaml->speechbrain==0.5.16->speechlib)
  Downloading ruamel.yaml-0.18.6-py3-none-any.whl (117 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 117.8/117.8 kB 3.7 MB/s eta 0:00:00
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch==2.1.2->speechlib) (2.1.5)
Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.36.2->speechlib) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.36.2->speechlib) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.36.2->speechlib) (2.0.7)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests->transformers==4.36.2->speechlib) (2024.2.2)
Requirement already satisfied: pycparser in /usr/local/lib/python3.10/dist-packages (from cffi>=1.0->soundfile>=0.12.1->pyannote.audio==3.1.1->speechlib) (2.21)
Requirement already satisfied: aiohttp!=4.0.0a0,!=4.0.0a1 in /usr/local/lib/python3.10/dist-packages (from fsspec->torch==2.1.2->speechlib) (3.9.3)
Requirement already satisfied: audioread>=2.1.9 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (3.0.1)
Requirement already satisfied: decorator>=4.3.0 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (4.4.2)
Requirement already satisfied: numba>=0.51.0 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (0.58.1)
Requirement already satisfied: pooch>=1.0 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (1.8.1)
Requirement already satisfied: soxr>=0.3.2 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (0.3.7)
Requirement already satisfied: lazy-loader>=0.1 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (0.3)
Requirement already satisfied: msgpack>=1.0 in /usr/local/lib/python3.10/dist-packages (from librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (1.0.8)
Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich>=12.0.0->pyannote.audio==3.1.1->speechlib) (0.1.2)
Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (0.12.1)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (4.50.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (1.4.5)
Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (9.4.0)
Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (3.1.2)
Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (2.8.2)
Collecting alembic>=1.5.0 (from optuna>=3.1->pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading alembic-1.13.1-py3-none-any.whl (233 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.4/233.4 kB 3.6 MB/s eta 0:00:00
Collecting colorlog (from optuna>=3.1->pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading colorlog-6.8.2-py3-none-any.whl (11 kB)
Requirement already satisfied: sqlalchemy>=1.3.0 in /usr/local/lib/python3.10/dist-packages (from optuna>=3.1->pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib) (2.0.29)
Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=0.19->pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib) (2023.4)
Collecting ruamel.yaml.clib>=0.2.7 (from ruamel.yaml>=0.17.28->hyperpyyaml->speechbrain==0.5.16->speechlib)
  Downloading ruamel.yaml.clib-0.2.8-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (526 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 526.7/526.7 kB 3.5 MB/s eta 0:00:00
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=0.17.1->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (3.4.0)
Collecting primePy>=1.3 (from torch-pitch-shift>=1.2.2->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib)
  Downloading primePy-1.3-py3-none-any.whl (4.0 kB)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /usr/local/lib/python3.10/dist-packages (from typer[all]>=0.2.1->pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib) (8.1.7)
Collecting colorama<0.5.0,>=0.4.3 (from typer[all]>=0.2.1->pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading colorama-0.4.6-py2.py3-none-any.whl (25 kB)
Collecting shellingham<2.0.0,>=1.3.0 (from typer[all]>=0.2.1->pyannote.database>=5.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading shellingham-1.5.4-py2.py3-none-any.whl (9.8 kB)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime<2,>=1.14->faster-whisper==0.10.0->speechlib)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 86.8/86.8 kB 3.4 MB/s eta 0:00:00
Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (1.3.1)
Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (23.2.0)
Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (1.4.1)
Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (6.0.5)
Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (1.9.4)
Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp!=4.0.0a0,!=4.0.0a1->fsspec->torch==2.1.2->speechlib) (4.0.3)
Collecting Mako (from alembic>=1.5.0->optuna>=3.1->pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib)
  Downloading Mako-1.3.2-py3-none-any.whl (78 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.7/78.7 kB 3.5 MB/s eta 0:00:00
Requirement already satisfied: llvmlite<0.42,>=0.41.0dev0 in /usr/local/lib/python3.10/dist-packages (from numba>=0.51.0->librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (0.41.1)
Requirement already satisfied: platformdirs>=2.5.0 in /usr/local/lib/python3.10/dist-packages (from pooch>=1.0->librosa>=0.6.0->torch-audiomentations>=0.11.0->pyannote.audio==3.1.1->speechlib) (4.2.0)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib>=2.0.0->pyannote.metrics>=3.2->pyannote.audio==3.1.1->speechlib) (1.16.0)
Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from sqlalchemy>=1.3.0->optuna>=3.1->pyannote.pipeline>=3.0.1->pyannote.audio==3.1.1->speechlib) (3.0.3)
Building wheels for collected packages: antlr4-python3-runtime, docopt, julius
  Building wheel for antlr4-python3-runtime (setup.py) ... done
  Created wheel for antlr4-python3-runtime: filename=antlr4_python3_runtime-4.9.3-py3-none-any.whl size=144554 sha256=1791822f9b8fe2d72328adf46168b4d07ec740a89ea5ea6645cb42ca81d80c45
  Stored in directory: /root/.cache/pip/wheels/12/93/dd/1f6a127edc45659556564c5730f6d4e300888f4bca2d4c5a88
  Building wheel for docopt (setup.py) ... done
  Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13706 sha256=b94f84010b5464b0fa317849eae31e48f3d27747a1613ec717c38bfc8740f963
  Stored in directory: /root/.cache/pip/wheels/fc/ab/d4/5da2067ac95b36618c629a5f93f809425700506f72c9732fac
  Building wheel for julius (setup.py) ... done
  Created wheel for julius: filename=julius-0.2.7-py3-none-any.whl size=21870 sha256=834ab867d5b28e6fc5bfa58ea7ce33c79fdf01a4939614e7f198dd6b56d98252
  Stored in directory: /root/.cache/pip/wheels/b9/b2/05/f883527ffcb7f2ead5438a2c23439aa0c881eaa9a4c80256f4
Successfully built antlr4-python3-runtime docopt julius
Installing collected packages: pydub, primePy, docopt, antlr4-python3-runtime, triton, tensorboardX, shellingham, semver, ruamel.yaml.clib, omegaconf, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, Mako, lightning-utilities, humanfriendly, einops, ctranslate2, colorlog, colorama, av, ruamel.yaml, pyannote.core, nvidia-cusparse-cu12, nvidia-cudnn-cu12, coloredlogs, alembic, optuna, onnxruntime, nvidia-cusolver-cu12, hyperpyyaml, transformers, torch, pyannote.database, faster-whisper, torchmetrics, torchaudio, pytorch-metric-learning, pyannote.pipeline, pyannote.metrics, julius, asteroid-filterbanks, accelerate, torch-pitch-shift, speechbrain, pytorch-lightning, torch-audiomentations, lightning, pyannote.audio, speechlib
  Attempting uninstall: triton
    Found existing installation: triton 2.2.0
    Uninstalling triton-2.2.0:
      Successfully uninstalled triton-2.2.0
  Attempting uninstall: transformers
    Found existing installation: transformers 4.38.2
    Uninstalling transformers-4.38.2:
      Successfully uninstalled transformers-4.38.2
  Attempting uninstall: torch
    Found existing installation: torch 2.2.1+cu121
    Uninstalling torch-2.2.1+cu121:
      Successfully uninstalled torch-2.2.1+cu121
  Attempting uninstall: torchaudio
    Found existing installation: torchaudio 2.2.1+cu121
    Uninstalling torchaudio-2.2.1+cu121:
      Successfully uninstalled torchaudio-2.2.1+cu121
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchtext 0.17.1 requires torch==2.2.1, but you have torch 2.1.2 which is incompatible.
torchvision 0.17.1+cu121 requires torch==2.2.1, but you have torch 2.1.2 which is incompatible.
Successfully installed Mako-1.3.2 accelerate-0.26.1 alembic-1.13.1 antlr4-python3-runtime-4.9.3 asteroid-filterbanks-0.4.0 av-11.0.0 colorama-0.4.6 coloredlogs-15.0.1 colorlog-6.8.2 ctranslate2-4.1.0 docopt-0.6.2 einops-0.7.0 faster-whisper-0.10.0 humanfriendly-10.0 hyperpyyaml-1.2.2 julius-0.2.7 lightning-2.2.1 lightning-utilities-0.11.1 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-8.9.2.26 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.18.1 nvidia-nvjitlink-cu12-12.4.99 nvidia-nvtx-cu12-12.1.105 omegaconf-2.3.0 onnxruntime-1.17.1 optuna-3.6.0 primePy-1.3 pyannote.audio-3.1.1 pyannote.core-5.0.0 pyannote.database-5.0.1 pyannote.metrics-3.2.1 pyannote.pipeline-3.0.1 pydub-0.25.1 pytorch-lightning-2.2.1 pytorch-metric-learning-2.4.1 ruamel.yaml-0.18.6 ruamel.yaml.clib-0.2.8 semver-3.0.2 shellingham-1.5.4 speechbrain-0.5.16 speechlib-1.0.13 tensorboardX-2.6.2.2 torch-2.1.2 torch-audiomentations-0.11.1 torch-pitch-shift-1.2.4 torchaudio-2.1.2 torchmetrics-1.3.2 transformers-4.36.2 triton-2.1.0
WARNING: The following packages were previously imported in this runtime:
  [pydevd_plugins]
You must restart the runtime in order to use newly installed versions.
Requirement already satisfied: pathlib in /usr/local/lib/python3.10/dist-packages (1.0.1)
Collecting pytube
  Downloading pytube-15.0.0-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 980.2 kB/s eta 0:00:00
Installing collected packages: pytube
Successfully installed pytube-15.0.0

gety9 commented 6 months ago

from speechlib import Transcriptor

afl = str(adir) + '/' + stm + '.mp3'
vcd = ''
lng = 'en'
logd = str(GDDIR) + '/' + 'logs'
mdlsz = 'large'
qntzt = False

transcriptor = Transcriptor(afl, logd, lng, mdlsz, vcd, qntzt)

res = transcriptor.transcribe()

---------------------------------------------------------------------------
ContextualVersionConflict                 Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/lightning_utilities/core/imports.py](https://localhost:8080/#) in _check_requirement(self)
    131             # first try the pkg_resources requirement
--> 132             pkg_resources.require(self.requirement)
    133             self.available = True

27 frames
ContextualVersionConflict: (torch 2.1.2 (/usr/local/lib/python3.10/dist-packages), Requirement.parse('torch==2.2.1'), {'torchvision'})

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/torch/_custom_op/impl.py](https://localhost:8080/#) in error_not_found()
   1050 def get_op(qualname):
   1051     def error_not_found():
-> 1052         raise ValueError(
   1053             f"Could not find the operator {qualname}. Please make sure you have "
   1054             f"already registered the operator and (if registered from C++) "

ValueError: Could not find the operator torchvision::nms. Please make sure you have already registered the operator and (if registered from C++) loaded it via torch.ops.load_library.

I tried upgrading torch, but then I also needed to upgrade torchaudio, and that led to yet another error...
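
Both the pip message and the ContextualVersionConflict above come down to exact pins: torchvision 0.17.1+cu121 and torchtext 0.17.1 require torch==2.2.1, while speechlib 1.0.13 installs torch 2.1.2. A minimal sketch of how such a mismatch could be spotted from Python before importing anything follows. The helper names (find_pin_conflicts, scan_environment) are hypothetical and not part of speechlib or pip. The sketch only understands plain "name==version" pins without environment markers, so it is far less complete than pip's own resolver.

```python
import re
from importlib import metadata

# Matches only plain exact pins such as "torch==2.2.1".
# Ranges, wildcards, and requirements with markers are deliberately skipped.
_PIN = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9.+!-]+)\s*$")


def _normalize(name):
    """Normalize a distribution name so 'Torch_Audio' equals 'torch-audio'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def _public(version):
    """Drop a local build suffix such as '+cu121'."""
    return version.split("+", 1)[0]


def find_pin_conflicts(installed, requires):
    """Return (package, requirement, found_version) for every unmet exact pin.

    installed: maps distribution name -> installed version string
    requires:  maps distribution name -> list of requirement strings
    """
    versions = {_normalize(name): ver for name, ver in installed.items()}
    conflicts = []
    for package, requirement_strings in requires.items():
        for text in requirement_strings:
            match = _PIN.match(text)
            if match is None:
                continue  # not a plain exact pin
            name, pinned = match.groups()
            found = versions.get(_normalize(name))
            if found is not None and _public(found) != _public(pinned):
                conflicts.append((package, text, found))
    return conflicts


def scan_environment():
    """Apply the same check to whatever is installed right now."""
    installed, requires = {}, {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name is None:
            continue  # broken or partial installation
        installed[name] = dist.version
        requires[name] = dist.requires or []
    return find_pin_conflicts(installed, requires)
```

Fed with the versions from the log above, find_pin_conflicts reports torchvision and torchtext as the two packages whose torch pin is no longer satisfied, which matches the two lines pip printed.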

gety9 commented 6 months ago

With !pip install speechlib torchvision==0.13.1 (the solution suggested by the Colab bot), I get the following error:

from speechlib import Transcriptor


afl = str(adir) + '/' + stm + '.mp3'
vcd = ''
lng = 'en'
logd = str(GDDIR) + '/' + 'logs'
mdlsz = 'large'
qntzt = False

transcriptor = Transcriptor(afl, logd, lng, mdlsz, vcd, qntzt)

res = transcriptor.transcribe()
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
[<ipython-input-2-bb75dc4f1c63>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from speechlib import Transcriptor
      2 
      3 afl = str(adir) + '/' + stm + '.mp3'
      4 vcd = ''
      5 lng = 'en'

18 frames
[/usr/lib/python3.10/ctypes/__init__.py](https://localhost:8080/#) in __init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    372 
    373         if handle is None:
--> 374             self._handle = _dlopen(self._name, mode)
    375         else:
    376             self._handle = handle

OSError: /usr/local/lib/python3.10/dist-packages/torchtext/lib/libtorchtext.so: undefined symbol: _ZN5torch3jit21setUTF8DecodingIgnoreEb

NavodPeiris commented 6 months ago

The code does not have to change. Colab comes with pre-installed packages, so you have to restart the runtime to use the newly installed packages. Then everything works fine.
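
One way to tell whether a restart is still pending is to compare the version of each module already loaded in the session with the version now installed on disk. The following is only a sketch under that assumption. The helper names (stale_modules, check_runtime) and the default list of package names are illustrative and are not something speechlib or Colab provides.

```python
import sys
from importlib import metadata


def _public(version):
    """Drop a local build suffix such as '+cu121' before comparing."""
    return version.split("+", 1)[0]


def stale_modules(loaded, installed):
    """Return names whose in-memory version differs from the version on disk.

    loaded:    maps module name -> __version__ of the module already imported
    installed: maps module name -> version reported by the package metadata
    """
    return sorted(
        name
        for name, loaded_version in loaded.items()
        if name in installed and _public(installed[name]) != _public(loaded_version)
    )


def check_runtime(names=("torch", "torchaudio", "torchvision", "transformers")):
    """Inspect the current session; a non-empty result suggests a restart."""
    loaded, installed = {}, {}
    for name in names:
        module = sys.modules.get(name)  # only modules imported in this session
        version = getattr(module, "__version__", None)
        if version is None:
            continue
        try:
            installed[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
        loaded[name] = version
    return stale_modules(loaded, installed)
```

If check_runtime() returns an empty list right after a restart, the session is at least not holding on to an older copy of those packages. It does not detect conflicts between packages that have not been imported yet.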

Lels07 commented 6 months ago

Hi, using the speechlib_run notebook example, I got this after the runtime restart:

AttributeError                            Traceback (most recent call last)
[<ipython-input-4-ab51f10e0462>](https://if67c8k0o8n-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab_20240326-060120_RC00_619136293#) in <cell line: 1>()
----> 1 from speechlib import Transcriptor
      2 
      3 file = "obama_zach.wav"   # replace with your own
      4 voices_folder = "voices"  # replace with your own
      5 language = "en"

12 frames
[/usr/local/lib/python3.10/dist-packages/torchvision/_meta_registrations.py](https://if67c8k0o8n-496ff2e9c6d22116-0-colab.googleusercontent.com/outputframe.html?vrz=colab_20240326-060120_RC00_619136293#) in wrapper(fn)
     16 def register_meta(op_name, overload_name="default"):
     17     def wrapper(fn):
---> 18         if torchvision.extension._has_ops():
     19             get_meta_lib().impl(getattr(getattr(torch.ops.torchvision, op_name), overload_name), fn)
     20         return fn

AttributeError: partially initialized module 'torchvision' has no attribute 'extension' (most likely due to a circular import)

NavodPeiris commented 6 months ago

I have added a cell to delete torch, torchvision, torchaudio, torchdata, torchsummary, and torchtext. speechlib installs torch itself, and the pre-installed packages will no longer conflict with it.

You still have to restart the runtime after installing the packages.
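
The exact contents of that notebook cell are not shown in this thread. As a rough sketch of what such a cleanup step could look like when written in Python rather than as a shell command, the following builds a pip uninstall call for the six packages named above. The function names are hypothetical. The only pip feature relied on is the -y flag, which skips the confirmation prompt.

```python
import subprocess
import sys

# Packages named in the comment above; Colab pre-installs them.
PREINSTALLED = (
    "torch",
    "torchvision",
    "torchaudio",
    "torchdata",
    "torchsummary",
    "torchtext",
)


def build_uninstall_command(packages=PREINSTALLED):
    """Build the pip command line without running it."""
    # Using sys.executable targets the pip of the running interpreter.
    return [sys.executable, "-m", "pip", "uninstall", "-y", *packages]


def uninstall_preinstalled(packages=PREINSTALLED):
    """Run the uninstall and return pip's exit code."""
    return subprocess.run(build_uninstall_command(packages), check=False).returncode
```

Calling uninstall_preinstalled() in a cell before the speechlib install cell would remove the pre-installed copies first, so that pip installs the torch version speechlib pins into a clean environment. The runtime restart is still needed afterwards.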