Assertion error when trying to run with a transcription model

Description of bug / unexpected behavior

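After installing the packages required to run a transcription model, rendering a scene that sets `transcription_model` fails with an `AssertionError`. The assertion is raised in `_distutils_hack` while `whisper` imports `triton`, which in turn imports `setuptools` (see the terminal output).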

Expected behavior

The scene should render and the transcription model should run without errors.

How to reproduce the issue

Code for reproducing the problem

```py
from manim import *
from manim_voiceover import VoiceoverScene
from manim_voiceover.services.gtts import GTTSService


class BugScene(VoiceoverScene):
    def construct(self):
        self.set_speech_service(
            GTTSService(transcription_model="base")
        )
        with self.voiceover("Voice") as trk:
            pass
```

Additional media files

Images/GIFs

Logs

Terminal output

```
(venv) oz@Ozz:~/repos/GPU_Programming$ manim -pql manim_scripts/temp.py -v DEBUG
Manim Community v0.18.1
Detected language: english
0%| | 0/0.96 [00:00 0:
  574                     num_samples = min(round(end_timestamp_pos * N_SAMPLES_PER_TOKEN), nu
❱ 575                 add_word_timestamps_stable(
  576                     segments=current_segments,
  577                     model=model,
  578                     tokenizer=tokenizer,

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/stable_whisper/timing.py:259 in add_word_timestamps_stable

  256                     )
  257                 )
  258
❱ 259     align()
  260     if (
  261             gap_padding is not None and
  262             any(

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/stable_whisper/timing.py:225 in align

  222         text_tokens, token_split, seg_indices = split_word_tokens(segments, tokenizer,
  223                                                                 padding=gap_padding, s
  224
❱ 225         alignment = find_alignment_stable(model, tokenizer, text_tokens, mel, num_sample
  226                                         **kwargs,
  227                                         token_split=token_split,
  228                                         audio_features=audio_features,

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/stable_whisper/timing.py:79 in find_alignment_stable

   76     weights = (weights * qk_scale).softmax(dim=-1)
   77     std, mean = torch.std_mean(weights, dim=-2, keepdim=True, unbiased=False)
   78     weights = (weights - mean) / std
❱  79     weights = median_filter(weights, medfilt_width)
   80
   81     matrix = weights.mean(axis=0)
   82     matrix = matrix[len(tokenizer.sot_sequence): -1]

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/whisper/timing.py:38 in median_filter

   35     x = F.pad(x, (filter_width // 2, filter_width // 2, 0, 0), mode="reflect")
   36     if x.is_cuda:
   37         try:
❱  38             from .triton_ops import median_filter_cuda
   39
   40             result = median_filter_cuda(x, filter_width)
   41         except (RuntimeError, subprocess.CalledProcessError):

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/whisper/triton_ops.py:7 in

    4 import torch
    5
    6 try:
❱   7     import triton
    8     import triton.language as tl
    9 except ImportError:
   10     raise RuntimeError("triton import failed; try `pip install --pre triton`")

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/triton/__init__.py:20 in

   17     reinterpret,
   18     TensorWrapper,
   19 )
❱  20 from .runtime import (
   21     autotune,
   22     Config,
   23     heuristics,

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/triton/runtime/__init__.py:1 in

❱ 1 from .autotuner import Config, Heuristics, autotune, heuristics
  2 from .jit import JITFunction, KernelInterface, version_key
  3
  4 __all__ = [

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/triton/runtime/autotuner.py:7 in

    4 import time
    5 from typing import Dict
    6
❱   7 from ..compiler import OutOfResources
    8 from ..testing import do_bench
    9 from .jit import KernelInterface
   10

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/triton/compiler.py:22 in

   19 from sysconfig import get_paths
   20 from typing import Any, Callable, Dict, Tuple, Union
   21
❱  22 import setuptools
   23 import torch
   24 from filelock import FileLock
   25

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/setuptools/__init__.py:8 in

    5 import re
    6 import warnings
    7
❱   8 import _distutils_hack.override  # noqa: F401
    9
   10 import distutils.core
   11 from distutils.errors import DistutilsOptionError

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/_distutils_hack/override.py:1 in

❱ 1 __import__('_distutils_hack').do_override()
  2

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/_distutils_hack/__init__.py:77 in do_override

   74     """
   75     if enabled():
   76         warn_distutils_present()
❱  77         ensure_local_distutils()
   78
   79
   80 class _TrivialRe:

/home/oz/repos/GPU_Programming/venv/lib/python3.11/site-packages/_distutils_hack/__init__.py:64 in ensure_local_distutils

   61
   62     # check that submodules load as expected
   63     core = importlib.import_module('distutils.core')
❱  64     assert '_distutils' in core.__file__, core.__file__
   65     assert 'setuptools._distutils.log' not in sys.modules
   66
   67

AssertionError: /usr/lib/python3.11/distutils/core.py
```
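For triage: the failing assertion at `_distutils_hack/__init__.py:64` checks that the imported `distutils.core` comes from a path containing `_distutils` (the copy bundled with setuptools), and here it resolved to the system copy at `/usr/lib/python3.11/distutils/core.py` instead. The following is a minimal diagnostic sketch, not a fix. The helper name `describe_distutils` is hypothetical and not part of any library; it only mirrors the string check shown in the traceback and reports the `SETUPTOOLS_USE_DISTUTILS` environment variable, which, as far as I know, is what `_distutils_hack.enabled()` consults.

```python
import importlib
import os
import warnings


def describe_distutils():
    """Report which distutils implementation this interpreter resolves.

    Hypothetical helper for diagnosis only; it mirrors the check
    `'_distutils' in core.__file__` seen in the traceback.
    """
    info = {
        # Assumption: this is the variable _distutils_hack.enabled() reads.
        "SETUPTOOLS_USE_DISTUTILS": os.environ.get("SETUPTOOLS_USE_DISTUTILS"),
        "core_file": None,
        "is_setuptools_copy": None,
    }
    try:
        with warnings.catch_warnings():
            # distutils emits a DeprecationWarning on Python 3.10/3.11.
            warnings.simplefilter("ignore")
            core = importlib.import_module("distutils.core")
    except Exception as exc:  # distutils may be missing or fail to import
        info["error"] = repr(exc)
        return info

    info["core_file"] = getattr(core, "__file__", None)
    # Same substring test as the assertion in _distutils_hack.
    info["is_setuptools_copy"] = "_distutils" in (info["core_file"] or "")
    return info


if __name__ == "__main__":
    for key, value in describe_distutils().items():
        print(f"{key}: {value}")
```

Running this inside the same virtual environment as the failing scene should show whether `distutils.core` resolves to the system copy or to the one shipped with setuptools, which is the condition the assertion is complaining about.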

System specifications

System Details

- OS (with version, e.g., Windows 10 v2004 or macOS 10.15 (Catalina)):
- RAM:
- Python version (`python/py/python3 --version`):
- Installed modules (provide output from `pip list`):

```
Debian 12 kernel 6.1.0-22-amd64
ram: 64 GB DDR5
Python 3.11.2
Pip:
Package                  Version
------------------------ -----------
attrs                    23.2.0
basedpyright             1.13.3
cattrs                   23.2.3
certifi                  2024.7.4
charset-normalizer       3.3.2
click                    8.1.7
cloup                    3.0.5
cmake                    3.30.1
decorator                5.1.1
docstring-to-markdown    0.15
evdev                    1.7.1
ffmpeg-python            0.2.0
filelock                 3.15.4
fsspec                   2024.6.1
future                   1.0.0
glcontext                2.5.0
gTTS                     2.5.1
huggingface-hub          0.24.1
idna                     3.7
isosurfaces              0.1.2
jedi                     0.19.1
jedi-language-server     0.41.4
Jinja2                   3.1.4
lit                      18.1.8
llvmlite                 0.43.0
lsprotocol               2023.0.1
manim                    0.18.1
manim-ml                 0.0.24
manim-voiceover          0.3.6.post0
ManimPango               0.5.0
mapbox-earcut            1.0.1
markdown-it-py           3.0.0
MarkupSafe               2.1.5
mdurl                    0.1.2
moderngl                 5.10.0
moderngl-window          2.4.6
more-itertools           10.3.0
mpmath                   1.3.0
multipledispatch         1.0.0
mutagen                  1.47.0
networkx                 3.3
nodejs-wheel-binaries    20.15.1
numba                    0.60.0
numpy                    1.26.4
nvidia-cublas-cu11       11.10.3.66
nvidia-cuda-cupti-cu11   11.7.101
nvidia-cuda-nvrtc-cu11   11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11        8.5.0.96
nvidia-cufft-cu11        10.9.0.58
nvidia-curand-cu11       10.2.10.91
nvidia-cusolver-cu11     11.4.0.1
nvidia-cusparse-cu11     11.7.4.91
nvidia-nccl-cu11         2.14.3
nvidia-nvtx-cu11         11.7.91
openai-whisper           20230314
packaging                24.1
pandas                   2.2.2
parso                    0.8.4
pillow                   10.4.0
pip                      23.0.1
PyAudio                  0.2.14
pycairo                  1.26.1
pydub                    0.25.1
pyglet                   2.0.15
pygls                    1.3.1
Pygments                 2.18.0
pynput                   1.7.7
pyrr                     0.10.3
python-dateutil          2.9.0.post0
python-dotenv            0.21.1
python-slugify           8.0.4
python-xlib              0.33
pytz                     2024.1
PyYAML                   6.0.1
regex                    2024.5.15
requests                 2.32.3
rich                     13.7.1
safetensors              0.4.3
scipy                    1.14.0
screeninfo               0.8.1
setuptools               66.1.1
six                      1.16.0
skia-pathops             0.8.0.post1
sox                      1.5.0
srt                      3.5.3
stable-ts                2.11.1
svgelements              1.9.6
sympy                    1.13.1
text-unidecode           1.3
tiktoken                 0.3.1
tokenizers               0.19.1
torch                    2.0.1
torchaudio               2.0.2
tqdm                     4.66.4
transformers             4.43.1
triton                   2.0.0
typing_extensions        4.12.2
tzdata                   2024.1
urllib3                  2.2.2
watchdog                 4.0.1
wheel                    0.43.0
```

LaTeX details

+ LaTeX distribution (e.g. TeX Live 2020):
+ Installed LaTeX packages:

FFMPEG

Output of `ffmpeg -version`:

```
PASTE HERE
```

ManimCommunity / manim-voiceover