Transcripts4All / tools4all

A curated collection of tools to aid transcriptionists and subtitlers.
https://transcripts4all.github.io

whisper-diarization errors out #2

Closed BlohoJo closed 1 month ago

BlohoJo commented 1 month ago

I'm running with the default settings. I restarted the session. I'm using an .m4a file.

I cannot post the full log here because GitHub rejects the comment as too long (the limit is 65536 characters).

The Step 1 log is on Pastebin instead: https://pastebin.com/zjjyhQuK

Step 3:

2024-09-25 01:32:14.792700: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-09-25 01:32:14.825781: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-09-25 01:32:14.836089: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-09-25 01:32:16.429366: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/local/lib/python3.10/dist-packages/pyannote/audio/core/io.py:43: UserWarning: torchaudio._backend.set_audio_backend has been deprecated. With dispatcher enabled, this function is no-op. You can remove the function call.
  torchaudio.set_audio_backend("soundfile")
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.10/dist-packages/demucs/separate.py", line 11, in <module>
    from dora.log import fatal
ModuleNotFoundError: No module named 'dora'
WARNING:root:Source splitting failed, using original audio file. Use --no-stem argument to disable it.
model.bin:   0% 10.5M/3.09G [00:00<00:35, 86.3MB/s]
config.json: 100% 2.39k/2.39k [00:00<00:00, 8.94MB/s]
model.bin:   1% 31.5M/3.09G [00:00<00:24, 124MB/s] 
preprocessor_config.json: 100% 340/340 [00:00<00:00, 1.64MB/s]

vocabulary.json:   0% 0.00/1.07M [00:00<?, ?B/s]

model.bin:   6% 189M/3.09G [00:00<00:12, 240MB/s]
vocabulary.json: 100% 1.07M/1.07M [00:00<00:00, 1.43MB/s]

tokenizer.json: 100% 2.48M/2.48M [00:00<00:00, 3.14MB/s]
model.bin: 100% 3.09G/3.09G [00:24<00:00, 128MB/s]
No language specified, language will be first be detected for each audio file (increases inference time).
100%|█████████████████████████████████████| 16.9M/16.9M [00:02<00:00, 8.21MiB/s]
Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.7. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file ../../root/.cache/torch/whisperx-vad-segmentation.bin`
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.4.1+cu121. Bad things might happen unless you revert torch to 1.x.
Detected language: en (1.00) in first 30s of audio...
config.json: 100% 2.08k/2.08k [00:00<00:00, 11.0MB/s]
model.safetensors: 100% 1.26G/1.26G [00:07<00:00, 158MB/s]
tokenizer_config.json: 100% 1.05k/1.05k [00:00<00:00, 7.35MB/s]
vocab.json: 100% 286/286 [00:00<00:00, 1.81MB/s]
special_tokens_map.json: 100% 74.0/74.0 [00:00<00:00, 445kB/s]
Traceback (most recent call last):
  File "/content/whisper-diarization/diarize_parallel.py", line 166, in <module>
    assert nemo_return_code == 0, (
AssertionError: Diarization failed with the following error:
Traceback (most recent call last):
  File "/content/whisper-diarization/nemo_process.py", line 5, in <module>
    from nemo.collections.asr.models.msdd_models import NeuralDiarizer
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/__init__.py", line 15, in <module>
    from nemo.collections.asr import data, losses, models, modules
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/losses/__init__.py", line 15, in <module>
    from nemo.collections.asr.losses.angularloss import AngularSoftmaxLoss
  File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/losses/angularloss.py", line 18, in <module>
    from nemo.core.classes import Loss, Typing, typecheck
  File "/usr/local/lib/python3.10/dist-packages/nemo/core/__init__.py", line 16, in <module>
    from nemo.core.classes import *
  File "/usr/local/lib/python3.10/dist-packages/nemo/core/classes/__init__.py", line 20, in <module>
    from nemo.core.classes.common import (
  File "/usr/local/lib/python3.10/dist-packages/nemo/core/classes/common.py", line 31, in <module>
    from huggingface_hub import HfApi, HfFolder, ModelFilter, hf_hub_download
ImportError: cannot import name 'ModelFilter' from 'huggingface_hub' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/__init__.py)

    zip warning: name not matched: /content/audio_sample.srt
    zip warning: name not matched: /content/audio_sample.txt

zip error: Nothing to do! (/content/audio_sample.zip)
rm: cannot remove '/content/audio_sample.srt': No such file or directory
rm: cannot remove '/content/audio_sample.txt': No such file or directory

ScriptTiger commented 1 month ago

Thank you for reporting this! It looks like this has probably been broken since last month, when a couple of the dependencies were updated and left the notebook in dependency hell. The upstream repo is also being heavily refactored, so there has been a lot of activity recently. This repo usually takes care of itself pretty well, so I don't usually follow it too closely. But I am now, and I will apply updates as they come in.

Thanks again, and I will be keeping this issue open until things get resolved.
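
For anyone who wants to confirm they are hitting the same breakage before re-running the whole notebook, here is a minimal diagnostic sketch. It only checks the two imports that fail in the tracebacks above (`dora`, and `ModelFilter` from `huggingface_hub`). The helper name `check_import` and the `CHECKS` list are made up for illustration and are not part of whisper-diarization or NeMo, and the script does not fix anything; it only reports what the current environment can import.

```python
import importlib

# (module, symbol) pairs taken from the two tracebacks in this issue.
# A symbol of None means "only check that the module itself imports".
CHECKS = [
    ("dora", None),                        # demucs: "No module named 'dora'"
    ("huggingface_hub", "ModelFilter"),    # NeMo: "cannot import name 'ModelFilter'"
]


def check_import(module_name, symbol=None):
    """Hypothetical helper: return (ok, detail) for one module/symbol pair."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it lands here too.
        return False, f"cannot import {module_name}: {exc}"
    if symbol is not None and not hasattr(module, symbol):
        return False, f"{module_name} has no attribute {symbol}"
    return True, "ok"


if __name__ == "__main__":
    for module_name, symbol in CHECKS:
        ok, detail = check_import(module_name, symbol)
        label = module_name if symbol is None else f"{module_name}.{symbol}"
        print(f"{'OK  ' if ok else 'FAIL'} {label}: {detail}")
```

If either line prints `FAIL`, the installed packages are out of step with what the notebook's dependencies expect, which matches the errors reported above.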

lmmentel commented 1 month ago

Any progress planned on this?

ScriptTiger commented 1 month ago

The last I heard, the upstream maintainer was in the middle of a major refactoring when all of these dependency updates landed, so he's being flooded with issues. I believe he's going to try to fold the upgrades needed for the new dependency versions into the refactoring, so it might take a bit. In the meantime, for everyone using this repo, it might be best to just use the regular Whisper notebook.

If you have email notifications set up for the GitHub repos you watch, you could also follow the upstream repo here: https://github.com/MahmoudAshraf97/whisper-diarization

I've made a few comments myself on relevant issues there, but I haven't opened any new issues since he's already being inundated. So, mostly just following the progress and dropping some comments here and there.

ScriptTiger commented 1 month ago

The upstream repo just resolved these issues, and I just updated the local notebook to reflect the necessary changes.

Thanks, again, for bringing this to my attention!