MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
3.75k stars 329 forks

When I run whisper-diarization I get a "Kernel Died" in my terminal #261

Closed kirahman2 closed 3 weeks ago

kirahman2 commented 4 weeks ago

I am running Ubuntu 20.04 with CUDA 12.1. I've been troubleshooting this issue for just over a week and I think I have narrowed down the cause.

When I run the notebook cell

compute_type = "float16"
# or run on GPU with INT8
# compute_type = "int8_float16"
# or run on CPU with INT8
# compute_type = "int8"

whisper_results, language, audio_waveform = transcribe_batched(
    vocal_target,
    language,
    batch_size,
    whisper_model_name,
    compute_type,
    suppress_numerals,
    device,
)

I get this output in my terminal

[I 2024-10-24 16:22:13.755 ServerApp] Connecting to kernel 276ac32a-c4b4-4a8a-8d5d-7340241f72c1.
[I 2024-10-24 16:22:13.755 ServerApp] Restoring connection for 276ac32a-c4b4-4a8a-8d5d-7340241f72c1:ab11835f-9f9f-41c9-859b-1e9617cc1ca3
[I 2024-10-24 16:22:25.701 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports
[W 2024-10-24 16:22:25.701 ServerApp] kernel 276ac32a-c4b4-4a8a-8d5d-7340241f72c1 restarted
[I 2024-10-24 16:22:25.741 ServerApp] Starting buffering for 276ac32a-c4b4-4a8a-8d5d-7340241f72c1:ab11835f-9f9f-41c9-859b-1e9617cc1ca3
[I 2024-10-24 16:22:25.771 ServerApp] Connecting to kernel 276ac32a-c4b4-4a8a-8d5d-7340241f72c1.
[I 2024-10-24 16:22:25.772 ServerApp] Restoring connection for 276ac32a-c4b4-4a8a-8d5d-7340241f72c1:ab11835f-9f9f-41c9-859b-1e9617cc1ca3

I recalled having this issue when I was using WSL 2 Ubuntu 22.04. I resolved the issue by installing CUDA 11.7.

Since then I stopped using WSL and installed Ubuntu 20.04 directly on my workstation. Now I'm seeing the same "kernel died" error. I isolated the issue to faster-whisper within Whisper_Transcription_+_NeMo_Diarization.ipynb, specifically in the function

def transcribe_batched(
    audio_file: str,
    language: str,
    batch_size: int,
    model_name: str,
    compute_dtype: str,
    suppress_numerals: bool,
    device: str,
):
    import whisperx

    # Faster Whisper batched
    whisper_model = whisperx.load_model(
        model_name,
        device,
        compute_type=compute_dtype,
        asr_options={"suppress_numerals": suppress_numerals},
    )
    audio = whisperx.load_audio(audio_file)
    result = whisper_model.transcribe(audio, language=language, batch_size=batch_size)
    del whisper_model
    torch.cuda.empty_cache()
    return result["segments"], result["language"], audio

I created a new Python environment and installed faster-whisper following the README at https://github.com/SYSTRAN/faster-whisper. I used the code example from the README and ran into the same "kernel died" error.

from faster_whisper import WhisperModel

model_size = "large-v3"

# Run on GPU with FP16
model = WhisperModel(model_size, device="cuda", compute_type="float16")

# or run on GPU with INT8
# model = WhisperModel(model_size, device="cuda", compute_type="int8_float16")
# or run on CPU with INT8
# model = WhisperModel(model_size, device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.mp3", beam_size=5)

print("Detected language '%s' with probability %f" % (info.language, info.language_probability))

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))
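
Because a dying kernel leaves no Python traceback, one way to narrow this down is to run the failing snippet in a child process and look at how that process exits. The following is a minimal sketch, assuming a POSIX system; `run_isolated` and `describe_exit` are hypothetical helper names and are not part of faster-whisper or Jupyter.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter so a native crash cannot kill the caller."""
    # -X faulthandler makes Python dump a traceback on fatal signals such as SIGSEGV.
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"
```

Passing the README snippet as a string to `run_isolated` and then printing `describe_exit(result.returncode)` together with `result.stderr` should show whether the process was killed by a signal (which would point at a native library problem) or failed with an ordinary Python exception.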

I then ran the command to install the CUDA 11.6 build of PyTorch

pip install torch==1.13.1+cu116 torchvision==0.14.1+cu116 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu116

The CUDA 11.6 installation fixed the "kernel died" error. If I try to install any CUDA version newer than this, the kernel dies. I'm posting here because the whisper-diarization repo requires torch >= 2.0.0, and torch 2.0.0 can only be run with the CUDA 11.7 installation command below.

pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1

How would I go about getting this to work with the whisper-diarization repo? Is there an older version of the repo that supports torch==1.13.1+cu116 that I can use?

MahmoudAshraf97 commented 4 weeks ago

Check #260

MahmoudAshraf97 commented 4 weeks ago

Is it necessary to use torch 1.13? Anyway, there is a conflict between ctranslate2 and PyTorch. Please check the faster-whisper repo; in the latest issue you'll find a compatibility matrix. Try it and let me know how it goes.

MahmoudAshraf97 commented 4 weeks ago

https://github.com/SYSTRAN/faster-whisper/issues/1086

kirahman2 commented 4 weeks ago

Is it necessary to use torch 1.13? Anyway, there is a conflict between ctranslate2 and PyTorch. Please check the faster-whisper repo; in the latest issue you'll find a compatibility matrix. Try it and let me know how it goes.

@MahmoudAshraf97 torch 1.13 is required, actually. CUDA 11.6 fixes the kernel-died issue for whisperx. It's installed with the command below. CUDA 11.6 can't be used with torch 2.0.0 due to PyTorch library compatibility issues. The commands below are from https://pytorch.org/get-started/previous-versions/

CUDA 11.6
This command fixed the issue with the kernel dying.
pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116

If I run this command against the python environment created by the whisper-diarization repo, there are conflicts. Here's my shell output.

khalid@khalid-Precision-7920-Tower:~$ source whisper_env3/bin/activate
(whisper_env3) khalid@khalid-Precision-7920-Tower:~$ pip install torch==1.13.0+cu116 torchvision==0.14.0+cu116 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu116
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu116
Collecting torch==1.13.0+cu116
  Downloading https://download.pytorch.org/whl/cu116/torch-1.13.0%2Bcu116-cp310-cp310-linux_x86_64.whl (1983.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 GB 24.1 MB/s eta 0:00:00
Collecting torchvision==0.14.0+cu116
  Downloading https://download.pytorch.org/whl/cu116/torchvision-0.14.0%2Bcu116-cp310-cp310-linux_x86_64.whl (24.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 24.2/24.2 MB 52.9 MB/s eta 0:00:00
Collecting torchaudio==0.13.0
  Downloading https://download.pytorch.org/whl/cu116/torchaudio-0.13.0%2Bcu116-cp310-cp310-linux_x86_64.whl (4.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.2/4.2 MB 45.8 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions in ./whisper_env3/lib/python3.10/site-packages (from torch==1.13.0+cu116) (4.12.2)
Requirement already satisfied: numpy in ./whisper_env3/lib/python3.10/site-packages (from torchvision==0.14.0+cu116) (1.26.3)
Requirement already satisfied: requests in ./whisper_env3/lib/python3.10/site-packages (from torchvision==0.14.0+cu116) (2.32.3)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in ./whisper_env3/lib/python3.10/site-packages (from torchvision==0.14.0+cu116) (10.2.0)
Requirement already satisfied: charset-normalizer<4,>=2 in ./whisper_env3/lib/python3.10/site-packages (from requests->torchvision==0.14.0+cu116) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in ./whisper_env3/lib/python3.10/site-packages (from requests->torchvision==0.14.0+cu116) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./whisper_env3/lib/python3.10/site-packages (from requests->torchvision==0.14.0+cu116) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in ./whisper_env3/lib/python3.10/site-packages (from requests->torchvision==0.14.0+cu116) (2024.8.30)
Installing collected packages: torch, torchvision, torchaudio
  Attempting uninstall: torch
    Found existing installation: torch 1.13.1+cu116
    Uninstalling torch-1.13.1+cu116:
      Successfully uninstalled torch-1.13.1+cu116
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.14.1+cu116
    Uninstalling torchvision-0.14.1+cu116:
      Successfully uninstalled torchvision-0.14.1+cu116
  Attempting uninstall: torchaudio
    Found existing installation: torchaudio 0.13.1+cu116
    Uninstalling torchaudio-0.13.1+cu116:
      Successfully uninstalled torchaudio-0.13.1+cu116
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
lightning 2.4.0 requires torch<4.0,>=2.1.0, but you have torch 1.13.0+cu116 which is incompatible.
pyannote-audio 3.1.1 requires torch>=2.0.0, but you have torch 1.13.0+cu116 which is incompatible.
pyannote-audio 3.1.1 requires torchaudio>=2.0.0, but you have torchaudio 0.13.0+cu116 which is incompatible.
pytorch-lightning 2.4.0 requires torch>=2.1.0, but you have torch 1.13.0+cu116 which is incompatible.
whisperx 3.1.1 requires torch>=2, but you have torch 1.13.0+cu116 which is incompatible.
whisperx 3.1.1 requires torchaudio>=2, but you have torchaudio 0.13.0+cu116 which is incompatible.
Successfully installed torch-1.13.0+cu116 torchaudio-0.13.0+cu116 torchvision-0.14.0+cu116
(whisper_env3) khalid@khalid-Precision-7920-Tower:~$ 

For torch 2.0.0, here are the options. Notice how there's no option for CUDA 11.6 for torch 2.0.0. These commands were pulled directly from https://pytorch.org/get-started/previous-versions/

CUDA 11.7
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1

CUDA 11.8
pip install torch==2.0.0 torchvision==0.15.1 torchaudio==2.0.1 --index-url https://download.pytorch.org/whl/cu118

I will give the matrix a try tomorrow and create a new py env with cuda 12.1

Torch Version    CT2 Version
2.*.*+cu121      <=4.4.0
2.*.*+cu124      >=4.5.0
>=2.4.0          >=4.5.0
<2.4.0           <4.5.0

kirahman2 commented 4 weeks ago

SYSTRAN/faster-whisper#1086

I upgraded to cuda 12.1 and ran the installation accordingly. I was able to confirm cuda 12.1 from my notebook. Here are the commands I ran.

source: https://pytorch.org/get-started/previous-versions/
CUDA 12.1
pip install torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 --index-url https://download.pytorch.org/whl/cu121

source: https://github.com/SYSTRAN/faster-whisper/issues/1086#issuecomment-2435924887
pip install faster-whisper ctranslate2==4.4.0
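
The pairing rule in the compatibility matrix quoted earlier can be written down as a small lookup. This is only a sketch of how the matrix reads: `suggested_ctranslate2` is a hypothetical helper, and letting an explicit CUDA tag take precedence over the plain version rows is an assumption, since the rows overlap.

```python
import re


def suggested_ctranslate2(torch_version: str) -> str:
    """Return the ctranslate2 version specifier the matrix pairs with a torch version."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:\+(\w+))?", torch_version)
    if match is None:
        raise ValueError(f"unrecognised torch version: {torch_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    local_tag = match.group(4)  # e.g. "cu121"; None for a plain PyPI wheel

    # Assumption: an explicit CUDA tag takes precedence over the plain version rows.
    if local_tag == "cu121":
        return "<=4.4.0"
    if local_tag == "cu124":
        return ">=4.5.0"
    # Rows without a matching CUDA tag: split at torch 2.4.0.
    return ">=4.5.0" if (major, minor) >= (2, 4) else "<4.5.0"
```

With this reading, `suggested_ctranslate2("2.4.1+cu121")` gives `"<=4.4.0"`, which is consistent with pinning ctranslate2 to 4.4.0 for the cu121 build of torch.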

I ran into the same issue where the kernel dies. Here is the shell output.

[I 2024-10-24 20:26:57.588 ServerApp] Restoring connection for 59aba1d2-0cfe-408d-8e36-0544cb64ddee:44a0da2d-1890-413f-973d-f77c378c1c36
[I 2024-10-24 20:27:08.522 ServerApp] Saving file at /work/whisper-diarization/Whisper_Transcription_+_NeMo_Diarization.ipynb
[W 2024-10-24 20:27:13.799 ServerApp] 404 GET /api/.cache/torch/whisperx-vad-segmentation.bin?content=0&hash=0&1729819633648 (3b0f0f4d690a490f8e1b1d6254c68218@127.0.0.1) 36.71ms referer=http://localhost:8889/lab/tree/work/whisper-diarization/Whisper_Transcription_%2B_NeMo_Diarization.ipynb
[I 2024-10-24 20:27:18.536 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports
[W 2024-10-24 20:27:18.537 ServerApp] kernel 59aba1d2-0cfe-408d-8e36-0544cb64ddee restarted
[I 2024-10-24 20:27:18.574 ServerApp] Starting buffering for 59aba1d2-0cfe-408d-8e36-0544cb64ddee:44a0da2d-1890-413f-973d-f77c378c1c36
[I 2024-10-24 20:27:18.589 ServerApp] Connecting to kernel 59aba1d2-0cfe-408d-8e36-0544cb64ddee.
[I 2024-10-24 20:27:18.589 ServerApp] Restoring connection for 59aba1d2-0cfe-408d-8e36-0544cb64ddee:44a0da2d-1890-413f-973d-f77c378c1c36

This is the notebook cell and output

compute_type = "float16"
# or run on GPU with INT8
# compute_type = "int8_float16"
# or run on CPU with INT8
# compute_type = "int8"

whisper_results, language, audio_waveform = transcribe_batched(
    vocal_target,
    language,
    batch_size,
    whisper_model_name,
    compute_type,
    suppress_numerals,
    device,
)

OUTPUT

Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.4.0. To apply the upgrade to your files permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint ../../.cache/torch/whisperx-vad-segmentation.bin`
No language specified, language will be first be detected for each audio file (increases inference time).
Model was trained with pyannote.audio 0.0.1, yours is 3.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.4.1+cu121. Bad things might happen unless you revert torch to 1.x.
[NeMo W 2024-10-24 20:27:14 nemo_logging:349] /home/khalid/whisper_env2/lib/python3.10/site-packages/pyannote/audio/utils/reproducibility.py:74: ReproducibilityWarning: TensorFloat-32 (TF32) has been disabled as it might lead to reproducibility issues and lower accuracy.
    It can be re-enabled by calling
       >>> import torch
       >>> torch.backends.cuda.matmul.allow_tf32 = True
       >>> torch.backends.cudnn.allow_tf32 = True
    See https://github.com/pyannote/pyannote-audio/issues/1370 for more details.

      warnings.warn(

I find it odd that my local Ubuntu 20.04 instance has this issue, but Google Colab doesn't. This seems like a challenging issue. I appreciate your help!

MahmoudAshraf97 commented 4 weeks ago

It's challenging indeed! But the problem boils down to conflicting CUDA installations. I suggest you start with a fresh environment and install the requirements as mentioned in the README; this is guaranteed to work and is tested weekly on several Python versions.

kirahman2 commented 4 weeks ago

To ensure my Ubuntu 20.04 environment has only one version of CUDA installed, I ran a clean uninstall and a fresh install of CUDA 12.4, but I'm still running into the issue with the kernel dying. Is it possible that I'm installing CUDA incorrectly? Or maybe the wrong CUDA version?

# remove CUDA - from NVIDIA's website
# https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#removing-cuda-toolkit
sudo apt-get --purge remove "*cuda*" "*cublas*" "*cufft*" "*cufile*" "*curand*" \
 "*cusolver*" "*cusparse*" "*gds-tools*" "*npp*" "*nvjpeg*" "nsight*" "*nvvm*" --yes

sudo apt-get autoremove --purge -V --yes

# https://developer.nvidia.com/cuda-12-4-0-download-archive?target_os=Linux&target_arch=x86_64&Distribution=Ubun>
# install CUDA 12.4
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-4

ENV_NAME=whisper_env10_cuda124

python3.10 -m venv ~/$ENV_NAME
source ~/$ENV_NAME/bin/activate

pip install --upgrade pip
pip install ipykernel 
pip install jupyter-lab
python -m ipykernel install --user --name $ENV_NAME --display-name "$ENV_NAME"

# mahmoud whisper commands
sudo apt update && sudo apt install cython3 --yes
sudo apt update && sudo apt install ffmpeg --yes
pip install -c constraints.txt -r requirements.txt

# pytorch CUDA  12.4
# https://pytorch.org/get-started/locally/
pip3 install torch torchvision torchaudio

# CUDA 12.4
pip install ctranslate2==4.5.0

Here are some other details on my env

(whisper_env10_cuda124) khalid@khalid-Precision-7920-Tower:~/work/whisper-diarization$ pip list | grep -i "cuda"
nvidia-cuda-cupti-cu12      12.4.127
nvidia-cuda-nvrtc-cu12      12.4.127
nvidia-cuda-runtime-cu12    12.4.127
(whisper_env10_cuda124) khalid@khalid-Precision-7920-Tower:~/work/whisper-diarization$ pip list | grep -i "torch"
pytorch-lightning           2.4.0
pytorch-metric-learning     2.6.0
torch                       2.5.0
torch-audiomentations       0.11.1
torch_pitch_shift           1.2.5
torchaudio                  2.5.0
torchmetrics                1.5.1
torchvision                 0.20.0
(whisper_env10_cuda124) khalid@khalid-Precision-7920-Tower:~/work/whisper-diarization$ whereis cuda
cuda: /usr/local/cuda
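
When comparing an environment against the compatibility matrix, it can help to list every relevant distribution in one go, since a filter such as `grep -i "cuda"` would not match names like `ctranslate2` or `nvidia-cudnn-cu12`. A minimal sketch using only the standard library; `installed_versions` is a hypothetical helper name, and the search terms are just examples.

```python
from importlib import metadata


def installed_versions(*needles: str) -> dict[str, str]:
    """Map distribution name -> version for packages whose name contains a needle."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name and any(n.lower() in name.lower() for n in needles):
            found[name] = dist.version
    return dict(sorted(found.items()))


if __name__ == "__main__":
    # Example search terms; adjust to whatever the environment is expected to contain.
    for name, version in installed_versions("torch", "ctranslate2", "nvidia", "cudnn").items():
        print(f"{name:30} {version}")
```

Run inside the activated environment (or in a notebook cell using that kernel), this reports what Python itself can see, which may differ from what is installed system-wide through apt.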

I didn't install nvidia-smi or nvcc because one of them results in a CUDA 10.1 installation, which could create issues with conflicting CUDA versions.

(whisper_env10_cuda124) khalid@khalid-Precision-7920-Tower:~/work/whisper-diarization$ nvidia-smi

Command 'nvidia-smi' not found, but can be installed with:

sudo apt install nvidia-340               # version 340.108-0ubuntu5.20.04.2, or
sudo apt install nvidia-utils-390         # version 390.157-0ubuntu0.20.04.1
sudo apt install nvidia-utils-450-server  # version 450.248.02-0ubuntu0.20.04.1
sudo apt install nvidia-utils-470         # version 470.256.02-0ubuntu0.20.04.1
sudo apt install nvidia-utils-470-server  # version 470.256.02-0ubuntu0.20.04.1
sudo apt install nvidia-utils-535         # version 535.183.01-0ubuntu0.20.04.1
sudo apt install nvidia-utils-535-server  # version 535.183.06-0ubuntu0.20.04.1
sudo apt install nvidia-utils-550-server  # version 550.90.07-0ubuntu0.20.04.2
sudo apt install nvidia-utils-435         # version 435.21-0ubuntu7
sudo apt install nvidia-utils-440         # version 440.82+really.440.64-0ubuntu6
sudo apt install nvidia-utils-418-server  # version 418.226.00-0ubuntu0.20.04.2

(whisper_env10_cuda124) khalid@khalid-Precision-7920-Tower:~/work/whisper-diarization$ nvcc --version

Command 'nvcc' not found, but can be installed with:

sudo apt install nvidia-cuda-toolkit

MahmoudAshraf97 commented 4 weeks ago

You did install two CUDAs, one using apt and another using pip. On a fresh Ubuntu, just install the requirements and CUDA will be installed along with torch; you just might need to include their directories in PATH.
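
One way to find the directories of the pip-installed CUDA libraries is to scan the environment's site-packages for the NVIDIA wheels. The sketch below assumes the wheels unpack to `site-packages/nvidia/<component>/lib`, and that on Linux the variable the dynamic loader consults for shared libraries is `LD_LIBRARY_PATH`; `nvidia_lib_dirs` and `ld_library_path` are hypothetical helper names.

```python
import os
from pathlib import Path


def nvidia_lib_dirs(site_packages: str) -> list[str]:
    """List the lib directories of pip-installed NVIDIA wheels under site-packages."""
    root = Path(site_packages) / "nvidia"
    if not root.is_dir():
        return []
    # Assumed layout: nvidia/cublas/lib, nvidia/cudnn/lib, and so on.
    return sorted(str(p) for p in root.glob("*/lib") if p.is_dir())


def ld_library_path(site_packages: str, existing: str = "") -> str:
    """Build an LD_LIBRARY_PATH value with the wheel directories placed first."""
    parts = nvidia_lib_dirs(site_packages)
    if existing:
        parts.append(existing)
    return os.pathsep.join(parts)
```

The site-packages directory can be obtained with `sysconfig.get_paths()["platlib"]`. The resulting value would need to be exported in the shell before Jupyter is started, because the loader reads the variable when the process launches.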

kirahman2 commented 3 weeks ago

@MahmoudAshraf97 Thanks for updating the code to use faster-whisper. I just tried installing the requirements in a brand-new Docker container with Ubuntu 22.04. I didn't install CUDA from NVIDIA's website; I let the requirements handle CUDA. I noticed that

torch.cuda.is_available() returns True
torch.version.cuda returns 12.4

However, the kernel still dies.

In addition to that, I noticed when I type whereis cuda in bash, there is no path for cuda so there is no export path to add that I know of unless maybe I'm missing a step.
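
Since `torch.cuda.is_available()` only reflects what torch can load, a separate check of whether the dynamic loader can open the CUDA libraries by name may help here. This is a rough sketch: `probe_shared_libs` is a hypothetical helper, and the library names in the usage note are assumptions about what ctranslate2 looks for, not something confirmed in this thread.

```python
import ctypes


def probe_shared_libs(names: list[str]) -> dict[str, bool]:
    """Report which shared libraries the dynamic loader can open from this process."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)  # raises OSError if the loader cannot find or open it
            results[name] = True
        except OSError:
            results[name] = False
    return results
```

Calling it from a notebook cell with names such as `"libcublas.so.12"`, `"libcudnn.so.8"`, and `"libcudnn.so.9"` would show which of them are visible to the kernel process. A `False` for the cuDNN version that the installed ctranslate2 expects would be one possible explanation for the crash.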

Since I'm running a container, I've already configured

I'm also running the commands as


# if you are installing a newly hosted ubuntu image then run 
# these commands so that docker can access the GPU 
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nano /etc/docker/daemon.json
# add this to daemon.json
{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

# run docker container 
sudo docker run --gpus all -it -p 8888:8888 \
    -v /home/khalid/work:/container/work \
    -v /home/khalid/whisper_data:/container/whisper_data \
    ubuntu:22.04 /bin/bash

# run jupyter 
jupyter lab --ip=0.0.0.0 --port=8888 --allow-root --no-browser

When you do weekly testing against multiple python versions, what version of ubuntu are you using?

MahmoudAshraf97 commented 3 weeks ago

If you are going to use Docker, just use a prebuilt image that has CUDA set up; check the NVIDIA or torch image repos.

kirahman2 commented 3 weeks ago

@MahmoudAshraf97 This was resolved by using a Dockerfile that I pulled from your repo here https://github.com/SYSTRAN/faster-whisper/tree/master/docker. Thanks for the assistance!