SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

Transcribe crash jupyter notebook kernels #820

Open GiCollini opened 4 months ago

GiCollini commented 4 months ago

Calling the transcribe method within a .ipynb Jupyter notebook crashes the kernel (despite setting KMP_DUPLICATE_LIB_OK to True). However, if I run the exact same code as a .py Python script, it works perfectly fine.

Here is the test code:

import os
import nvidia.cublas.lib
import nvidia.cudnn.lib

from faster_whisper import WhisperModel, decode_audio

os.environ['LD_LIBRARY_PATH'] = os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__)
os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'

model = WhisperModel("tiny")

left, right = decode_audio("stereo_diarization.wav", split_stereo=True)

segments, _ = model.transcribe(left)
transcription = "".join(segment.text for segment in segments).strip()

print("LEFT:")
print(transcription)
# EXPECTED
# "He began a confused complaint against the wizard, "
# "who had vanished behind the curtain on the left."

segments, _ = model.transcribe(right)
transcription = "".join(segment.text for segment in segments).strip()

print("RIGHT:")
print(transcription)
# EXPECTED
# "The horizon seems extremely distant."

The file "stereo_diarization.wav" is the one included in this package's tests: faster-whisper/tests/data/stereo_diarization.wav

I used an environment with Python 3.11.9 and the following packages installed (full requirements here: requirements.txt)

pip install faster-whisper==1.0.2 nvidia-cublas-cu12==12.4.5.8 nvidia-cudnn-cu12==8.9.7.29 jupyter==1.0.0

I tested the notebook (always crashing) and the script (always OK) on a cloud instance with Ubuntu 22.04 (AWS EC2 ml.g4dn.xlarge) with NVIDIA Driver Version: 535.129.03 and CUDA Toolkit Version: 12.2

benniekiss commented 4 months ago

I am experiencing the same issue. Have you found any effective workaround or root cause?

Null404bad commented 4 months ago

Have you solved this? I have been suffering from this recently.

benniekiss commented 4 months ago

> Have you solved this? I have been suffering from this recently.

I don't know the underlying cause, but the LD_LIBRARY_PATH environment variable had become unset, and setting it before starting the Jupyter server stopped the crashes.

I had never explicitly set it before, and faster-whisper was working previously, but this has fixed my problem for now.
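
A minimal sketch of "setting it before starting the Jupyter server", assuming the nvidia.cublas.lib and nvidia.cudnn.lib packages from the original post are installed in the same environment. The helper names (find_lib_dir, child_env, launch) are made up for illustration and are not part of faster-whisper or Jupyter. The idea is to build the environment first and hand it to the Jupyter process at launch, rather than changing os.environ afterwards.

```python
import importlib.util
import os
import subprocess


def find_lib_dir(module_name):
    """Return the directory of an importable package, or None if it is missing."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # Works for both regular and namespace packages.
        return list(spec.submodule_search_locations)[0]
    if spec.origin:
        return os.path.dirname(spec.origin)
    return None


def child_env(lib_dirs, base_env=None):
    """Copy the environment and prepend lib_dirs to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    parts = [d for d in lib_dirs if d]
    existing = env.get("LD_LIBRARY_PATH", "")
    if existing:
        parts.append(existing)
    if parts:
        env["LD_LIBRARY_PATH"] = ":".join(parts)
    return env


def launch(cmd, lib_dirs):
    """Start cmd with LD_LIBRARY_PATH already set in its environment."""
    return subprocess.run(cmd, env=child_env(lib_dirs))


# Example usage (hypothetical command; adjust to however you start Jupyter):
# dirs = [find_lib_dir("nvidia.cublas.lib"), find_lib_dir("nvidia.cudnn.lib")]
# launch(["jupyter", "lab"], dirs)
```

Exporting the variable in the shell before running jupyter should have the same effect; the point is only that the value is in place when the process starts.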

Null404bad commented 4 months ago

> > Have you solved this? I have been suffering from this recently.
>
> I don't know the underlying cause, but the LD_LIBRARY_PATH environment variable had become unset, and setting it before starting the Jupyter server stopped the crashes.
>
> I had never explicitly set it before, and faster-whisper was working previously, but this has fixed my problem for now.

Thanks for your response. I also fixed this, but by uninstalling CUDA 12, installing CUDA 11 (specifically 11.8), and downgrading faster-whisper. The latest faster-whisper requires CUDA 12 but cuDNN for CUDA 11. I guess most crash issues are caused by this. This is really weird, anyway.

See #717
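
For anyone comparing setups, a small sketch that lists the installed versions of the relevant packages. The package names are the ones from the pip command in the original post, plus ctranslate2 as the backend named in the repo description. The helper name installed_versions is hypothetical. This only reports what pip installed; it does not check which CUDA or cuDNN libraries are actually loaded at runtime.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


packages = [
    "faster-whisper",
    "ctranslate2",
    "nvidia-cublas-cu12",
    "nvidia-cudnn-cu12",
]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not installed'}")
```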

GiCollini commented 4 months ago

> LD_LIBRARY_PATH env had become unset, and setting it before starting the jupyter server stopped the crashes.

Apparently, once the Jupyter kernel is running, changing LD_LIBRARY_PATH has no effect. So setting LD_LIBRARY_PATH in the code works when running a .py script, but not when running Jupyter notebooks.

I am using VS Code with the Jupyter extension to run the notebook (actually on an AWS SageMaker Code Editor instance). I was able to set LD_LIBRARY_PATH by:

Apparently the original kernel.json is not used when VS Code starts the Jupyter kernel for the conda environment, so changing it has no effect.
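
A quick way to check this from inside a notebook cell, as a sketch that assumes Linux: /proc/self/environ holds the environment the kernel process was started with and does not reflect later changes made through os.environ. The function names startup_env and compare are hypothetical.

```python
import os


def startup_env(proc_path="/proc/self/environ"):
    """Return the environment the process was started with, or None if unavailable."""
    try:
        with open(proc_path, "rb") as fh:
            raw = fh.read()
    except OSError:
        return None
    env = {}
    # Entries are NUL-separated "KEY=value" byte strings.
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def compare(name="LD_LIBRARY_PATH"):
    """Compare the value of a variable at process start with its current value."""
    initial = startup_env()
    return {
        "at_startup": None if initial is None else initial.get(name),
        "now": os.environ.get(name),
    }


print(compare())
```

If "at_startup" is None while "now" has a value, the variable was only set after the kernel started, which would match the behaviour described above, since the dynamic loader typically reads LD_LIBRARY_PATH when the process starts.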

GiCollini commented 4 months ago

> > > Have you solved this? I have been suffering from this recently.
> >
> > I don't know the underlying cause, but the LD_LIBRARY_PATH environment variable had become unset, and setting it before starting the Jupyter server stopped the crashes. I had never explicitly set it before, and faster-whisper was working previously, but this has fixed my problem for now.
>
> Thanks for your response. I also fixed this, but by uninstalling CUDA 12, installing CUDA 11 (specifically 11.8), and downgrading faster-whisper. The latest faster-whisper requires CUDA 12 but cuDNN for CUDA 11. I guess most crash issues are caused by this. This is really weird, anyway.
>
> See #717

Actually, faster-whisper==1.0.2 does not require any CUDA 11 dependencies. This issue has been solved.

Null404bad commented 4 months ago

> > > > Have you solved this? I have been suffering from this recently.
> > >
> > > I don't know the underlying cause, but the LD_LIBRARY_PATH environment variable had become unset, and setting it before starting the Jupyter server stopped the crashes. I had never explicitly set it before, and faster-whisper was working previously, but this has fixed my problem for now.
> >
> > Thanks for your response. I also fixed this, but by uninstalling CUDA 12, installing CUDA 11 (specifically 11.8), and downgrading faster-whisper. The latest faster-whisper requires CUDA 12 but cuDNN for CUDA 11. I guess most crash issues are caused by this. This is really weird, anyway. See #717
>
> Actually, faster-whisper==1.0.2 does not require any CUDA 11 dependencies. This issue has been solved.

Thanks for sharing your solution. My current model is running normally, and I'm exhausted from solving this problem : ( But I'll try this next time if it happens again. I'd like to know whether this solution works for others.

nguyenhoanganh2002 commented 2 months ago

Same issue here.