SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2

CUDA Initialisation error #776

Open RohitMidha23 opened 3 months ago

RohitMidha23 commented 3 months ago

I get the following error when I run both a Hugging Face model and a faster-whisper model on the same GPU:

self.model = ctranslate2.models.Whisper(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: CUDA failed with error initialization error
terminate called after throwing an instance of 'std::runtime_error'
  what():  CUDA failed with error unspecified launch failure
Aborted

I am currently using a Hugging Face pipeline, which is defined as follows:

import torch
from transformers import pipeline

translator = pipeline(
    "translation",
    model=model_nllb,          # pre-loaded NLLB model
    tokenizer=tokenizer_nllb,  # matching NLLB tokenizer
    src_lang=source_lang,
    tgt_lang=target_lang,
    max_length=400,
    device="cuda" if torch.cuda.is_available() else "cpu",
)

It is not an OOM error, as I'm able to load both models separately and check memory usage!
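For completeness, the faster-whisper side is constructed roughly like this (a sketch; the model size and compute type are placeholders, not my exact settings). This is the call that raises the error once the pipeline above already exists:

from faster_whisper import WhisperModel

# Constructed *after* the Hugging Face pipeline above; this is where
# ctranslate2.models.Whisper fails with the CUDA initialization error.
whisper_model = WhisperModel(
    "large-v3",              # placeholder model size
    device="cuda",
    compute_type="float16",  # placeholder compute type
)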

RohitMidha23 commented 3 months ago

For those who might face the same issue: the fix is to load the faster-whisper model first and then the Hugging Face model (see the sketch below). Not sure why that works.
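A minimal sketch of the working order; the model names and languages here are placeholders, not the exact ones from my code:

import torch
from faster_whisper import WhisperModel
from transformers import pipeline

# 1. Load the faster-whisper (CTranslate2) model first ...
whisper_model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# 2. ... then create the Hugging Face pipeline on the same GPU.
translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",  # placeholder for model_nllb
    src_lang="eng_Latn",                       # placeholder source language
    tgt_lang="fra_Latn",                       # placeholder target language
    max_length=400,
    device="cuda" if torch.cuda.is_available() else "cpu",
)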

RohitMidha23 commented 3 months ago

I'm still facing this issue with other models. @trungkienbkhn any idea?

trungkienbkhn commented 3 months ago

@RohitMidha23, hello. Have you installed the NVIDIA libraries (cuBLAS + cuDNN) required for GPU support? Also, could you share your full code?
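As a quick sanity check from Python (a sketch; this only confirms the CUDA libraries are visible to the runtimes, not that the versions match what CTranslate2 was built against), you could run:

import ctranslate2
import torch

# Number of CUDA devices visible to CTranslate2 (0 means no usable GPU/driver).
print("CTranslate2 CUDA devices:", ctranslate2.get_cuda_device_count())

# CUDA / cuDNN versions and availability as seen by PyTorch.
print("torch CUDA version:", torch.version.cuda)
print("cuDNN version:", torch.backends.cudnn.version())
print("CUDA available:", torch.cuda.is_available())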