PaddlePaddle / PaddleMIX

Paddle Multimodal Integration and eXploration, supporting mainstream multimodal tasks, including end-to-end large-scale multimodal pretrained models and a diffusion model toolbox, with high performance and flexibility.
Apache License 2.0
297 stars, 112 forks

CUDNN_STATUS_NOT_INITIALIZED error #678

Open flymorn opened 4 weeks ago

flymorn commented 4 weeks ago

Running the automatic labeling example: python applications/Automatic_label/automatic_label.py

The system reports an error: 'CUDNN_STATUS_NOT_INITIALIZED'. The cuDNN library was not initialized properly. This error is usually returned when a call to cudnnCreate() fails or when cudnnCreate() has not been called prior to calling another cuDNN routine. In the former case, it is usually due to an error in the CUDA Runtime API called by cudnnCreate() or by an error in the hardware setup.

The full output is as follows:

[2024-08-15 21:41:57,506] [ WARNING] - Detected that datasets module was imported before paddlenlp. This may cause PaddleNLP datasets to be unavalible in intranet. Please import paddlenlp before datasets module to avoid download issues
2024-08-15 21:42:05,553-WARNING: post-quant-hpo is not support in system other than linux
W0815 21:42:07.605979  1704 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 11.8
W0815 21:42:07.606978  1704 gpu_resources.cc:164] device: 0, cuDNN Version: 8.9.
C:\Program Files\Python38\lib\site-packages\paddleaudio\_extension.py:141: UserWarning: paddleaudio C++ extension is not available. sox_io, sox_effect, kaldi raw feature is not supported!!!
...
...
Traceback (most recent call last):
  File "automatic_label.py", line 46, in <module>
    result = task(image=image_pil, blip2_prompt=blip2_prompt)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\appflow\appflow.py", line 86, in __call__
    results = task_instance(results)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\appflow\apptask.py", line 236, in __call__
    outputs = self._run_model(inputs, **kwargs)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\appflow\image2text_generation.py", line 91, in _run_model
    generated_ids, _ = self._model.generate(**inputs["blip2_input"])
  File "<decorator-gen-758>", line 2, in generate
  File "C:\Program Files\Python38\lib\site-packages\paddle\base\dygraph\base.py", line 337, in _decorate_function
    return func(*args, **kwargs)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\models\blip2\modeling.py", line 534, in generate
    self.visual_encoder(pixel_values.cast(self.visual_encoder.pos_embed.dtype))
  File "C:\Program Files\Python38\lib\site-packages\paddle\nn\layer\layers.py", line 1426, in __call__
    return self.forward(*inputs, **kwargs)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\models\blip2\eva_vit.py", line 453, in forward
    x = self.forward_features(pixel_values)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\models\blip2\eva_vit.py", line 431, in forward_features
    x = self.patch_embed(x)
  File "C:\Program Files\Python38\lib\site-packages\paddle\nn\layer\layers.py", line 1426, in __call__
    return self.forward(*inputs, **kwargs)
  File "D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0\paddlemix\models\blip2\eva_vit.py", line 345, in forward
    x = self.proj(x).flatten(2).transpose((0, 2, 1))
  File "C:\Program Files\Python38\lib\site-packages\paddle\nn\layer\layers.py", line 1426, in __call__
    return self.forward(*inputs, **kwargs)
  File "C:\Program Files\Python38\lib\site-packages\paddle\nn\layer\conv.py", line 711, in forward
    out = F.conv._conv_nd(
  File "C:\Program Files\Python38\lib\site-packages\paddle\nn\functional\conv.py", line 127, in _conv_nd
    pre_bias = _C_ops.conv2d(
OSError: (External) CUDNN error(1), CUDNN_STATUS_NOT_INITIALIZED.
  [Hint: 'CUDNN_STATUS_NOT_INITIALIZED'.  The cuDNN library was not initialized properly. This error is usually returned when a call to cudnnCreate() fails or when cudnnCreate() has not been called prior to calling another cuDNN routine. In the former case, it is usually due to an error in the CUDA Runtime API called by cudnnCreate() or by an error in the hardware setup.  ] (at ..\paddle\phi\backends\gpu\gpu_resources.cc:308)

Environment: Windows 10, Python 3.8

paddle2onnx 1.0.6
paddleaudio 1.1.0
paddlefsl 1.1.0
paddlehub 2.4.0
paddlemix 0.1.0 D:\Python\projects\PaddleMIX\PaddleMIX-2.0.0
paddlenlp 2.7.2 D:\Python\projects\PaddleNLP\PaddleNLP-2.7.2
paddlepaddle-gpu 3.0.0b1
paddlesde 0.2.5
paddleslim 2.6.0
paddlespeech 1.4.1
paddlespeech-feat 0.1.0
ppdiffusers 0.24.1 d:\python\projects\paddlemix\paddlemix-2.0.0\ppdiffusers

torch 2.4.0+cu118
torchaudio 2.4.0+cu118
torchvision 0.19.0+cu118

Paddle was installed using the version recommended in the documentation at https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/windows-pip.html.

How should I go about troubleshooting this? Thanks.

LokeZhou commented 3 weeks ago

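Judging by the error, this is a problem with the CUDA environment. Check whether paddle.utils.run_check() runs successfully in the current environment. I also see that torch is installed in this environment. Installing torch brings in some CUDA packages of its own, which may cause compatibility issues, so try uninstalling torch first.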
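
The two checks suggested above could be sketched as a small script. This is only a rough sketch, not an official diagnostic. The helper names (`suspect_packages`, `installed_packages`, `run_paddle_check`) are made up for illustration, and the `torch` / `nvidia-` name prefixes are an assumption about which installed packages might ship their own CUDA or cuDNN libraries. The tiny `conv2d` call is an extra step added here because the traceback fails inside `conv2d`; it is not something the reply asked for.

```python
import importlib.metadata as md

# Assumption (not from the thread): package names starting with these
# prefixes may bundle their own CUDA/cuDNN libraries.
SUSPECT_PREFIXES = ("torch", "nvidia-")


def suspect_packages(installed):
    """Return sorted (name, version) pairs whose name starts with a suspect prefix."""
    return sorted(
        (name, version)
        for name, version in installed
        if name and name.lower().startswith(SUSPECT_PREFIXES)
    )


def installed_packages():
    """List (name, version) for every distribution visible to this interpreter."""
    return [(dist.metadata["Name"], dist.version) for dist in md.distributions()]


def run_paddle_check():
    """Run Paddle's built-in self-check, then one small conv2d. Returns True on success."""
    try:
        import paddle
    except ImportError:
        print("paddle is not importable in this environment")
        return False
    try:
        paddle.utils.run_check()
        # Extra step (assumption): exercise conv2d, where the traceback failed.
        x = paddle.randn([1, 3, 8, 8])
        w = paddle.randn([4, 3, 3, 3])
        paddle.nn.functional.conv2d(x, w)
    except Exception as exc:
        print(f"Paddle check failed: {exc}")
        return False
    print("Paddle check passed")
    return True


if __name__ == "__main__":
    for name, version in suspect_packages(installed_packages()):
        print(f"possible CUDA-bundling package: {name} {version}")
    run_paddle_check()
```

If the check fails while torch packages are listed, running `pip uninstall torch torchaudio torchvision` and then repeating the check would show whether the torch installation is involved.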