The API call failed because the CUDA driver and runtime could not be initialized

wenxuezhang commented 1 year ago

System Environment: CentOS Linux release 7.6.1810 (Core)
Version: Paddle: paddlepaddle-gpu 2.3.1; PaddleOCR: paddleocr 2.5.0.3
Command Code:
ppocr DEBUG: Namespace(alpha=1.0, benchmark=False, beta=1.0, cls_batch_num=6, cls_image_shape='3, 48, 192', cls_model_dir='/opt/zhangwenxue/VIMER/StrucTexT/ocr/model/ch_ppocr_mobile_v2.0_cls_infer', cls_thresh=0.9, cpu_threads=10, crop_res_save_dir='./output', det=True, det_algorithm='DB', det_db_box_thresh=0.6, det_db_score_mode='fast', det_db_thresh=0.3, det_db_unclip_ratio=1.5, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_east_score_thresh=0.8, det_fce_box_type='poly', det_limit_side_len=960, det_limit_type='max', det_model_dir='/opt/zhangwenxue/VIMER/StrucTexT/ocr/model/ch_PP-OCRv3_det_infer', det_pse_box_thresh=0.85, det_pse_box_type='quad', det_pse_min_area=16, det_pse_scale=1, det_pse_thresh=0, det_sast_nms_thresh=0.2, det_sast_polygon=False, det_sast_score_thresh=0.5, draw_img_save_dir='./inference_results', drop_score=0.5, e2e_algorithm='PGNet', e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_limit_side_len=768, e2e_limit_type='max', e2e_model_dir=None, e2e_pgnet_mode='fast', e2e_pgnet_score_thresh=0.5, e2e_pgnet_valid_set='totaltext', enable_mkldnn=False, fourier_degree=5, gpu_mem=500, help='==SUPPRESS==', image_dir=None, ir_optim=True, label_list=['0', '180'], lang='ch', layout=True, layout_label_map=None, layout_path_model='lp://PubLayNet/ppyolov2_r50vd_dcn_365e_publaynet/config', max_batch_size=10, max_text_length=25, min_subgraph_size=15, mode='structure', ocr=True, ocr_version='PP-OCRv3', output='./output', precision='fp32', process_id=0, rec=True, rec_algorithm='SVTR_LCNet', rec_batch_num=6, rec_char_dict_path='/opt/zhangwenxue/VIMER/StrucTexT/ocr/config/char_conf/ppocr_keys_v1.txt', rec_char_type='ch', rec_image_shape='3, 48, 320', rec_model_dir='/opt/zhangwenxue/VIMER/StrucTexT/ocr/model/ch_PP-OCRv3_rec_infer', save_crop_res=False, save_log_path='./log_output/', scales=[8, 16, 32], show_log=True, structure_version='PP-STRUCTURE', table=True, table_char_dict_path=None, table_max_len=488, table_model_dir=None, total_process_num=1, type='ocr', use_angle_cls=True, use_dilation=False, use_gpu=True, use_mp=False, use_onnx=False, use_pdserving=False, use_space_char=True, use_tensorrt=False, version='PP-OCRv3', vis_font_path='/opt/zhangwenxue/VIMER/StrucTexT/ocr/config/fonts/sim_fang.ttf', warmup=False)
ocr.ocr(file_name, cls=True)
Complete Error Message: (External) CUDA error(3), initialization error. [Hint: 'cudaErrorInitializationError'. The API call failed because the CUDA driver and runtime could not be initialized. ] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:172)

LDOUBLEV commented 1 year ago

Check whether Paddle is installed correctly:

import paddle
paddle.utils.run_check()

# Running verify PaddlePaddle program ...
# W1010 07:21:14.972093  8321 device_context.cc:338] Please NOTE: device: 0, CUDA Capability: 70, Driver API Version: 11.0, Runtime API Version: 10.1
# W1010 07:21:14.979770  8321 device_context.cc:346] device: 0, cuDNN Version: 7.6.
# PaddlePaddle works well on 1 GPU.
# PaddlePaddle works well on 8 GPUs.
# PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.
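
Alongside run_check(), it can help to print a few facts about the installation before digging further. The following is only a minimal sketch: the helper name gpu_environment_report is made up for illustration, and it assumes paddle.is_compiled_with_cuda() and paddle.device.cuda.device_count() are available in the installed Paddle 2.x release.

```python
import os


def gpu_environment_report():
    """Collect basic facts about the Paddle/GPU setup (hypothetical helper)."""
    report = {
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "paddle_installed": False,
    }
    try:
        import paddle
    except ImportError:
        # Paddle is not importable in this interpreter; report that and stop.
        return report

    report["paddle_installed"] = True
    report["paddle_version"] = paddle.__version__
    report["compiled_with_cuda"] = paddle.is_compiled_with_cuda()
    if report["compiled_with_cuda"]:
        # Number of GPUs visible to this process; this call touches the CUDA runtime.
        report["gpu_count"] = paddle.device.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in gpu_environment_report().items():
        print(f"{key}: {value}")
```

If compiled_with_cuda is False, a CPU-only wheel is probably installed and use_gpu=True cannot work regardless of the driver.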

wenxuezhang commented 1 year ago

Baidu has been too slow to resolve this issue; everyone is stuck here and cannot use it.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

nilyang commented 4 months ago

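I hit this error when running PaddleOCR layout analysis inside a Celery asynchronous task queue. Running the same code standalone does not produce the error. It turned out to be caused by the arguments passed to the Celery command:

Server configuration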

Python 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import paddle
WARNING: OMP_NUM_THREADS set to 12, not 1. The computation speed will not be optimized if you use data parallel. It will fail if this PaddlePaddle binary is compiled with OpenBlas since OpenBlas does not support multi-threads.
PLEASE USE OMP_NUM_THREADS WISELY.
>>> paddle.utils.run_check()
Running verify PaddlePaddle program ... 
I0527 16:23:39.928263 743940 program_interpreter.cc:212] New Executor is Running.
W0527 16:23:39.928719 743940 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 11.8
W0527 16:23:39.954921 743940 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.
I0527 16:23:40.123919 743940 interpreter_util.cc:624] Standalone Executor is Used.
PaddlePaddle works well on 1 GPU.
PaddlePaddle is installed successfully! Let's start deep learning with PaddlePaddle now.

Case where the error occurs

OMP_NUM_THREADS=1 && celery -A celery_tasks.main worker -l INFO -c 4

In this case the error is raised when the table_engine = PPStructure(...) object is created:

Traceback (most recent call last):
  File "/root/miniconda3/envs/paddle/lib/python3.10/site-packages/celery/app/trace.py", line 453, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/root/miniconda3/envs/paddle/lib/python3.10/site-packages/celery/app/trace.py", line 736, in __protected_call__
    return self.run(*args, **kwargs)
  File "/root/autodl-tmp/projects/pdfocr/celery_tasks/ocr/tasks.py", line 117, in step3_ocr_charts
    pp_layout.save_pdf_charts(pdf_info=pdf_info)
  File "/root/autodl-tmp/projects/pdfocr/celery_tasks/../paddle_ocr/pp_layout.py", line 48, in save_pdf_charts
    table_engine = PPStructure(show_log=pp_show_log,
  File "/root/autodl-tmp/projects/pdfocr/paddle_ocr/paddleocr.py", line 778, in __init__
    super().__init__(params)
  File "/root/autodl-tmp/projects/pdfocr/celery_tasks/../paddle_ocr/ppstructure/predict_system.py", line 67, in __init__
    self.layout_predictor = LayoutPredictor(args)
  File "/root/autodl-tmp/projects/pdfocr/celery_tasks/../paddle_ocr/ppstructure/layout/predict_layout.py", line 68, in __init__
    utility.create_predictor(args, 'layout', logger)
  File "/root/autodl-tmp/projects/pdfocr/paddle_ocr/tools/infer/utility.py", line 293, in create_predictor
    predictor = inference.create_predictor(config)
OSError: (External) CUDA error(3), initialization error. 
  [Hint: 'cudaErrorInitializationError'. The API call failed because the CUDA driver and runtime could not be initialized. ] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:256)
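
The traceback shows the failure happening inside Celery's trace_task, but not which process the task ran in. To compare a standalone run against a Celery run, one option is to log some process details just before PPStructure(...) is created. This is a sketch using only the standard library; the function name process_context is hypothetical.

```python
import os


def process_context():
    """Describe the process that is about to create the predictor."""
    return {
        # Compare these against the PID of the main Celery worker process
        # to see whether the task runs in a child process.
        "pid": os.getpid(),
        "parent_pid": os.getppid(),
        # Environment values as seen by this process, or None if unset.
        "omp_num_threads": os.environ.get("OMP_NUM_THREADS"),
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }


if __name__ == "__main__":
    print(process_context())
```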

How to get it working again

After starting Celery with the gevent pool via the -P gevent argument, everything works normally:

OMP_NUM_THREADS=1 && celery -A celery_tasks.main worker -l INFO -c 4 -P gevent
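
The thread does not explain why -P gevent helps. One plausible explanation, stated here as an assumption: Celery's default prefork pool forks child processes, and a CUDA runtime that was already initialized in the parent cannot be used in a forked child, whereas the gevent pool runs tasks inside a single process. If the prefork pool has to be kept, a common pattern is to create the GPU-backed object lazily, once per process. The sketch below illustrates that pattern; get_engine and its factory argument are hypothetical, and in the real task the factory would wrap the PPStructure(...) call.

```python
import os

_engine = None
_engine_pid = None


def get_engine(factory):
    """Return the engine owned by the current process, creating it on first use.

    `factory` is any zero-argument callable that builds the GPU-backed object.
    """
    global _engine, _engine_pid
    pid = os.getpid()
    # Build the object if nothing has been created yet, or if the cached
    # object was inherited from a parent process through fork.
    if _engine is None or _engine_pid != pid:
        _engine = factory()
        _engine_pid = pid
    return _engine
```

This only helps if nothing in the parent process touches CUDA before the fork, so imports or module-level code that initialize the GPU would still need to be moved into the task.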

PaddlePaddle / PaddleOCR

The API call failed because the CUDA driver and runtime could not be initialized #9155