opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://opendatalab.com/OpenSourceTools
GNU Affero General Public License v3.0
11.72k stars 881 forks

`Segmentation fault` is detected by the operating system. #658

Open Justin18Chan opened 6 days ago

Justin18Chan commented 6 days ago

Description of the bug

On Ubuntu 22.04, after installing paddle-gpu (python -m pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118/), running with GPU acceleration fails with an error. The log is as follows:

python magic_pdf_parse_main.py
2024-09-24 16:16:37.395 INFO magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 36888, cid_chars_radio: 0.0
2024-09-24 16:16:46.593 INFO magic_pdf.model.pdf_extract_kit:init:180 - DocAnalysis init, this may take some times. apply_layout: True, apply_formula: True, apply_ocr: False, apply_table: True
2024-09-24 16:16:46.594 INFO magic_pdf.model.pdf_extract_kit:init:188 - using device: cuda:1
2024-09-24 16:16:46.594 INFO magic_pdf.model.pdf_extract_kit:init:190 - using models_dir: /data2/ModelBank/PDF-Extract-Kit/models
CustomVisionEncoderDecoderModel init
CustomMBartForCausalLM init
CustomMBartDecoder init
[09/24 16:17:04 detectron2]: Rank of current process: 0. World size: 1
[09/24 16:17:05 detectron2]: Environment info:
sys.platform linux
Python 3.10.15 packaged by conda-forge (main, Sep 20 2024, 16:37:05) [GCC 13.3.0]
numpy 1.26.4
detectron2 0.6 @/data2/conda/MinerU/lib/python3.10/site-packages/detectron2
Compiler GCC 11.4
CUDA compiler not available
DETECTRON2_ENV_MODULE
PyTorch 2.3.1+cu121 @/data2/conda/MinerU/lib/python3.10/site-packages/torch
PyTorch debug build False
torch._C._GLIBCXX_USE_CXX11_ABI False
GPU available Yes
GPU 0,1,2 NVIDIA GeForce RTX 4090 (arch=8.9)
Driver version 535.183.01
CUDA_HOME /usr/local/cuda-11.8
Pillow 10.4.0
torchvision 0.18.1+cu121 @/data2/conda/MinerU/lib/python3.10/site-packages/torchvision
torchvision arch flags 5.0, 6.0, 7.0, 7.5, 8.0, 8.6, 9.0
fvcore 0.1.5.post20221221
iopath 0.1.9
cv2 4.6.0

PyTorch built with:

[09/24 16:17:05 detectron2]: Command line arguments: {'config_file': '/data2/conda/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml', 'resume': False, 'eval_only': False, 'num_gpus': 1, 'num_machines': 1, 'machine_rank': 0, 'dist_url': 'tcp://127.0.0.1:57823', 'opts': ['MODEL.WEIGHTS', '/data2/ModelBank/PDF-Extract-Kit/models/Layout/model_final.pth']}
[09/24 16:17:05 detectron2]: Contents of args.config_file=/data2/conda/MinerU/lib/python3.10/site-packages/magic_pdf/resources/model_config/layoutlmv3/layoutlmv3_base_inference.yaml:
AUG:
  DETR: true
CACHE_DIR: ~/cache/huggingface
CUDNN_BENCHMARK: false
DATALOADER:
  ASPECT_RATIO_GROUPING: true
  FILTER_EMPTY_ANNOTATIONS: false
  NUM_WORKERS: 4
  REPEAT_THRESHOLD: 0.0
  SAMPLER_TRAIN: TrainingSampler
DATASETS:
  PRECOMPUTED_PROPOSAL_TOPK_TEST: 1000
  PRECOMPUTED_PROPOSAL_TOPK_TRAIN: 2000
  PROPOSAL_FILES_TEST: []
  PROPOSAL_FILES_TRAIN: []
  TEST:

[09/24 16:17:08 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /data2/ModelBank/PDF-Extract-Kit/models/Layout/model_final.pth ...
[09/24 16:17:08 fvcore.common.checkpoint]: [Checkpointer] Loading from /data2/ModelBank/PDF-Extract-Kit/models/Layout/model_final.pth ...
2024-09-24 16:17:10.585 | INFO | magic_pdf.model.pdf_extract_kit:init:248 - DocAnalysis init done!
2024-09-24 16:17:10.586 | INFO | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:98 - model init cost: 33.188966035842896


C++ Traceback (most recent call last):

0 at::_ops::conv2d::call(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, c10::SymInt)
1 at::native::conv2d_symint(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, c10::SymInt)
2 at::_ops::convolution::call(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, c10::SymInt)
3 at::_ops::convolution::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, c10::SymInt)
4 at::native::convolution(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long)
5 at::_ops::_convolution::call(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, c10::SymInt, bool, bool, bool, bool)
6 at::native::_convolution(at::Tensor const&, at::Tensor const&, std::optional const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, bool, c10::ArrayRef, long, bool, bool, bool, bool)
7 at::_ops::cudnn_convolution::call(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, c10::SymInt, bool, bool, bool)
8 at::native::cudnn_convolution(at::Tensor const&, at::Tensor const&, c10::ArrayRef, c10::ArrayRef, c10::ArrayRef, long, bool, bool, bool)
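The frames above are native PyTorch (ATen) calls ending in the cuDNN convolution, so they show where the process died but not which Python call triggered it. A minimal sketch, assuming the entry script (magic_pdf_parse_main.py in this report) can be edited, is to turn on Python's standard `faulthandler` module before the models are loaded. The helper name `enable_crash_traceback` is hypothetical, and whether another library's own SIGSEGV handler runs first in this environment has not been verified here.

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask CPython to dump the Python stack of every thread on a fatal signal.

    `stream` must be a real file object (it needs a file descriptor);
    it defaults to standard error.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling `enable_crash_traceback()` at the top of the script is equivalent to starting the interpreter with `python -X faulthandler magic_pdf_parse_main.py`, which needs no code change.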


Error Message Summary:

FatalError: Segmentation fault is detected by the operating system.
[TimeInfo: Aborted at 1727165830 (unix time) try "date -d @1727165830" if you are using GNU date ]
[SignalInfo: SIGSEGV (@0x20000002ef4) received by PID 4020567 (TID 0x76d3250ce740) from PID 12020 ]

Segmentation fault
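The TimeInfo line gives the abort time as a Unix timestamp. The sketch below converts it so it can be lined up with the log lines above. The UTC+8 offset is an assumption (the report does not state the machine's time zone), and the function name `crash_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def crash_time(unix_seconds, utc_offset_hours=0):
    """Render a Unix timestamp as ISO 8601 in a fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(unix_seconds, tz).isoformat()


print(crash_time(1727165830))     # 2024-09-24T08:17:10+00:00
print(crash_time(1727165830, 8))  # 2024-09-24T16:17:10+08:00
```

Under the UTC+8 assumption the abort falls in the same second as the "DocAnalysis init done!" line (16:17:10.585), which suggests the crash happened on the first inference step right after model initialization rather than during loading.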

How to reproduce the bug

{
  "bucket_info": {
    "bucket-name-1": ["ak", "sk", "endpoint"],
    "bucket-name-2": ["ak", "sk", "endpoint"]
  },
  "models-dir": "/data2/ModelBank/PDF-Extract-Kit/models",
  "device-mode": "cuda:1",
  "table-config": {
    "model": "TableMaster",
    "is_table_recog_enable": true,
    "max_time": 400
  }
}
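The config selects `cuda:1`, the second of the three GPUs listed in the environment info. One way to narrow the problem down is to rerun with a different `device-mode` (for example `cuda:0` or `cpu`) and see whether the crash follows the device. The sketch below only rewrites that one key in the JSON text. Both function names are hypothetical helpers, not part of magic-pdf.

```python
import json


def device_index(device_mode):
    """Split a device string such as "cuda:1" into ("cuda", 1); "cpu" gives ("cpu", None)."""
    kind, _, idx = device_mode.partition(":")
    return kind, (int(idx) if idx else None)


def with_device_mode(config_text, device_mode):
    """Return a copy of the JSON config text with "device-mode" replaced."""
    cfg = json.loads(config_text)
    cfg["device-mode"] = device_mode
    return json.dumps(cfg, indent=2)
```

For instance, `with_device_mode(text, "cpu")` yields a config for a CPU-only comparison run while leaving `models-dir` and `table-config` untouched.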

Package Version


absl-py 2.1.0
aiohappyeyeballs 2.4.0
aiohttp 3.10.5
aiosignal 1.3.1
albucore 0.0.17
albumentations 1.4.16
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.6.0
astor 0.8.1
async-timeout 4.0.3
attrdict 2.0.1
attrs 24.2.0
babel 2.16.0
bce-python-sdk 0.9.22
beautifulsoup4 4.12.3
black 24.8.0
blinker 1.8.2
boto3 1.35.25
botocore 1.35.25
braceexpand 0.1.7
Brotli 1.1.0
cachetools 5.5.0
certifi 2024.8.30
cffi 1.17.1
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
colorlog 6.8.2
contourpy 1.3.0
cryptography 43.0.1
cssselect 1.2.0
cssutils 2.11.1
cycler 0.12.1
Cython 3.0.11
datasets 3.0.0
decorator 5.1.1
detectron2 0.6
dill 0.3.8
et-xmlfile 1.1.0
eva-decord 0.6.1
eval_type_backport 0.2.0
evaluate 0.4.3
exceptiongroup 1.2.2
fairscale 0.4.13
fast-langdetect 0.2.0
fasttext-wheel 0.9.2
filelock 3.16.1
fire 0.6.0
Flask 3.0.3
flask-babel 4.0.0
fonttools 4.54.0
frozenlist 1.4.1
fsspec 2024.6.1
ftfy 6.2.3
future 1.0.0
fvcore 0.1.5.post20221221
grpcio 1.66.1
h11 0.14.0
httpcore 1.0.5
httpx 0.27.2
huggingface-hub 0.25.1
hydra-core 1.3.2
idna 3.10
imageio 2.35.1
imgaug 0.4.0
iopath 0.1.9
itsdangerous 2.2.0
Jinja2 3.1.4
jmespath 1.0.1
joblib 1.4.2
kiwisolver 1.4.7
langdetect 1.0.9
lazy_loader 0.4
lmdb 1.5.1
loguru 0.7.2
lxml 5.3.0
magic-pdf 0.8.1
Markdown 3.7
MarkupSafe 2.1.5
matplotlib 3.9.2
more-itertools 10.5.0
mpmath 1.3.0
multidict 6.1.0
multiprocess 0.70.16
mypy-extensions 1.0.0
networkx 3.3
numpy 1.26.4
nvidia-cublas-cu11 11.11.3.6
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu11 11.8.87
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu11 11.8.89
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu11 11.8.89
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu11 8.7.0.84
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu11 10.9.0.58
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu11 10.3.0.86
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu11 11.4.1.48
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu11 11.7.5.86
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu11 2.19.3
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.6.68
nvidia-nvtx-cu11 11.8.86
nvidia-nvtx-cu12 12.1.105
omegaconf 2.3.0
opencv-contrib-python 4.6.0.66
opencv-python 4.6.0.66
opencv-python-headless 4.10.0.84
openpyxl 3.1.5
opt-einsum 3.3.0
packaging 24.1
paddleocr 2.7.3
paddlepaddle 3.0.0b1
paddlepaddle-gpu 3.0.0b1
pandas 2.2.3
pathspec 0.12.1
pdf2docx 0.5.8
pdfminer.six 20231228
pillow 10.4.0
pip 24.2
platformdirs 4.3.6
portalocker 2.10.1
premailer 3.10.0
protobuf 5.28.2
psutil 6.0.0
py-cpuinfo 9.0.0
pyarrow 17.0.0
pybind11 2.13.6
pyclipper 1.3.0.post5
pycocotools 2.0.8
pycparser 2.22
pycryptodome 3.20.0
pydantic 2.7.4
pydantic_core 2.18.4
PyMuPDF 1.24.10
PyMuPDFb 1.24.10
pypandoc 1.13
pyparsing 3.1.4
python-dateutil 2.9.0.post0
python-docx 1.1.2
pytz 2024.2
PyYAML 6.0.2
RapidFuzz 3.10.0
rarfile 4.2
regex 2024.9.11
requests 2.32.3
robust-downloader 0.0.2
s3transfer 0.10.2
safetensors 0.4.5
scikit-image 0.24.0
scikit-learn 1.5.2
scipy 1.14.1
seaborn 0.13.2
setuptools 74.1.2
shapely 2.0.6
six 1.16.0
sniffio 1.3.1
soupsieve 2.6
struct-eqtable 0.1.0
sympy 1.13.3
tabulate 0.9.0
tensorboard 2.17.1
tensorboard-data-server 0.7.2
termcolor 2.4.0
threadpoolctl 3.5.0
tifffile 2024.9.20
timm 0.9.16
tokenizers 0.19.1
tomli 2.0.1
torch 2.3.1
torchtext 0.18.0
torchvision 0.18.1
tqdm 4.66.5
transformers 4.40.0
triton 2.3.1
typing_extensions 4.12.2
tzdata 2024.2
ultralytics 8.2.100
ultralytics-thop 2.0.8
unimernet 0.1.6
urllib3 2.2.3
visualdl 2.5.3
Wand 0.6.13
wcwidth 0.2.13
webdataset 0.2.100
Werkzeug 3.0.4
wheel 0.44.0
wordninja 2.0.0
xxhash 3.5.0
yacs 0.1.8
yarl 1.12.1
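The package list contains both `paddlepaddle` and `paddlepaddle-gpu`, and most NVIDIA runtime wheels appear twice: a cu11 build (matching the cu118 Paddle index) and a cu12 build (matching torch 2.3.1+cu121), including two cuDNN versions. The thread does not establish that this coexistence causes the segfault. The sketch below only surfaces such duplicates so they are easy to spot; `mixed_cuda_families` and `installed_packages` are hypothetical helper names.

```python
import re
from collections import defaultdict
from importlib import metadata

# Matches wheel names such as "nvidia-cudnn-cu11" or "nvidia-cuda-runtime-cu12".
_CUDA_SUFFIX = re.compile(r"^(nvidia-[a-z0-9-]+)-cu(\d+)$")


def mixed_cuda_families(packages):
    """Return NVIDIA libraries that are installed for more than one CUDA major.

    `packages` maps distribution name -> version string.
    """
    seen = defaultdict(dict)
    for name, version in packages.items():
        match = _CUDA_SUFFIX.match(name.lower())
        if match:
            seen[match.group(1)][f"cu{match.group(2)}"] = version
    return {lib: dict(sorted(v.items())) for lib, v in seen.items() if len(v) > 1}


def installed_packages():
    """Collect name -> version for the current interpreter's environment."""
    return {
        dist.metadata["Name"]: dist.version
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    }


# A few entries copied from the list above.
sample = {
    "nvidia-cudnn-cu11": "8.7.0.84",
    "nvidia-cudnn-cu12": "8.9.2.26",
    "nvidia-nvjitlink-cu12": "12.6.68",
    "torch": "2.3.1",
}
print(mixed_cuda_families(sample))
# {'nvidia-cudnn': {'cu11': '8.7.0.84', 'cu12': '8.9.2.26'}}
```

Running `mixed_cuda_families(installed_packages())` inside the affected conda environment would report every library present in both builds.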

Operating system

Linux

Python version

3.10

Software version (magic-pdf --version)

0.8.x

Device mode

cuda

Justin18Chan commented 6 days ago

10.1016_j.envint.2011.03.001.pdf

randydl commented 6 days ago

I ran into the same error. Changing the installation order resolves it, as follows:

ENV_NAME=data && \
conda activate base && \
conda remove -y -n $ENV_NAME --all && \
conda create -y -n $ENV_NAME python=3.10 && \
conda activate $ENV_NAME && \
pip install "/nas_data/userdata/randy/tools/paddlepaddle_gpu-3.0.0b1-cp310-cp310-linux_x86_64.whl" && \
pip install -U "magic-pdf[full]" "/nas_data/userdata/randy/tools/detectron2-0.6-cp310-cp310-linux_x86_64.whl"

Install paddlepaddle-gpu first, then install magic-pdf.

myhloli commented 6 days ago

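With that install order, Paddle's GPU acceleration cannot be used. It seems to behave the same as not having paddlepaddle-gpu installed at all.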

randydl commented 5 days ago

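Indeed. I tried it, and the result is the same as running on CPU. I have no other workaround; this bug cannot be solved for now, and I cannot find the cause.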
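The last two comments report that, with the reordered install, Paddle behaves as if it were running on CPU. A minimal sketch for checking this directly is shown below: it asks the installed Paddle build whether it was compiled with CUDA and which device it currently selects. The function name `paddle_gpu_status` is hypothetical, and it returns a placeholder result when Paddle is not installed.

```python
import importlib.util


def paddle_gpu_status():
    """Report whether Paddle is installed, CUDA-enabled, and which device it uses."""
    if importlib.util.find_spec("paddle") is None:
        return {"installed": False, "compiled_with_cuda": None, "device": None}

    import paddle  # imported lazily so the check also works without Paddle

    return {
        "installed": True,
        "compiled_with_cuda": paddle.device.is_compiled_with_cuda(),
        "device": paddle.device.get_device(),
    }


if __name__ == "__main__":
    print(paddle_gpu_status())
```

If `compiled_with_cuda` comes back `False`, the CPU-only `paddlepaddle` wheel is probably the one being imported, which would be consistent with the CPU-like behavior described above. Paddle also ships `paddle.utils.run_check()` as a built-in installation self-test.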