opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://mineru.readthedocs.io/
GNU Affero General Public License v3.0
16.61k stars, 1.2k forks

A service-based solution for parallel processing on multiple GPUs (based on LitServe) #667

Closed randydl closed 1 week ago

randydl commented 1 month ago

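Supports jpg, png, and pdf paths as input. For batch processing, you only need to call the client's do_parse function from multiple threads; the server automatically processes the requests in parallel across multiple GPUs.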

pip install -U litserve python-multipart filetype
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1
pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
pip install paddlepaddle-gpu==3.0.0b1 -i https://www.paddlepaddle.org.cn/packages/stable/cu118

server.py

import torch
import filetype
import json, uuid
import litserve as ls
from unittest.mock import patch
from fastapi import HTTPException
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton

class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    @staticmethod
    def clean_memory(device):
        import gc
        if torch.cuda.is_available():
            with torch.cuda.device(device):
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
        gc.collect()

    def setup(self, device):
        with patch('magic_pdf.model.doc_analyze_by_custom_model.get_device') as mock_obj:
            mock_obj.return_value = device
            model_manager = ModelSingleton()
            model_manager.get_model(True, False)
            model_manager.get_model(False, False)
            mock_obj.assert_called()
            print(f'Model initialization complete!')

    def decode_request(self, request):
        file = request['file'].file.read()
        kwargs = json.loads(request['kwargs'])
        assert filetype.guess_mime(file) == 'application/pdf'
        return file, kwargs

    def predict(self, inputs):
        try:
            pdf_name = str(uuid.uuid4())
            do_parse(self.output_dir, pdf_name, inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=f'{e}')
        finally:
            self.clean_memory(self.device)

    def encode_response(self, response):
        return {'output_dir': response}

if __name__ == '__main__':
    server = ls.LitServer(MinerUAPI(), accelerator='gpu', devices=[0, 1], timeout=False)
    server.run(port=8000)

client.py

import json
import pymupdf
import requests
import numpy as np
from loguru import logger
from joblib import Parallel, delayed

def to_pdf(file_path):
    with pymupdf.open(file_path) as f:
        if f.is_pdf:
            pdf_bytes = f.tobytes()
        else:
            pdf_bytes = f.convert_to_pdf()
        return pdf_bytes

def do_parse(file_path, url='http://127.0.0.1:8000/predict', **kwargs):
    try:
        kwargs.setdefault('parse_method', 'auto')
        kwargs.setdefault('debug_able', False)

        response = requests.post(url,
            data={'kwargs': json.dumps(kwargs)},
            files={'file': to_pdf(file_path)}
        )

        if response.status_code == 200:
            output = response.json()
            output['file_path'] = file_path
            return output
        else:
            raise Exception(response.text)
    except Exception as e:
        logger.error(f'File: {file_path} - Info: {e}')

if __name__ == '__main__':
    files = ['/tmp/small_ocr.pdf']
    n_jobs = np.clip(len(files), 1, 4)
    results = Parallel(n_jobs, prefer='threads', verbose=10)(
        delayed(do_parse)(p) for p in files
    )
    print(results)

BlackMoki-bot commented 1 month ago

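Hello. When I run the code, the server keeps reporting Exception: Parsing error: 'Layoutlmv3_Predictor' object has no attribute 'parameters', and the client keeps reporting requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http://127.0.0.1:8000/predict, although http://127.0.0.1 itself is reachable. What could be causing this? Any help would be greatly appreciated!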

randydl commented 1 month ago

In reply to BlackMoki-bot's error report: that looks like a problem in your processing code, not in the service.

flow3rdown commented 1 month ago

After switching to this code, table recognition became extremely slow. What could be the reason?

randydl commented 1 month ago

In reply to flow3rdown: is it also slow if you skip the service and use the magic-pdf CLI instead?

flow3rdown commented 1 month ago

In reply to randydl: with the magic-pdf CLI the speed is normal. The table recognition model in use is TableMaster.

PoisonousBromineChan commented 1 month ago

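I didn't really understand how to use the code, so out of habit I started server.py first, changed the file path in client.py to my own, and then ran it. The resulting error is related to small_ocr.pdf, even though none of the files I want to process is small_ocr.pdf, and I don't know how to fix it. Is there a simpler way, for example editing magic-pdf.json directly and changing the device field to multiple CUDA devices?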

randydl commented 1 month ago

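You probably edited the code incorrectly; it runs fine on my side. If the file path had been changed, small_ocr.pdf could not still show up. It is only an example file. @PoisonousBromineChan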

flow3rdown commented 1 month ago

In reply to randydl: is the table recognition speed normal when you run it on your side?

ywh-my commented 4 weeks ago

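Thanks, it works now. I additionally installed one library (pip install python-multipart), started the server program, and the request succeeded. Also, if you want only the .md file as output, to save storage space and time, you can do the following: from magic_pdf.libs.MakeContentConfig import MakeMode # add this line

Modify the do_parse call: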

        do_parse(
            self.output_dir,
            pdf_name,
            inputs[0],
            [],
            **inputs[1],
            f_draw_span_bbox=False,
            f_draw_layout_bbox=False,
            f_dump_md=True,
            f_dump_middle_json=False,
            f_dump_model_json=False,
            f_dump_orig_pdf=False,
            f_dump_content_list=False,
            f_make_md_mode=MakeMode.MM_MD,
            f_draw_model_bbox=False,
        )

randydl commented 4 weeks ago

In reply to flow3rdown's question about whether table recognition speed is normal on my side: I haven't verified tables yet. I'll try it when I have time.

234687552 commented 3 weeks ago

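Problem description:

Following server.py and calling MinerU through LitServe, I find that table recognition is extremely slow.

System & environment: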

PRETTY_NAME="Ubuntu 24.04 LTS"

Python 3.10.14

magic-pdf version 0.7.1

paddlepaddle-gpu 3.0.0b1

magic-pdf.json configuration

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": true,
        "max_time": 400
    }
}

Test PDF link:

https://github.com/opendatalab/MinerU/blob/master/demo/demo1.pdf

Using LitServe

Output log:

2024-10-19 21:10:57.105 | INFO | magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 1501, cid_chars_radio: 0.0
2024-10-19 21:10:57.861 | INFO | magic_pdf.model.pdf_extract_kit:__call__:170 - layout detection cost: 0.68
Model initialization complete!
Setup complete for worker 3.

0: 1888x1344 4 embeddings, 92.2ms
Speed: 12.7ms preprocess, 92.2ms inference, 13.2ms postprocess per image at shape (1, 3, 1888, 1344)
2024-10-19 21:10:58.633 | INFO | magic_pdf.model.pdf_extract_kit:__call__:200 - formula nums: 4, mfr time: 0.2
2024-10-19 21:10:58.640 | INFO | magic_pdf.model.pdf_extract_kit:__call__:291 - ------------------table recognition processing begins-----------------
2024-10-19 21:14:13.524 | INFO | magic_pdf.model.pdf_extract_kit:__call__:300 - ------------table recognition processing ends within 194.88404989242554s-----
2024-10-19 21:14:13.525 | INFO | magic_pdf.model.pdf_extract_kit:__call__:317 - table cost: 194.89
2024-10-19 21:14:13.525 | INFO | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:124 - doc analyze cost: 196.3451521396637
2024-10-19 21:14:13.567 | INFO | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 0, last_page_cost_time: 0.0
2024-10-19 21:14:13.663 | INFO | magic_pdf.para.para_split_v2:__detect_list_lines:143 - 发现了列表,列表行数:[(0, 1)], [[0]]
2024-10-19 21:14:13.663 | INFO | magic_pdf.para.para_split_v2:__detect_list_lines:156 - 列表行的第0到第1行是列表
2024-10-19 21:14:13.797 | INFO | magic_pdf.pipe.UNIPipe:pipe_mk_markdown:48 - uni_pipe mk mm_markdown finished
2024-10-19 21:14:13.805 | INFO | magic_pdf.pipe.UNIPipe:pipe_mk_uni_format:43 - uni_pipe mk content list finished
2024-10-19 21:14:13.805 | INFO | magic_pdf.tools.common:do_parse:119 - local output dir is /tmp/91dc2fda-fb5c-431f-bbce-9dcdc8ce3596/auto

Using the command line

/opt/mineru_venv/bin/magic-pdf -p origin.pdf -m auto

Output log:

[10/19 21:41:53 d2.checkpoint.detection_checkpoint]: [DetectionCheckpointer] Loading from /opt/models/Layout/model_final.pth ...
[10/19 21:41:53 fvcore.common.checkpoint]: [Checkpointer] Loading from /opt/models/Layout/model_final.pth ...
2024-10-19 21:41:56.518 | INFO     | magic_pdf.model.pdf_extract_kit:__init__:159 - DocAnalysis init done!
2024-10-19 21:41:56.518 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:custom_model_init:98 - model init cost: 21.35542368888855
2024-10-19 21:41:57.207 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:170 - layout detection cost: 0.61

0: 1888x1344 4 embeddings, 91.9ms
Speed: 9.7ms preprocess, 91.9ms inference, 1.1ms postprocess per image at shape (1, 3, 1888, 1344)
2024-10-19 21:41:57.948 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:200 - formula nums: 4, mfr time: 0.19
2024-10-19 21:41:57.956 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:291 - ------------------table recognition processing begins-----------------
[2024/10/19 21:41:59] ppocr DEBUG: dt_boxes num : 18, elapse : 0.045398712158203125
[2024/10/19 21:41:59] ppocr DEBUG: dt_boxes num : 18, elapse : 0.045398712158203125
[2024/10/19 21:41:59] ppocr DEBUG: rec_res num  : 18, elapse : 0.047318220138549805
[2024/10/19 21:41:59] ppocr DEBUG: rec_res num  : 18, elapse : 0.047318220138549805
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:300 - ------------table recognition processing ends within 1.4687747955322266s-----
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.pdf_extract_kit:__call__:317 - table cost: 1.47
2024-10-19 21:41:59.425 | INFO     | magic_pdf.model.doc_analyze_by_custom_model:doc_analyze:124 - doc analyze cost: 2.828835964202881
2024-10-19 21:41:59.467 | INFO     | magic_pdf.pdf_parse_union_core:pdf_parse_union:221 - page_id: 0, last_page_cost_time: 0.0
2024-10-19 21:42:00.020 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:143 - 发现了列表,列表行数:[(0, 1)], [[0]]
2024-10-19 21:42:00.020 | INFO     | magic_pdf.para.para_split_v2:__detect_list_lines:156 - 列表行的第0到第1行是列表
2024-10-19 21:42:00.154 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_markdown:48 - uni_pipe mk mm_markdown finished
2024-10-19 21:42:00.162 | INFO     | magic_pdf.pipe.UNIPipe:pipe_mk_uni_format:43 - uni_pipe mk content list finished
2024-10-19 21:42:00.162 | INFO     | magic_pdf.tools.common:do_parse:119 - local output dir is output/origin/auto

234687552 commented 3 weeks ago

I wonder whether this line is what makes table recognition so slow:

https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/ppTableModel.py#L46

IMG20241022-111905

myhloli commented 3 weeks ago

In reply to 234687552: that is indeed the cause. The matching rule is hard-coded there, and we will fix it. For now, you can temporarily change it to:

use_gpu = True if device.startswith("cuda") else False
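
The difference between an exact match and the temporary fix can be seen in a small sketch. It assumes the worker receives a device string with an index, such as cuda:1; the two function names are hypothetical and only illustrate the two checks, they are not the code in ppTableModel.py.

```python
def use_gpu_exact(device: str) -> bool:
    # Exact-match rule: only the bare string "cuda" counts as a GPU.
    return device == "cuda"


def use_gpu_prefix(device: str) -> bool:
    # Temporary fix from the comment above: any "cuda..." string counts.
    return device.startswith("cuda")


for dev in ("cuda", "cuda:1", "cpu"):
    print(dev, use_gpu_exact(dev), use_gpu_prefix(dev))
```

With an exact match, an indexed device string is treated as "no GPU", which would be consistent with table recognition silently falling back to a much slower path.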

234687552 commented 3 weeks ago

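Problem description:

Exposing an endpoint based on server.py and load testing it with 15 concurrent requests on 4 GPUs, I find that gpu[0] is always saturated while the other GPUs are relatively idle.

Expected result:

The load is spread evenly across the GPUs.

Command run during the test:

nvidia-smi --loop=1

Output log: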


+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:02 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            228W /  350W |   19876MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   50C    P0            146W /  350W |    9629MiB /  46068MiB |     38%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   45C    P0            154W /  350W |    9629MiB /  46068MiB |     46%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             90W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:04 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            246W /  350W |   20234MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   51C    P0            155W /  350W |    9629MiB /  46068MiB |     43%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   43C    P0            130W /  350W |    9629MiB /  46068MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             93W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Wed Oct 23 19:59:05 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   68C    P0            217W /  350W |   20234MiB /  46068MiB |    100%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   50C    P0            158W /  350W |    9629MiB /  46068MiB |     34%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   43C    P0             88W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   45C    P0             90W /  350W |    9629MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

image-20241023200204846
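
For repeated load tests, reading the full nvidia-smi table by eye gets tedious. The following is a minimal sketch, not part of the thread's code, that reads the same numbers through nvidia-smi's CSV query mode and reports the gap between the busiest and the idlest GPU. The helper names are hypothetical, and the parser assumes every field is a plain integer (a field reported as unavailable would need extra handling).

```python
import csv
import io
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse nvidia-smi CSV rows into (index, util_percent, mem_mib) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        index, util, mem = (int(field.strip()) for field in row)
        rows.append((index, util, mem))
    return rows


def imbalance(rows):
    """Return the gap between the busiest and the idlest GPU utilization."""
    utils = [util for _, util, _ in rows]
    return max(utils) - min(utils)


def sample_once():
    # Requires an NVIDIA driver; not exercised in the example below.
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(out.stdout)


# Values copied from the first snapshot above (19:59:02).
snapshot = "0, 100, 19876\n1, 38, 9629\n2, 46, 9629\n3, 0, 9629\n"
rows = parse_gpu_csv(snapshot)
print(rows, imbalance(rows))
```

Calling sample_once() in a loop during the load test gives one number per sample that can be compared between runs with table recognition on and off.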

randydl commented 3 weeks ago

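@234687552 Do you have table recognition enabled? If so, try disabling it and run the load-balancing test again. That will tell us whether table recognition is the problem.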

randydl commented 3 weeks ago

In reply to ywh-my's suggestion above about outputting only the .md file: a simpler way is to pass those parameters when calling the do_parse function in the client. There is no need to modify the server code.
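
A minimal sketch of that client-side approach, assuming the client.py from this thread, which forwards its keyword arguments to the server as a JSON string. build_form_data is a hypothetical stand-in for the packing step inside the client's do_parse so that the example runs without a server. f_make_md_mode is left out on purpose: its value has to survive json.dumps, and whether MakeMode.MM_MD does depends on how MakeMode is defined in the installed magic-pdf version.

```python
import json

# Output flags taken from the do_parse call quoted earlier in this thread.
MD_ONLY_KWARGS = {
    "f_draw_span_bbox": False,
    "f_draw_layout_bbox": False,
    "f_dump_md": True,
    "f_dump_middle_json": False,
    "f_dump_model_json": False,
    "f_dump_orig_pdf": False,
    "f_dump_content_list": False,
    "f_draw_model_bbox": False,
}


def build_form_data(**kwargs):
    """Mirror how client.py packs keyword arguments into the request form."""
    kwargs.setdefault("parse_method", "auto")
    kwargs.setdefault("debug_able", False)
    return {"kwargs": json.dumps(kwargs)}


form = build_form_data(**MD_ONLY_KWARGS)
print(form["kwargs"])

# With client.py from this thread, the equivalent call would be:
# do_parse('/tmp/small_ocr.pdf', **MD_ONLY_KWARGS)
```

Because the server unpacks the decoded dictionary into its own do_parse call, flags sent this way must not also be hard-coded on the server side, or Python raises a duplicate-keyword error.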

234687552 commented 3 weeks ago

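In reply to randydl's suggestion to disable table recognition:

Situation

Table recognition was previously enabled: "is_table_recog_enable": true,

After disabling it and retesting, gpu[0] no longer stays saturated and the other GPUs run in a relatively balanced way.

Table recognition disabled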

cat ~/magic-pdf.json

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": false,
        "max_time": 400
    }
}

GPU usage

nvidia-smi --loop=1


+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:07:57 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   58C    P0            169W /  350W |   15238MiB /  46068MiB |     47%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   59C    P0            165W /  350W |    9627MiB /  46068MiB |     43%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   54C    P0            154W /  350W |    9627MiB /  46068MiB |     22%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   53C    P0            109W /  350W |    9619MiB /  46068MiB |     15%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:07:58 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   62C    P0            193W /  350W |   15238MiB /  46068MiB |     76%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   60C    P0            175W /  350W |    9627MiB /  46068MiB |     48%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   52C    P0            176W /  350W |    9627MiB /  46068MiB |     56%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   60C    P0            192W /  350W |    9629MiB /  46068MiB |     79%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 10:08:00 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   57C    P0            204W /  350W |   15238MiB /  46068MiB |     42%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   59C    P0            189W /  350W |    9627MiB /  46068MiB |     86%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   51C    P0            114W /  350W |    9627MiB /  46068MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   54C    P0            114W /  350W |    9629MiB /  46068MiB |     19%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

image-20241024101150096

234687552 commented 3 weeks ago

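In our actual use case table recognition has to stay enabled, and I don't know how to make table recognition spread evenly across multiple GPUs on a single machine.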

randydl commented 3 weeks ago

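In reply to 234687552: it looks like my guess was right and this is caused by a table recognition bug. Somewhere in the code the table model is probably still loaded via .cuda, and device strings such as cuda:1 are not recognized correctly. As a result every table model is loaded onto GPU 0, which is why GPU 0 is saturated.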

randydl commented 3 weeks ago

For the TableMaster table recognition model, the bug is here: https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/ppTableModel.py#L55 Changing only use_gpu = True if device == "cuda" else False is not enough; how the use_gpu variable is used needs to be investigated.

For the struct_eqtable table model, the bug is here: https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/pek_sub_modules/structeqtable/StructTableModel.py#L9 This one should be easy to fix: changing it to self.model = StructTable(self.model_path, self.max_new_tokens, self.max_time).to(device) should take effect.

@myhloli @234687552

myhloli commented 3 weeks ago

In reply to randydl: the Paddle framework specifies the GPU differently from the torch framework, and at present Paddle always uses the first card for acceleration. Our development focus is still on improving parsing quality, so for now we cannot spare anyone to optimize the multi-GPU allocation logic. Developers who are able to solve the multi-GPU allocation problem are welcome to submit a PR.
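
To illustrate the mismatch: torch-style device strings look like cuda:1, while Paddle names GPUs gpu:1 (the form accepted by paddle.set_device). A hypothetical helper that translates between the two naming schemes might look like the sketch below. It is not part of MinerU, and whether passing the translated name through is enough to move the table model onto another card is untested here.

```python
def torch_to_paddle_device(device: str) -> str:
    """Translate a torch-style device string to Paddle's naming scheme."""
    if device == "cpu":
        return "cpu"
    if device == "cuda":
        # A bare "cuda" carries no index; default to the first card.
        return "gpu:0"
    if device.startswith("cuda:"):
        index = int(device.split(":", 1)[1])
        return f"gpu:{index}"
    raise ValueError(f"unsupported device string: {device!r}")


for dev in ("cpu", "cuda", "cuda:1"):
    print(dev, "->", torch_to_paddle_device(dev))
```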

randydl commented 3 weeks ago

server.py

import os
import torch
import filetype
import json, uuid
import litserve as ls
from fastapi import HTTPException
from magic_pdf.tools.common import do_parse
from magic_pdf.model.doc_analyze_by_custom_model import ModelSingleton

class MinerUAPI(ls.LitAPI):
    def __init__(self, output_dir='/tmp'):
        self.output_dir = output_dir

    @staticmethod
    def clean_memory(device):
        import gc
        if torch.cuda.is_available():
            with torch.cuda.device(device):
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()
        gc.collect()

    def setup(self, device):
        device = torch.device(device)
        os.environ['CUDA_VISIBLE_DEVICES'] = str(device.index)
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete!')

    def decode_request(self, request):
        file = request['file'].file.read()
        kwargs = json.loads(request['kwargs'])
        assert filetype.guess_mime(file) == 'application/pdf'
        return file, kwargs

    def predict(self, inputs):
        try:
            pdf_name = str(uuid.uuid4())
            do_parse(self.output_dir, pdf_name, inputs[0], [], **inputs[1])
            return pdf_name
        except Exception as e:
            raise HTTPException(status_code=500, detail=f'{e}')
        finally:
            self.clean_memory(self.device)

    def encode_response(self, response):
        return {'output_dir': response}

if __name__ == '__main__':
    server = ls.LitServer(MinerUAPI(), accelerator='gpu', devices=[0, 1], timeout=False)
    server.run(port=8000)

magic-pdf.json

{
    "bucket_info":{
        "bucket-name-1":["ak", "sk", "endpoint"],
        "bucket-name-2":["ak", "sk", "endpoint"]
    },
    "models-dir":"/opt/models",
    "device-mode":"cuda",
    "table-config": {
        "model": "TableMaster",
        "is_table_recog_enable": true,
        "max_time": 400
    }
}

Try replacing server.py with the new code I posted above, enable table recognition, and run the stress test again; it should work now. @234687552
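
For anyone who wants to reproduce the stress test, here is a rough sketch of a concurrent driver. It assumes a hypothetical run_load_test function and a caller-supplied send callable that uploads one file and returns the server's reply; neither is part of MinerU or LitServe, and the actual HTTP request is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def run_load_test(paths: Iterable[str], send: Callable[[str], str], workers: int = 8) -> dict:
    """Call send(path) concurrently and return {path: result or raised exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {path: pool.submit(send, path) for path in paths}
        for path, future in futures.items():
            try:
                results[path] = future.result()
            except Exception as exc:
                # Record the failure and keep going so one error does not hide the rest.
                results[path] = exc
    return results


if __name__ == "__main__":
    # Placeholder sender; replace it with a function that posts the file to the server.
    def fake_send(path: str) -> str:
        return f"parsed:{path}"

    print(run_load_test(["a.pdf", "b.pdf"], fake_send, workers=2))
```

Watching nvidia-smi while this runs shows whether requests are spread across the cards.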

234687552 commented 3 weeks ago

Status report @randydl

GPU usage is now evenly distributed (see the log and screenshot below), but clean_memory raises an exception.

The change I applied, for reference:

    def setup(self, device):
        device = torch.device(device)
        os.environ['CUDA_VISIBLE_DEVICES'] = str(device.index)
        model_manager = ModelSingleton()
        model_manager.get_model(True, False)
        model_manager.get_model(False, False)
        print(f'Model initialization complete!')

Exception stack trace:

Please check the error trace for more details.
Traceback (most recent call last):
  File "/opt/mineru_venv/lib/python3.10/site-packages/litserve/loops.py", line 134, in run_single_loop
    y = _inject_context(
  File "/opt/mineru_venv/lib/python3.10/site-packages/litserve/loops.py", line 55, in _inject_context
    return func(*args, **kwargs)
  File "/app/app.py", line 144, in predict
    self.clean_memory(self.device)
  File "/app/app.py", line 83, in clean_memory
    with torch.cuda.device(device):
  File "/opt/mineru_venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 365, in __enter__
    self.prev_idx = torch.cuda._exchange_device(self.idx)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

GPU usage

nvidia-smi --loop=1


+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:03 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            135W /  350W |   11611MiB /  46068MiB |     18%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            124W /  350W |   11435MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            112W /  350W |   12227MiB /  46068MiB |     20%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   51C    P0            124W /  350W |   11435MiB /  46068MiB |     26%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:05 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            117W /  350W |   11611MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            130W /  350W |   11435MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            118W /  350W |   12227MiB /  46068MiB |     23%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   51C    P0            132W /  350W |   11435MiB /  46068MiB |     31%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
Thu Oct 24 20:54:06 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.12              Driver Version: 550.90.12      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L20                     On  |   00000000:D3:00.0 Off |                    0 |
| N/A   51C    P0            125W /  350W |   11611MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA L20                     On  |   00000000:D4:00.0 Off |                    0 |
| N/A   54C    P0            138W /  350W |   11435MiB /  46068MiB |     30%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA L20                     On  |   00000000:D6:00.0 Off |                    0 |
| N/A   48C    P0            126W /  350W |   12227MiB /  46068MiB |     27%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA L20                     On  |   00000000:D7:00.0 Off |                    0 |
| N/A   52C    P0            143W /  350W |   11435MiB /  46068MiB |     36%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

image-20241024205743118

randydl commented 3 weeks ago

Thanks, looks like progress! Try deleting the line with torch.cuda.device(device): @234687552
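
A likely explanation for the invalid device ordinal error, sketched without CUDA: once CUDA_VISIBLE_DEVICES is restricted to a single physical GPU, the process sees exactly one device and it is renumbered as ordinal 0, so the original index (for example 1) no longer exists inside that process. The logical_ordinal function below is a hypothetical helper written for illustration, not a torch or CUDA API, and it assumes the variable holds integer indices rather than GPU UUIDs.

```python
def logical_ordinal(physical_index: int, cuda_visible_devices: str) -> int:
    """Return the ordinal a process would use for a physical GPU under CUDA_VISIBLE_DEVICES."""
    visible = [int(item) for item in cuda_visible_devices.split(",") if item.strip()]
    if physical_index not in visible:
        raise ValueError(f"GPU {physical_index} is not visible to this process")
    # Visible devices are renumbered from 0 in the order they are listed.
    return visible.index(physical_index)


# A worker pinned to physical GPU 1 addresses that card as ordinal 0.
print(logical_ordinal(1, "1"))
```

Under that reading, the context manager was asking for ordinal 1 inside a process that only has ordinal 0.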

234687552 commented 3 weeks ago

Thanks for the support. Multi-GPU processing now works correctly.
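
For reference, a sketch of clean_memory with the torch.cuda.device context manager removed, as suggested in this thread. It is written as a standalone function rather than a static method, and the guarded import is an addition for illustration so the snippet also runs where torch is not installed; the device argument is kept only so existing calls such as clean_memory(self.device) still work.

```python
import gc

try:
    import torch
except ImportError:  # illustration only: allow the sketch to run without torch
    torch = None


def clean_memory(device=None):
    """Release cached GPU memory on the current device, then run the garbage collector."""
    # `device` is intentionally unused: each worker only sees its own card,
    # so the current device is already the right one.
    if torch is not None and torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
    gc.collect()


clean_memory("cuda:1")
```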

randydl commented 3 weeks ago

After yesterday's debugging we have basically solved it. I will test it further, and if it holds up I will open a PR.

myhloli commented 3 weeks ago

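@randydl You can submit it to the projects directory on the dev branch. Follow the other projects there and create a directory containing the code files and a README.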

randydl commented 3 weeks ago

OK.

myhloli commented 1 week ago

https://github.com/opendatalab/MinerU/tree/master/projects/multi_gpu

Sakura4036 commented 1 week ago

@myhloli @234687552 Hello, could you please take a look at this pip installation problem @897