opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.
https://opendatalab.com/OpenSourceTools?tool=extract
GNU Affero General Public License v3.0
18.23k stars, 1.31k forks

Batch testing #1035

Closed lyc728 closed 3 days ago

lyc728 commented 3 days ago

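Hello. I've found that when I run inference in a loop, the model is initialized again on every iteration, which makes it hard to build a service on top of this. Are there plans to change this?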

myhloli commented 3 days ago

Looped inference does not reload the model. You can check the logs yourself to confirm.

lyc728 commented 3 days ago

Every file is different. Each time you load a different PDF and run inference, the model has to be initialized, doesn't it?

    model_input = {
        "ocr": ocr,
        "show_log": show_log,
        "models_dir": local_models_dir,
        "device": device,
        "table_config": table_config,
        "layout_config": layout_config,
        "formula_config": formula_config,
        "lang": lang,
    }

    custom_model = CustomPEKModel(**model_input)

https://github.com/opendatalab/MinerU/blob/master/magic_pdf/model/doc_analyze_by_custom_model.py

myhloli commented 3 days ago

You can confirm from the logs whether the model is actually re-initialized.
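
For readers following the thread: the behavior myhloli describes, where repeated calls on different PDFs reuse an already-loaded model, is commonly achieved by caching the model object under a key built from its configuration. The sketch below illustrates that general pattern only. It is not taken from MinerU's source, and `DummyModel`, `ModelCache`, and `analyze` are hypothetical names invented for this example. The log line and the counter show one way to check how many times the constructor really ran.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("model_cache_sketch")


class DummyModel:
    """Hypothetical stand-in for an expensive model (not MinerU's class)."""

    init_count = 0  # number of times a model was actually constructed

    def __init__(self, **config):
        DummyModel.init_count += 1
        self.config = config
        logger.info("model init #%d with %s", DummyModel.init_count, config)

    def __call__(self, pdf_name):
        # Pretend to run inference on one document.
        return f"parsed {pdf_name} (lang={self.config['lang']})"


class ModelCache:
    """Hypothetical cache: one model instance per distinct configuration."""

    def __init__(self):
        self._models = {}

    def get_model(self, **config):
        # Config values must be hashable for this key to work.
        key = tuple(sorted(config.items()))
        if key not in self._models:
            self._models[key] = DummyModel(**config)
        return self._models[key]


cache = ModelCache()


def analyze(pdf_name, ocr=False, lang="en"):
    # Each call asks the cache; construction happens only on a cache miss.
    model = cache.get_model(ocr=ocr, lang=lang)
    return model(pdf_name)


# Three different PDFs, same configuration: the model is built once.
results = [analyze(name) for name in ["a.pdf", "b.pdf", "c.pdf"]]
```

With this pattern, a new model is constructed only when the configuration changes (for example a different `lang`), which is why the initialization log line, rather than the call site, is the reliable place to see whether reloading happens.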