yingliu0518 closed this issue 5 hours ago
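You can run the model download script on a computer that has internet access. When the script finishes, it prints the model cache directory in the log. Copy that model directory and the automatically generated configuration file to the user directory on your intranet machine.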
[11/21 16:24:33 fvcore.common.checkpoint]: [Checkpointer] Loading from d:\code\MinerU-master\MinerU-master\model\Layout/LayoutLMv3/model_final.pth ...
download https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar to C:\Users*****/.paddleocr/whl\det\ch\ch_PP-OCRv4_det_infer\ch_PP-OCRv4_det_infer.tar
2024-11-21 16:25:06.063 | ERROR | main:pdf_parse_main:140 - HTTPSConnectionPool(host='paddleocr.bj.bcebos.com', port=443): Max retries exceeded with url: /PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1007)')))
Why is ch_PP-OCRv4_det_infer still being downloaded here? As far as I remember, this model is already in the models-dir folder.
I manually downloaded ch_PP-OCRv4_det_infer.tar and moved it to C:\Users*****/.paddleocr/whl\det\ch\ch_PP-OCRv4_det_infer\ch_PP-OCRv4_det_infer.tar, but that did not solve the problem either.
The copy in the model repository is the one used by our colleagues who develop the kit. MinerU uses the original Paddle model-loading logic.
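The tar package needs to be extracted so that it matches the layout at https://huggingface.co/spaces/opendatalab/MinerU/tree/main/paddleocr, and the whole paddleocr directory then has to be copied to the .paddleocr directory under your user directory.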
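The extraction step could be scripted roughly as follows. This is only a sketch, not MinerU's or PaddleOCR's own code: the helper name `unpack_model_tar` is made up for illustration, and it assumes the model files should end up directly in the target folder, with no nested directory from the archive. Please compare the result against the Hugging Face listing linked above before relying on it.

```python
import tarfile
from pathlib import Path


def unpack_model_tar(tar_path, target_dir):
    """Extract the regular files of a model tar directly into target_dir.

    Hypothetical helper: any directory prefix inside the archive is dropped,
    so the files land in target_dir itself rather than in a nested folder.
    Returns the sorted list of file names that were written.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links and other special entries
            name = Path(member.name).name  # keep only the file name
            source = archive.extractfile(member)
            (target / name).write_bytes(source.read())
            written.append(name)
    return sorted(written)
```

For the detection model in the log, the target would presumably be `Path.home() / ".paddleocr" / "whl" / "det" / "ch" / "ch_PP-OCRv4_det_infer"`, which is the folder the failed download was trying to write to. That path is inferred from the log, not confirmed by the maintainers.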
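My company network does not allow the script to download the model weights from huggingface or ModelScope, so I downloaded the model weights manually. Which model does this line refer to? "layoutreader-model-dir":"/tmp/layoutreader"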
{
    "bucket_info": {
        "bucket-name-1": ["ak", "sk", "endpoint"],
        "bucket-name-2": ["ak", "sk", "endpoint"]
    },
    "models-dir": "/tmp/models",
    "layoutreader-model-dir": "/tmp/layoutreader",
    "device-mode": "cpu",
    "layout-config": {
        "model": "layoutlmv3"
    },
    "formula-config": {
        "mfd_model": "yolo_v8_mfd",
        "mfr_model": "unimernet_small",
        "enable": true
    },
    "table-config": {
        "model": "rapid_table",
        "enable": false,
        "max_time": 400
    },
    "config_version": "1.0.0"
}

My error message is as follows:
2024-11-21 14:53:43.483 | ERROR | magic_pdf.user_api:parse_pdf:97 - C:\Users**.cache/modelscope/hub/ppaanngggg/layoutreader does not appear to have a file named config.json. Checkout 'https://huggingface.co/C:\Users\l**.cache/modelscope/hub/ppaanngggg/layoutreader/tree/main' for available files.
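Since the error is about a missing config.json in the layoutreader directory, a small pre-flight check might help when copying weights by hand. The sketch below is not part of magic_pdf: `check_offline_config` is a hypothetical helper, and the only things it relies on are the two path keys from the config above and the config.json file name from the error message.

```python
import json
from pathlib import Path


def check_offline_config(config_path):
    """Return a list of problems found in the model paths of the config.

    Hypothetical helper: it checks that "models-dir" and
    "layoutreader-model-dir" are set and exist, and that the layoutreader
    directory contains a config.json file.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    problems = []
    for key in ("models-dir", "layoutreader-model-dir"):
        value = config.get(key)
        if not value:
            problems.append(f"{key} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{key} points to a missing directory: {value}")
    reader_dir = config.get("layoutreader-model-dir")
    if reader_dir and Path(reader_dir).is_dir():
        # The error message names this file explicitly.
        if not (Path(reader_dir) / "config.json").is_file():
            problems.append(f"no config.json in {reader_dir}")
    return problems
```

Calling it with the path of your configuration file returns an empty list when both directories exist and the layoutreader folder has its config.json. Anything else in the list points at a path to fix before running the parser.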