opendatalab / MinerU

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
https://opendatalab.com/OpenSourceTools
GNU Affero General Public License v3.0
11.46k stars 859 forks

Demo reports an error #643

Open tqangxl opened 3 hours ago

tqangxl commented 3 hours ago

Description of the bug

PS C:\Users\James> conda activate MinerU (MinerU) PS C:\Users\James> pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com Looking in indexes: https://mirrors.aliyun.com/pypi/simple, https://wheels.myhloli.com Collecting magic-pdf[full] Using cached https://mirrors.aliyun.com/pypi/packages/00/5b/5157586376edf6bed6dd0f234423f89ecb01ce10adc6cc608e2e1a935280/magic_pdf-0.8.1-py3-none-any.whl (1.1 MB) Collecting boto3>=1.28.43 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/5a/d2/3e0071e8ca4ceec9c9199b5cccec570930f77d0a20aba6c0d352eeffd6c8/boto3-1.35.24-py3-none-any.whl (139 kB) Collecting Brotli>=1.1.0 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f7/65/b785722e941193fd8b571afd9edbec2a9b838ddec4375d8af33a50b8dab9/Brotli-1.1.0-cp310-cp310-win_amd64.whl (357 kB) Collecting click>=8.1.7 (from magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/00/2e/d53fa4befbf2cfa713304affc7ca780ce4fc1fd8710527771b58311a3229/click-8.1.7-py3-none-any.whl (97 kB) Collecting fast-langdetect==0.2.0 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d0/99/9cb2230dbdc5697b7d6cce86eec3397a80a2c877c400059fb49a79c48546/fast_langdetect-0.2.0-py3-none-any.whl (6.4 kB) Collecting loguru>=0.6.0 (from magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/03/0a/4f6fed21aa246c6b49b561ca55facacc2a44b87d65b8b92362a8e99ba202/loguru-0.7.2-py3-none-any.whl (62 kB) Collecting numpy<2.0.0,>=1.21.6 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/19/77/538f202862b9183f54108557bfda67e17603fc560c384559e769321c9d92/numpy-1.26.4-cp310-cp310-win_amd64.whl (15.8 MB) Collecting pdfminer.six==20231228 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/eb/9c/e46fe7502b32d7db6af6e36a9105abb93301fa1ec475b5ddcba8b35ae23a/pdfminer.six-20231228-py3-none-any.whl (5.6 MB) Collecting pydantic<2.8.0,>=2.7.2 (from magic-pdf[full]) 
Downloading https://mirrors.aliyun.com/pypi/packages/17/ba/1b65c9cbc49e0c7cd1be086c63209e9ad883c2a409be4746c21db4263f41/pydantic-2.7.4-py3-none-any.whl (409 kB) Collecting PyMuPDF>=1.24.9 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/1c/46/fd0764b2195b02ccca0ae1617f3086d967dcb6b3dbc9e05b0be262d4e050/PyMuPDF-1.24.10-cp310-none-win_amd64.whl (3.2 MB) Collecting scikit-learn>=1.0.2 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/48/76/154ebda6794faf0b0f3ccb1b5cd9a19f0a63cb9e1f3d2c61b6114002677b/scikit_learn-1.5.2-cp310-cp310-win_amd64.whl (11.0 MB) Collecting wordninja>=2.0.0 (from magic-pdf[full]) Using cached wordninja-2.0.0-py3-none-any.whl Collecting unimernet==0.1.6 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/0c/1d/5847f9237c695efae828fea23b4db8bc51419804f116c9156ab0f557377a/unimernet-0.1.6-py3-none-any.whl (2.2 MB) Collecting ultralytics (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/9d/43/e06614df3a52763c7522097b6f0fd4c035fea201c4616c60c852259cd98c/ultralytics-8.2.98-py3-none-any.whl (873 kB) Collecting paddleocr==2.7.3 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f2/55/0469ebca1d9c581a3fa740621afe96461a0ef450e489e10e278cc17a19ef/paddleocr-2.7.3-py3-none-any.whl (780 kB) Collecting pypandoc (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/fc/09/91ab02feebc195a39ce0a39edcafbe866e69ff700a59790e605b3d5f69b1/pypandoc-1.13-py3-none-any.whl (21 kB) Collecting struct-eqtable==0.1.0 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a4/25/d1e91b2ad2727c9ecb332607729a03c2f0f345afd2547f4100e543330f0e/struct_eqtable-0.1.0-py3-none-any.whl (8.5 kB) Collecting detectron2 (from magic-pdf[full]) Using cached https://wheels-1251341229.cos.ap-shanghai.myqcloud.com/assets/whl/detectron2/detectron2-0.6-cp310-cp310-win_amd64.whl (884 kB) Collecting matplotlib<=3.9.0 (from 
magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/b9/55/6138ad64c789bad13d18e0240da75e73dbd364fdc0aa670fff87a5eef5ab/matplotlib-3.9.0-cp310-cp310-win_amd64.whl (8.0 MB) Collecting paddlepaddle==2.6.1 (from magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/c7/d0/a48695cfc5e452621256631ab87583268b0e3f27a91e49acf84a1ed32212/paddlepaddle-2.6.1-cp310-cp310-win_amd64.whl (81.0 MB) Collecting fasttext-wheel>=0.9.2 (from fast-langdetect==0.2.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/1f/08/9a495ba568887cb5f9022b600806beb39754ff0870d779b4976e039fdb24/fasttext_wheel-0.9.2-cp310-cp310-win_amd64.whl (241 kB) Collecting robust-downloader>=0.0.2 (from fast-langdetect==0.2.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/56/a1/779e9d0ebbdc704411ce30915a1105eb01aeaa9e402d7e446613ff8fb121/robust_downloader-0.0.2-py3-none-any.whl (15 kB) Collecting langdetect>=1.0.9 (from fast-langdetect==0.2.0->magic-pdf[full]) Using cached langdetect-1.0.9-py3-none-any.whl Collecting shapely (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/cd/4c/6f4a6fc085e3be01c4c9de0117a2d373bf9fec5f0426cf4d5c94090a5a4d/shapely-2.0.6-cp310-cp310-win_amd64.whl (1.4 MB) Collecting scikit-image (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/99/89/3fcd68d034db5d29c974e964d03deec9d0fbf9410ff0a0b95efff70947f6/scikit_image-0.24.0-cp310-cp310-win_amd64.whl (12.9 MB) Collecting imgaug (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/66/b1/af3142c4a85cba6da9f4ebb5ff4e21e2616309552caca5e8acefe9840622/imgaug-0.4.0-py2.py3-none-any.whl (948 kB) Collecting pyclipper (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/60/61/354f484ab7969a601327646bbaeb1b799508b4e81946ea4d52bbf9d779c6/pyclipper-1.3.0.post5-cp310-cp310-win_amd64.whl (108 kB) Collecting 
lmdb (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/64/ca/5e8ed72930c410eedd25d801c8abfd6cbf65bac0461128fbdd03358b279e/lmdb-1.5.1-cp310-cp310-win_amd64.whl (100 kB) Collecting tqdm (from paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/48/5d/acf5905c36149bbaec41ccf7f2b68814647347b72075ac0b1fe3022fdc73/tqdm-4.66.5-py3-none-any.whl (78 kB) Collecting visualdl (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/ea/b5/37726c750a4f4598660998327c3566b2d2ed5a1a5f44e9f0dde875602447/visualdl-2.5.3-py3-none-any.whl (6.3 MB) Collecting rapidfuzz (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/0e/1c/778e96d260990e1e2c1efb4a6e0f74f8f019959a80992cf50421b0472b7e/rapidfuzz-3.9.7-cp310-cp310-win_amd64.whl (1.7 MB) Collecting opencv-python<=4.6.0.66 (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/cf/09/b24c266cd61ddeed101b90c92a26f54d060b06f4a1b102eb891576d6e9e2/opencv_python-4.6.0.66-cp36-abi3-win_amd64.whl (35.6 MB) Collecting opencv-contrib-python<=4.6.0.66 (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/63/0b/6ef1acbaa21e5245c85a42f9f0ecfaf2e7420b24615a00f0eee170328e6b/opencv_contrib_python-4.6.0.66-cp36-abi3-win_amd64.whl (42.5 MB) Collecting cython (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f9/de/19fdd1c7a52e0534bf5f544e0346c15d71d20338dbd013117f763b94613f/Cython-3.0.11-cp310-cp310-win_amd64.whl (2.8 MB) Collecting lxml (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a1/35/183d32551447e280032b2331738cd850da435a42f850b71ebeaab42c1313/lxml-5.3.0-cp310-cp310-win_amd64.whl (3.8 MB) Collecting premailer (from paddleocr==2.7.3->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/b1/07/4e8d94f94c7d41ca5ddf8a9695ad87b888104e2fd41a35546c1dc9ca74ac/premailer-3.10.0-py2.py3-none-any.whl (19 kB) Collecting openpyxl (from paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/c0/da/977ded879c29cbd04de313843e76868e6e13408a94ed6b987245dc7c8506/openpyxl-3.1.5-py2.py3-none-any.whl (250 kB) Collecting attrdict (from paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/ef/97/28fe7e68bc7adfce67d4339756e85e9fcf3c6fd7f0c0781695352b70472c/attrdict-2.0.1-py2.py3-none-any.whl (9.9 kB) Collecting Pillow>=10.0.0 (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f4/72/0203e94a91ddb4a9d5238434ae6c1ca10e610e8487036132ea9bf806ca2a/pillow-10.4.0-cp310-cp310-win_amd64.whl (2.6 MB) Collecting pyyaml (from paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/b5/84/0fa4b06f6d6c958d207620fc60005e241ecedceee58931bb20138e1e5776/PyYAML-6.0.2-cp310-cp310-win_amd64.whl (161 kB) Collecting python-docx (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/3e/3d/330d9efbdb816d3f60bf2ad92f05e1708e4a1b9abe80461ac3444c83f749/python_docx-1.1.2-py3-none-any.whl (244 kB) Collecting beautifulsoup4 (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/b1/fe/e8c672695b37eecc5cbf43e1d0638d88d66ba3a44c4d321c796f4e59167f/beautifulsoup4-4.12.3-py3-none-any.whl (147 kB) Collecting fonttools>=4.24.0 (from paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/70/11/7b81b12a5614b5d237ab70c38bdc268de3eb3880ce7bb1269122e0a415ea/fonttools-4.53.1-cp310-cp310-win_amd64.whl (2.2 MB) Collecting fire>=0.3.0 (from paddleocr==2.7.3->magic-pdf[full]) Using cached fire-0.6.0-py2.py3-none-any.whl Collecting pdf2docx (from paddleocr==2.7.3->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/b5/f9/6d567df395c0409baf2b4dd9cd30d1e977c70672fe7ec2a684af1e6aa41c/pdf2docx-0.5.8-py3-none-any.whl (132 kB) Collecting httpx (from paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/56/95/9377bcb415797e44274b51d46e3249eba641711cf3348050f76ee7b15ffc/httpx-0.27.2-py3-none-any.whl (76 kB) Collecting decorator (from paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d5/50/83c593b07763e1161326b3b8c6686f0f4b0f24d5526546bee538c89837d6/decorator-5.1.1-py3-none-any.whl (9.1 kB) Collecting astor (from paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/c3/88/97eef84f48fa04fbd6750e62dcceafba6c63c81b7ac1420856c8dcc0a3f9/astor-0.8.1-py2.py3-none-any.whl (27 kB) Collecting opt-einsum==3.3.0 (from paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/bc/19/404708a7e54ad2798907210462fd950c3442ea51acc8790f3da48d2bee8b/opt_einsum-3.3.0-py3-none-any.whl (65 kB) Collecting protobuf<=3.20.2,>=3.1.0 (from paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/39/f3/393c00e45439a46f293077da5b0362a1a4d04b2c8242c35a763f03e8e742/protobuf-3.20.2-cp310-cp310-win_amd64.whl (904 kB) Collecting charset-normalizer>=2.0.0 (from pdfminer.six==20231228->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/a2/a0/4af29e22cb5942488cf45630cbdd7cefd908768e69bdd90280842e4e8529/charset_normalizer-3.3.2-cp310-cp310-win_amd64.whl (100 kB) Collecting cryptography>=36.0.0 (from pdfminer.six==20231228->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/b3/c6/c09cee6968add5ff868525c3815e5dccc0e3c6e89eec58dc9135d3c40e88/cryptography-43.0.1-cp39-abi3-win_amd64.whl (3.1 MB) Collecting torch (from struct-eqtable==0.1.0->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/5d/4c/b2a59ff0e265f5ee154f0d81e948b1518b94f545357731e1a3245ee5d45b/torch-2.4.1-cp310-cp310-win_amd64.whl (199.4 MB) Collecting transformers (from struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/75/35/07c9879163b603f0e464b0f6e6e628a2340cfc7cdc5ca8e7d52d776710d4/transformers-4.44.2-py3-none-any.whl (9.5 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.5/9.5 MB 21.0 MB/s eta 0:00:00 Collecting albumentations<2.0.0,>=1.4.4 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/2f/38/2011d955445737eb91d6e69c4309637b360f696040943b31b7a44ee6cc88/albumentations-1.4.15-py3-none-any.whl (200 kB) Collecting eva-decord<0.7.0,>=0.6.1 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a2/1c/ed339bb652fe2eb8ecd36e9b56d9b694c255d6f2d2a6f1d3bd9b195f2cdc/eva_decord-0.6.1-py3-none-win_amd64.whl (25.5 MB) Collecting evaluate<0.5.0,>=0.4.1 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a2/e7/cbca9e2d2590eb9b5aa8f7ebabe1beb1498f9462d2ecede5c9fd9735faaf/evaluate-0.4.3-py3-none-any.whl (84 kB) Collecting fairscale<0.5.0,>=0.4.13 (from unimernet==0.1.6->magic-pdf[full]) Using cached fairscale-0.4.13-py3-none-any.whl Collecting ftfy<7.0.0,>=6.2.0 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/ed/46/14d230ad057048aea7ccd2f96a80905830866d281ea90a6662a825490659/ftfy-6.2.3-py3-none-any.whl (43 kB) Collecting iopath<0.2.0,>=0.1.9 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/72/73/b3d451dfc523756cf177d3ebb0af76dc7751b341c60e2a21871be400ae29/iopath-0.1.10.tar.gz (42 kB) Preparing metadata (setup.py) ... 
done Collecting omegaconf<3.0.0,>=2.3.0 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/e3/94/1843518e420fa3ed6919835845df698c7e27e183cb997394e4a670973a65/omegaconf-2.3.0-py3-none-any.whl (79 kB) Collecting timm<0.10.0,>=0.9.16 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/68/99/2018622d268f6017ddfa5ee71f070bad5d07590374793166baa102849d17/timm-0.9.16-py3-none-any.whl (2.2 MB) Collecting torch (from struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/85/fc/ee5bb50eff313149657f173b003649677e27fa3aaae1ecc806add37f017c/torch-2.3.1-cp310-cp310-win_amd64.whl (159.8 MB) Collecting torchtext<=0.18.0,>=0.17.2 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/b2/3d/6f18d551b00bf8babaa3a569d5fd62cba2bd7bbdeaf82167a959352ba56b/torchtext-0.18.0-cp310-cp310-win_amd64.whl (1.9 MB) Collecting torchvision<=0.18.1,>=0.17.2 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/4e/62/3816637079b77875077678bd7087285a5b5589664f94f5ceb2d080cc024c/torchvision-0.18.1-cp310-cp310-win_amd64.whl (1.2 MB) Collecting transformers (from struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/09/c8/844d5518a6aeb4ffdc0cf0cae65ae13dbe5838306728c5c640b5a6e2a0c9/transformers-4.40.0-py3-none-any.whl (9.0 MB) Collecting wand<0.7.0,>=0.6.13 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/59/d5/1bdd7c9662d5e9078e25ba0eb69bdb122859295746d40ab8dfef3a7b4d42/Wand-0.6.13-py2.py3-none-any.whl (143 kB) Collecting webdataset<0.3.0,>=0.2.86 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/8e/84/cf2319c375f4e061f27354685295905dc81105d2a2d2239baaf6f6e73c87/webdataset-0.2.100-py3-none-any.whl (74 kB) Collecting filelock (from 
transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/b9/f8/feced7779d755758a52d1f6635d990b8d98dc0a29fa568bbe0625f18fdf3/filelock-3.16.1-py3-none-any.whl (16 kB) Collecting huggingface-hub<1.0,>=0.19.3 (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d5/ce/1f8e61cd63175cc2e79233b954b1c4e85363c788fb3a1fa23c87a25c9b81/huggingface_hub-0.25.0-py3-none-any.whl (436 kB) Collecting packaging>=20.0 (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/08/aa/cc0199a5f0ad350994d660967a8efb233fe0416e4639146c089643407ce6/packaging-24.1-py3-none-any.whl (53 kB) Collecting regex!=2019.12.17 (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/c4/2a/4f9c47d9395b6aff24874c761d8d620c0232f97c43ef3cf668c8b355e7a7/regex-2024.9.11-cp310-cp310-win_amd64.whl (274 kB) Collecting requests (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/f9/9b/335f9764261e915ed497fcdeb11df5dfd6f7bf257d4a6a2a686d80da4d54/requests-2.32.3-py3-none-any.whl (64 kB) Collecting tokenizers<0.20,>=0.19 (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f4/85/d999b9a05fd101d48f1a365d68be0b109277bb25c89fb37a389d669f9185/tokenizers-0.19.1-cp310-none-win_amd64.whl (2.2 MB) Collecting safetensors>=0.4.1 (from transformers->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/ba/f0/919c72a9eef843781e652d0650f2819039943e69b69d5af2d0451a23edc3/safetensors-0.4.5-cp310-none-win_amd64.whl (285 kB) Collecting botocore<1.36.0,>=1.35.24 (from boto3>=1.28.43->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f5/84/e8a1220f2fcf06c68970c8ddfe0687cc4eb967c0ad219de5dfed65dd3958/botocore-1.35.24-py3-none-any.whl (12.6 
MB) Collecting jmespath<2.0.0,>=0.7.1 (from boto3>=1.28.43->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/31/b4/b9b800c45527aadd64d5b442f9b932b00648617eb5d63d2c7a6587b7cafc/jmespath-1.0.1-py3-none-any.whl (20 kB) Collecting s3transfer<0.11.0,>=0.10.0 (from boto3>=1.28.43->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/3c/4a/b221409913760d26cf4498b7b1741d510c82d3ad38381984a3ddc135ec66/s3transfer-0.10.2-py3-none-any.whl (82 kB) Collecting colorama (from click>=8.1.7->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d1/d6/3965ed04c63042e047cb6a3e6ed1a63a35087b6a609aa3a15ed8ac56c221/colorama-0.4.6-py2.py3-none-any.whl (25 kB) Collecting win32-setctime>=1.0.0 (from loguru>=0.6.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/0a/e6/a7d828fef907843b2a5773ebff47fb79ac0c1c88d60c0ca9530ee941e248/win32_setctime-1.1.0-py3-none-any.whl (3.6 kB) Collecting contourpy>=1.0.1 (from matplotlib<=3.9.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a9/97/3f89bba79ff6ff2b07a3cbc40aa693c360d5efa90d66e914f0ff03b95ec7/contourpy-1.3.0-cp310-cp310-win_amd64.whl (216 kB) Collecting cycler>=0.10 (from matplotlib<=3.9.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/e7/05/c19819d5e3d95294a6f5947fb9b9629efb316b96de511b418c53d245aae6/cycler-0.12.1-py3-none-any.whl (8.3 kB) Collecting kiwisolver>=1.3.1 (from matplotlib<=3.9.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/12/ca/d0f7b7ffbb0be1e7c2258b53554efec1fd652921f10d7d85045aff93ab61/kiwisolver-1.4.7-cp310-cp310-win_amd64.whl (55 kB) Collecting pyparsing>=2.3.1 (from matplotlib<=3.9.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/e5/0c/0e3c05b1c87bb6a1c76d281b0f35e78d2d80ac91b5f8f524cebf77f51049/pyparsing-3.1.4-py3-none-any.whl (104 kB) Collecting python-dateutil>=2.7 (from matplotlib<=3.9.0->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/ec/57/56b9bcc3c9c6a792fcbaf139543cee77261f3651ca9da0c93f5c1221264b/python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB) Collecting annotated-types>=0.4.0 (from pydantic<2.8.0,>=2.7.2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/78/b6/6307fbef88d9b5ee7421e68d78a9f162e0da4900bc5f5793f6d3d0e34fb8/annotated_types-0.7.0-py3-none-any.whl (13 kB) Collecting pydantic-core==2.18.4 (from pydantic<2.8.0,>=2.7.2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/5c/d8/13ac833cb5ec401fb69c5c21acc291dc54bf05749f3501bf17ffdcd79542/pydantic_core-2.18.4-cp310-none-win_amd64.whl (1.9 MB) Collecting typing-extensions>=4.6.1 (from pydantic<2.8.0,>=2.7.2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/26/9f/ad63fc0248c5379346306f8668cda6e2e2e9c95e01216d2b8ffd9ff037d0/typing_extensions-4.12.2-py3-none-any.whl (37 kB) Collecting PyMuPDFb==1.24.10 (from PyMuPDF>=1.24.9->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/70/cb/8459d6c179befd7c6eee555334f054e9a6dcdd9f8671891e1da19e0ce526/PyMuPDFb-1.24.10-py3-none-win_amd64.whl (13.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.2/13.2 MB 20.2 MB/s eta 0:00:00 Collecting scipy>=1.6.0 (from scikit-learn>=1.0.2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/e7/1c/8daa6df17a945cb1a2a1e3bae3c49643f7b3b94017ff01a4787064f03f84/scipy-1.14.1-cp310-cp310-win_amd64.whl (44.8 MB) Collecting joblib>=1.2.0 (from scikit-learn>=1.0.2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/91/29/df4b9b42f2be0b623cbd5e2140cafcaa2bef0759a00b7b70104dcfe2fb51/joblib-1.4.2-py3-none-any.whl (301 kB) Collecting threadpoolctl>=3.1.0 (from scikit-learn>=1.0.2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/4b/2c/ffbf7a134b9ab11a67b0cf0726453cedd9c5043a4fe7a35d1cefa9a1bcfb/threadpoolctl-3.5.0-py3-none-any.whl (18 kB) Collecting pycocotools>=2.0.2 (from 
detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/8d/06/b9bdedfdcbf2fb5ba55252f1a5ff5e8e02ae204fe392f7b4f5babbc14a2a/pycocotools-2.0.8-cp310-cp310-win_amd64.whl (84 kB) Collecting termcolor>=1.1 (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d9/5f/8c716e47b3a50cbd7c146f45881e11d9414def768b7cd9c5e6650ec2a80a/termcolor-2.4.0-py3-none-any.whl (7.7 kB) Collecting yacs>=0.1.8 (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/38/4f/fe9a4d472aa867878ce3bb7efb16654c5d63672b86dc0e6e953a67018433/yacs-0.1.8-py3-none-any.whl (14 kB) Collecting tabulate (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/40/44/4a5f08c96eb108af5cb50b41f76142f0afa346dfa99d5296fe7202a11854/tabulate-0.9.0-py3-none-any.whl (35 kB) Collecting cloudpickle (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/96/43/dae06432d0c4b1dc9e9149ad37b4ca8384cf6eb7700cd9215b177b914f0a/cloudpickle-3.0.0-py3-none-any.whl (20 kB) Collecting tensorboard (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d4/41/dccba8c5f955bc35b6110ff78574e4e5c8226ad62f08e732096c3861309b/tensorboard-2.17.1-py3-none-any.whl (5.5 MB) Collecting fvcore<0.1.6,>=0.1.5 (from detectron2->magic-pdf[full]) Using cached fvcore-0.1.5.post20221221-py3-none-any.whl Collecting iopath<0.2.0,>=0.1.9 (from unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/af/20/65dd9bd25a1eb7fa35b5ae38d289126af065f8a0c1f6a90564f4bff0f89d/iopath-0.1.9-py3-none-any.whl (27 kB) Collecting hydra-core>=1.1 (from detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/c6/50/e0edd38dcd63fb26a8547f13d28f7a008bc4a3fd4eb4ff030673f22ad41a/hydra_core-1.3.2-py3-none-any.whl (154 kB) Collecting black (from detectron2->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/87/a0/6d2e4175ef364b8c4b64f8441ba041ed65c63ea1db2720d61494ac711c15/black-24.8.0-cp310-cp310-win_amd64.whl (1.4 MB) Collecting psutil (from ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/73/44/561092313ae925f3acfaace6f9ddc4f6a9c748704317bad9c8c8f8a36a79/psutil-6.0.0-cp37-abi3-win_amd64.whl (257 kB) Collecting py-cpuinfo (from ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/e0/a9/023730ba63db1e494a271cb018dcd361bd2c917ba7004c3e49d5daf795a2/py_cpuinfo-9.0.0-py3-none-any.whl (22 kB) Collecting pandas>=1.1.4 (from ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/31/9e/6ebb433de864a6cd45716af52a4d7a8c3c9aaf3a98368e61db9e69e69a9c/pandas-2.2.3-cp310-cp310-win_amd64.whl (11.6 MB) Collecting seaborn>=0.11.0 (from ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/83/11/00d3c3dfc25ad54e731d91449895a79e4bf2384dc3ac01809010ba88f6d5/seaborn-0.13.2-py3-none-any.whl (294 kB) Collecting ultralytics-thop>=2.0.0 (from ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/12/3d/36ab0be2d46443a591979e4e1a025f18af43ffa07fb244fb5c7a07e82567/ultralytics_thop-2.0.6-py3-none-any.whl (26 kB) Collecting albucore>=0.0.15 (from albumentations<2.0.0,>=1.4.4->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f7/20/9f56b72131ea71c9566f0cc303a9e92156767845164c0f4ec10534630991/albucore-0.0.17-py3-none-any.whl (10 kB) Collecting eval-type-backport (from albumentations<2.0.0,>=1.4.4->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/ac/ac/aa3d8e0acbcd71140420bc752d7c9779cf3a2a3bb1d7ef30944e38b2cd39/eval_type_backport-0.2.0-py3-none-any.whl (5.9 kB) Collecting opencv-python-headless>=4.9.0.80 (from albumentations<2.0.0,>=1.4.4->unimernet==0.1.6->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/26/d0/22f68eb23eea053a31655960f133c0be9726c6a881547e6e9e7e2a946c4f/opencv_python_headless-4.10.0.84-cp37-abi3-win_amd64.whl (38.8 MB) Collecting urllib3!=2.2.0,<3,>=1.25.4 (from botocore<1.36.0,>=1.35.24->boto3>=1.28.43->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/ce/d9/5f4c13cecde62396b0d3fe530a50ccea91e7dfc1ccf0e09c228841bb5ba8/urllib3-2.2.3-py3-none-any.whl (126 kB) Collecting cffi>=1.12 (from cryptography>=36.0.0->pdfminer.six==20231228->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/d1/b6/0b0f5ab93b0df4acc49cae758c81fe4e5ef26c3ae2e10cc69249dfd8b3ab/cffi-1.17.1-cp310-cp310-win_amd64.whl (181 kB) Collecting datasets>=2.0.0 (from evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a5/52/45dab187f03d48c765b94db0464f5c10431756e47ae4cc6a8029a7d57a36/datasets-3.0.0-py3-none-any.whl (474 kB) Collecting dill (from evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/c9/7a/cef76fd8438a42f96db64ddaa85280485a9c395e7df3db8158cfec1eee34/dill-0.3.8-py3-none-any.whl (116 kB) Collecting xxhash (from evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/7b/d7/aa0b22c4ebb7c3ccb993d4c565132abc641cd11164f8952d89eb6a501909/xxhash-3.5.0-cp310-cp310-win_amd64.whl (30 kB) Collecting multiprocess (from evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/bc/f7/7ec7fddc92e50714ea3745631f79bd9c96424cb2702632521028e57d3a36/multiprocess-0.70.16-py310-none-any.whl (134 kB) Collecting fsspec>=2021.05.0 (from fsspec[http]>=2021.05.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/1d/a0/6aaea0c2fbea2f89bfd5db25fb1e3481896a423002ebe4e55288907a97a3/fsspec-2024.9.0-py3-none-any.whl (179 kB) Collecting 
pybind11>=2.2 (from fasttext-wheel>=0.9.2->fast-langdetect==0.2.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/13/2f/0f24b288e2ce56f51c920137620b4434a38fd80583dbbe24fc2a1656c388/pybind11-2.13.6-py3-none-any.whl (243 kB) Requirement already satisfied: setuptools>=0.7.0 in d:\lib\dev\miniconda3\bin\envs\mineru\lib\site-packages (from fasttext-wheel>=0.9.2->fast-langdetect==0.2.0->magic-pdf[full]) (75.1.0) Collecting six (from fire>=0.3.0->paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/d9/5a/e7c31adbe875f2abbb91bd84cf2dc52d792b5a01506781dbcf25c91daf11/six-1.16.0-py2.py3-none-any.whl (11 kB) Collecting wcwidth<0.3.0,>=0.2.12 (from ftfy<7.0.0,>=6.2.0->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/fd/84/fd2ba7aafacbad3c4201d395674fc6348826569da3c0937e75505ead3528/wcwidth-0.2.13-py2.py3-none-any.whl (34 kB) Collecting antlr4-python3-runtime==4.9.* (from hydra-core>=1.1->detectron2->magic-pdf[full]) Using cached antlr4_python3_runtime-4.9.3-py3-none-any.whl Collecting portalocker (from iopath<0.2.0,>=0.1.9->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/9b/fb/a70a4214956182e0d7a9099ab17d50bfcba1056188e9b14f35b9e2b62a0d/portalocker-2.10.1-py3-none-any.whl (18 kB) Collecting pytz>=2020.1 (from pandas>=1.1.4->ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/11/c3/005fcca25ce078d2cc29fd559379817424e94885510568bc1bc53d7d5846/pytz-2024.2-py2.py3-none-any.whl (508 kB) Collecting tzdata>=2022.7 (from pandas>=1.1.4->ultralytics->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/65/58/f9c9e6be752e9fcb8b6a0ee9fb87e6e7a1f6bcab2cdc73f02bb7ba91ada0/tzdata-2024.1-py2.py3-none-any.whl (345 kB) Collecting idna<4,>=2.5 (from requests->transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading 
https://mirrors.aliyun.com/pypi/packages/76/c6/c88e154df9c4e1a2a66ccf0005a88dfb2650c1dffb6f5ce603dfbd452ce3/idna-3.10-py3-none-any.whl (70 kB) Collecting certifi>=2017.4.17 (from requests->transformers->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/12/90/3c9ff0512038035f59d279fddeb79f5f1eccd8859f06d6163c58798b9487/certifi-2024.8.30-py3-none-any.whl (167 kB) Collecting colorlog (from robust-downloader>=0.0.2->fast-langdetect==0.2.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f3/18/3e867ab37a24fdf073c1617b9c7830e06ec270b1ea4694a624038fc40a03/colorlog-6.8.2-py3-none-any.whl (11 kB) Collecting networkx>=2.8 (from scikit-image->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/38/e9/5f72929373e1a0e8d142a130f3f97e6ff920070f87f91c4e13e40e0fba5a/networkx-3.3-py3-none-any.whl (1.7 MB) Collecting imageio>=2.33 (from scikit-image->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/1e/b7/02adac4e42a691008b5cfb31db98c190e1fc348d1521b9be4429f9454ed1/imageio-2.35.1-py3-none-any.whl (315 kB) Collecting tifffile>=2022.8.12 (from scikit-image->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/50/0a/435d5d7ec64d1c8b422ac9ebe42d2f3b2ac0b3f8a56f5c04dd0f3b7ba83c/tifffile-2024.9.20-py3-none-any.whl (228 kB) Collecting lazy-loader>=0.4 (from scikit-image->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/83/60/d497a310bde3f01cb805196ac61b7ad6dc5dcf8dce66634dc34364b20b4f/lazy_loader-0.4-py3-none-any.whl (12 kB) Collecting sympy (from torch->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/99/ff/c87e0622b1dadea79d2fb0b25ade9ed98954c9033722eb707053d310d4f3/sympy-1.13.3-py3-none-any.whl (6.2 MB) Collecting jinja2 (from torch->struct-eqtable==0.1.0->magic-pdf[full]) Downloading 
https://mirrors.aliyun.com/pypi/packages/31/80/3a54838c3fb461f6fec263ebf3a3a41771bd05190238de3486aae8540c36/jinja2-3.1.4-py3-none-any.whl (133 kB) Collecting mkl<=2021.4.0,>=2021.1.1 (from torch->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/fe/1c/5f6dbf18e8b73e0a5472466f0ea8d48ce9efae39bd2ff38cebf8dce61259/mkl-2021.4.0-py2.py3-none-win_amd64.whl (228.5 MB) Collecting braceexpand (from webdataset<0.3.0,>=0.2.86->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/fa/93/e8c04e80e82391a6e51f218ca49720f64236bc824e92152a2633b74cf7ab/braceexpand-0.1.7-py2.py3-none-any.whl (5.9 kB) Collecting soupsieve>1.2 (from beautifulsoup4->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/d1/c2/fe97d779f3ef3b15f05c94a2f1e3d21732574ed441687474db9d342a7315/soupsieve-2.6-py3-none-any.whl (36 kB) Collecting mypy-extensions>=0.4.3 (from black->detectron2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/2a/e2/5d3f6ada4297caebe1a2add3b126fe800c96f56dbe5d1988a2cbe0b267aa/mypy_extensions-1.0.0-py3-none-any.whl (4.7 kB) Collecting pathspec>=0.9.0 (from black->detectron2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/cc/20/ff623b09d963f88bfde16306a54e12ee5ea43e9b597108672ff3a408aad6/pathspec-0.12.1-py3-none-any.whl (31 kB) Collecting platformdirs>=2 (from black->detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/3c/a6/bc1012356d8ece4d66dd75c4b9fc6c1f6650ddd5991e421177d9f8f671be/platformdirs-4.3.6-py3-none-any.whl (18 kB) Collecting tomli>=1.1.0 (from black->detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/97/75/10a9ebee3fd790d20926a90a2547f0bf78f371b2f13aa822c759680ca7b9/tomli-2.0.1-py3-none-any.whl (12 kB) Collecting anyio (from httpx->paddlepaddle==2.6.1->magic-pdf[full]) Downloading 
https://mirrors.aliyun.com/pypi/packages/9e/ef/7a4f225581a0d7886ea28359179cb861d7fbcdefad29663fc1167b86f69f/anyio-4.6.0-py3-none-any.whl (89 kB) Collecting httpcore==1. (from httpx->paddlepaddle==2.6.1->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/78/d4/e5d7e4f2174f8a4d63c8897d79eb8fe2503f7ecc03282fee1fa2719c2704/httpcore-1.0.5-py3-none-any.whl (77 kB) Collecting sniffio (from httpx->paddlepaddle==2.6.1->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl (10 kB) Collecting h11<0.15,>=0.13 (from httpcore==1.->httpx->paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/95/04/ff642e65ad6b90db43e668d70ffb6736436c7ce41fcc549f4e9472234127/h11-0.14.0-py3-none-any.whl (58 kB) Collecting et-xmlfile (from openpyxl->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/96/c2/3dd434b0108730014f1b96fd286040dc3bcb70066346f7e01ec2ac95865f/et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB) Collecting cssselect (from premailer->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/06/a9/2da08717a6862c48f1d61ef957a7bba171e7eefa6c0aa0ceb96a140c2a6b/cssselect-1.2.0-py2.py3-none-any.whl (18 kB) Collecting cssutils (from premailer->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a7/ec/bb273b7208c606890dc36540fe667d06ce840a6f62f9fae7e658fcdc90fb/cssutils-2.11.1-py3-none-any.whl (385 kB) Collecting cachetools (from premailer->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a4/07/14f8ad37f2d12a5ce41206c21820d8cb6561b728e51fad4530dff0552a67/cachetools-5.5.0-py3-none-any.whl (9.5 kB) Collecting absl-py>=0.4 (from tensorboard->detectron2->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/a2/ad/e0d3c824784ff121c03cc031f944bc7e139a8f1870ffd2845cc2dd76f6c4/absl_py-2.1.0-py3-none-any.whl (133 kB) Collecting grpcio>=1.48.2 (from tensorboard->detectron2->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/f3/72/6046088fa273d2c4fe72009d2411d5ccd053017014b1197c4881ead3ee70/grpcio-1.66.1-cp310-cp310-win_amd64.whl (4.3 MB) Collecting markdown>=2.6.8 (from tensorboard->detectron2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/3f/08/83871f3c50fc983b88547c196d11cf8c3340e37c32d2e9d6152abe2c61f7/Markdown-3.7-py3-none-any.whl (106 kB) Collecting tensorboard-data-server<0.8.0,>=0.7.0 (from tensorboard->detectron2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/7a/13/e503968fefabd4c6b2650af21e110aa8466fe21432cd7c43a84577a89438/tensorboard_data_server-0.7.2-py3-none-any.whl (2.4 kB) Collecting werkzeug>=1.0.1 (from tensorboard->detectron2->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/4b/84/997bbf7c2bf2dc3f09565c6d0b4959fefe5355c18c4096cfd26d83e0785b/werkzeug-3.0.4-py3-none-any.whl (227 kB) Collecting bce-python-sdk (from visualdl->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/c2/c3/b2696793e89b035ea5f01a0616fb824c8e3a6d69f76f264835a994994897/bce_python_sdk-0.9.22-py3-none-any.whl (336 kB) Collecting flask>=1.1.1 (from visualdl->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/61/80/ffe1da13ad9300f87c93af113edd0638c75138c42a0994becfacac078c06/flask-3.0.3-py3-none-any.whl (101 kB) Collecting Flask-Babel>=3.0.0 (from visualdl->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/14/c2/e0ab5abe37882e118482884f2ec660cd06da644ddfbceccf5f88f546b574/flask_babel-4.0.0-py3-none-any.whl (9.6 kB) Collecting rarfile (from visualdl->paddleocr==2.7.3->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/62/fc/ab37559419ca36dd8dd317c3a98395ed4dcee2beeb28bf6059b972906727/rarfile-4.2-py3-none-any.whl (29 kB) Collecting pycparser (from cffi>=1.12->cryptography>=36.0.0->pdfminer.six==20231228->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/13/a3/a812df4e2dd5696d1f351d58b8fe16a405b234ad2886a0dab9183fb78109/pycparser-2.22-py3-none-any.whl (117 kB) Collecting pyarrow>=15.0.0 (from datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/19/09/b0a02908180a25d57312ab5919069c39fddf30602568980419f4b02393f6/pyarrow-17.0.0-cp310-cp310-win_amd64.whl (25.1 MB) Collecting fsspec>=2021.05.0 (from fsspec[http]>=2021.05.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/5e/44/73bea497ac69bafde2ee4269292fa3b41f1198f4bb7bbaaabde30ad29d4a/fsspec-2024.6.1-py3-none-any.whl (177 kB) Collecting aiohttp (from datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/f3/45/145d8b4853fc92c0c8509277642767e7726a085e390ce04353dc68b0f5b5/aiohttp-3.10.5-cp310-cp310-win_amd64.whl (379 kB) Collecting itsdangerous>=2.1.2 (from flask>=1.1.1->visualdl->paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/04/96/92447566d16df59b2a776c0fb82dbc4d9e07cd95062562af01e408583fc4/itsdangerous-2.2.0-py3-none-any.whl (16 kB) Collecting blinker>=1.6.2 (from flask>=1.1.1->visualdl->paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/bb/2a/10164ed1f31196a2f7f3799368a821765c62851ead0e630ab52b8e14b4d0/blinker-1.8.2-py3-none-any.whl (9.5 kB) Collecting Babel>=2.12 (from Flask-Babel>=3.0.0->visualdl->paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/ed/20/bc79bc575ba2e2a7f70e8a1155618bb1301eaa5132a8271373a6903f73f8/babel-2.16.0-py3-none-any.whl 
(9.6 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.6/9.6 MB 20.5 MB/s eta 0:00:00 Collecting MarkupSafe>=2.0 (from jinja2->torch->struct-eqtable==0.1.0->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/69/48/acbf292615c65f0604a0c6fc402ce6d8c991276e16c80c46a8f758fbd30c/MarkupSafe-2.1.5-cp310-cp310-win_amd64.whl (17 kB) Collecting intel-openmp==2021. (from mkl<=2021.4.0,>=2021.1.1->torch->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/6f/21/b590c0cc3888b24f2ac9898c41d852d7454a1695fbad34bee85dba6dc408/intel_openmp-2021.4.0-py2.py3-none-win_amd64.whl (3.5 MB) Collecting tbb==2021.* (from mkl<=2021.4.0,>=2021.1.1->torch->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/9b/24/84ce997e8ae6296168a74d0d9c4dde572d90fb23fd7c0b219c30ff71e00e/tbb-2021.13.1-py3-none-win_amd64.whl (286 kB) Collecting exceptiongroup>=1.0.2 (from anyio->httpx->paddlepaddle==2.6.1->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/02/cc/b7e31358aac6ed1ef2bb790a9746ac2c69bcb3c8588b41616914eb106eaf/exceptiongroup-1.2.2-py3-none-any.whl (16 kB) Collecting pycryptodome>=3.8.0 (from bce-python-sdk->visualdl->paddleocr==2.7.3->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/1f/90/d131c0eb643290230dfa4108b7c2d135122d88b714ad241d77beb4782a76/pycryptodome-3.20.0-cp35-abi3-win_amd64.whl (1.8 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 24.2 MB/s eta 0:00:00 Collecting future>=0.6.0 (from bce-python-sdk->visualdl->paddleocr==2.7.3->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/da/71/ae30dadffc90b9006d77af76b393cb9dfbfc9629f339fc1574a1c52e6806/future-1.0.0-py3-none-any.whl (491 kB) Collecting more-itertools (from cssutils->premailer->paddleocr==2.7.3->magic-pdf[full]) Using cached 
https://mirrors.aliyun.com/pypi/packages/48/7e/3a64597054a70f7c86eb0a7d4fc315b8c1ab932f64883a297bdffeb5f967/more_itertools-10.5.0-py3-none-any.whl (60 kB) Collecting pywin32>=226 (from portalocker->iopath<0.2.0,>=0.1.9->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/d3/d6/891894edec688e72c2e308b3243fad98b4066e1839fd2fe78f04129a9d31/pywin32-306-cp310-cp310-win_amd64.whl (9.2 MB) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.2/9.2 MB 20.5 MB/s eta 0:00:00 Collecting mpmath<1.4,>=1.1.0 (from sympy->torch->struct-eqtable==0.1.0->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl (536 kB) Collecting aiohappyeyeballs>=2.3.0 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/18/b6/58ea188899950d759a837f9a58b2aee1d1a380ea4d6211ce9b1823748851/aiohappyeyeballs-2.4.0-py3-none-any.whl (12 kB) Collecting aiosignal>=1.1.2 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/76/ac/a7305707cb852b7e16ff80eaf5692309bde30e2b1100a1fcacdc8f731d97/aiosignal-1.3.1-py3-none-any.whl (7.6 kB) Collecting attrs>=17.3.0 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/6a/21/5b6702a7f963e95456c0de2d495f67bf5fd62840ac655dc451586d23d39a/attrs-24.2.0-py3-none-any.whl (63 kB) Collecting frozenlist>=1.1.1 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/61/15/2b5d644d81282f00b61e54f7b00a96f9c40224107282efe4cd9d2bf1433a/frozenlist-1.4.1-cp310-cp310-win_amd64.whl (50 kB) Collecting multidict<7.0,>=4.5 (from 
aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/04/5a/d88cd5d00a184e1ddffc82aa2e6e915164a6d2641ed3606e766b5d2f275a/multidict-6.1.0-cp310-cp310-win_amd64.whl (28 kB) Collecting yarl<2.0,>=1.0 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Downloading https://mirrors.aliyun.com/pypi/packages/b0/29/2a08a45b9f2eddd1b840813698ee655256f43b507c12f7f86df947cf5f8f/yarl-1.11.1-cp310-cp310-win_amd64.whl (110 kB) Collecting async-timeout<5.0,>=4.0 (from aiohttp->datasets>=2.0.0->evaluate<0.5.0,>=0.4.1->unimernet==0.1.6->magic-pdf[full]) Using cached https://mirrors.aliyun.com/pypi/packages/a7/fa/e01228c2938de91d47b307831c62ab9e4001e747789d0b05baf779a6488c/async_timeout-4.0.3-py3-none-any.whl (5.7 kB) Installing collected packages: wordninja, wcwidth, wand, tbb, pywin32, pytz, pyclipper, py-cpuinfo, mpmath, lmdb, intel-openmp, Brotli, braceexpand, antlr4-python3-runtime, xxhash, win32-setctime, urllib3, tzdata, typing-extensions, tomli, threadpoolctl, termcolor, tensorboard-data-server, tabulate, sympy, soupsieve, sniffio, six, safetensors, regex, rarfile, rapidfuzz, pyyaml, pyparsing, pypandoc, PyMuPDFb, pycryptodome, pycparser, pybind11, psutil, protobuf, portalocker, platformdirs, Pillow, pathspec, packaging, numpy, networkx, mypy-extensions, more-itertools, mkl, MarkupSafe, markdown, lxml, kiwisolver, joblib, jmespath, itsdangerous, idna, h11, grpcio, future, ftfy, fsspec, frozenlist, fonttools, filelock, exceptiongroup, eval-type-backport, et-xmlfile, dill, decorator, cython, cycler, cssselect, colorama, cloudpickle, charset-normalizer, certifi, cachetools, blinker, Babel, attrs, async-timeout, astor, annotated-types, aiohappyeyeballs, absl-py, yacs, werkzeug, webdataset, tqdm, tifffile, shapely, scipy, requests, python-docx, python-dateutil, PyMuPDF, pydantic-core, pyarrow, opt-einsum, openpyxl, opencv-python-headless, opencv-python, 
opencv-contrib-python, omegaconf, multiprocess, multidict, loguru, lazy-loader, langdetect, jinja2, imageio, httpcore, fire, fasttext-wheel, eva-decord, cssutils, contourpy, colorlog, click, cffi, beautifulsoup4, bce-python-sdk, attrdict, anyio, aiosignal, yarl, torch, tensorboard, scikit-learn, scikit-image, robust-downloader, pydantic, premailer, pdf2docx, pandas, matplotlib, iopath, hydra-core, huggingface-hub, httpx, flask, cryptography, botocore, black, albucore, ultralytics-thop, torchvision, torchtext, tokenizers, seaborn, s3transfer, pycocotools, pdfminer.six, paddlepaddle, imgaug, fvcore, Flask-Babel, fast-langdetect, fairscale, albumentations, aiohttp, visualdl, ultralytics, transformers, timm, detectron2, boto3, struct-eqtable, paddleocr, magic-pdf, datasets, evaluate, unimernet Successfully installed Babel-2.16.0 Brotli-1.1.0 Flask-Babel-4.0.0 MarkupSafe-2.1.5 Pillow-10.4.0 PyMuPDF-1.24.10 PyMuPDFb-1.24.10 absl-py-2.1.0 aiohappyeyeballs-2.4.0 aiohttp-3.10.5 aiosignal-1.3.1 albucore-0.0.17 albumentations-1.4.15 annotated-types-0.7.0 antlr4-python3-runtime-4.9.3 anyio-4.6.0 astor-0.8.1 async-timeout-4.0.3 attrdict-2.0.1 attrs-24.2.0 bce-python-sdk-0.9.22 beautifulsoup4-4.12.3 black-24.8.0 blinker-1.8.2 boto3-1.35.24 botocore-1.35.24 braceexpand-0.1.7 cachetools-5.5.0 certifi-2024.8.30 cffi-1.17.1 charset-normalizer-3.3.2 click-8.1.7 cloudpickle-3.0.0 colorama-0.4.6 colorlog-6.8.2 contourpy-1.3.0 cryptography-43.0.1 cssselect-1.2.0 cssutils-2.11.1 cycler-0.12.1 cython-3.0.11 datasets-3.0.0 decorator-5.1.1 detectron2-0.6 dill-0.3.8 et-xmlfile-1.1.0 eva-decord-0.6.1 eval-type-backport-0.2.0 evaluate-0.4.3 exceptiongroup-1.2.2 fairscale-0.4.13 fast-langdetect-0.2.0 fasttext-wheel-0.9.2 filelock-3.16.1 fire-0.6.0 flask-3.0.3 fonttools-4.53.1 frozenlist-1.4.1 fsspec-2024.6.1 ftfy-6.2.3 future-1.0.0 fvcore-0.1.5.post20221221 grpcio-1.66.1 h11-0.14.0 httpcore-1.0.5 httpx-0.27.2 huggingface-hub-0.25.0 hydra-core-1.3.2 idna-3.10 imageio-2.35.1 imgaug-0.4.0 
intel-openmp-2021.4.0 iopath-0.1.9 itsdangerous-2.2.0 jinja2-3.1.4 jmespath-1.0.1 joblib-1.4.2 kiwisolver-1.4.7 langdetect-1.0.9 lazy-loader-0.4 lmdb-1.5.1 loguru-0.7.2 lxml-5.3.0 magic-pdf-0.8.1 markdown-3.7 matplotlib-3.9.0 mkl-2021.4.0 more-itertools-10.5.0 mpmath-1.3.0 multidict-6.1.0 multiprocess-0.70.16 mypy-extensions-1.0.0 networkx-3.3 numpy-1.26.4 omegaconf-2.3.0 opencv-contrib-python-4.6.0.66 opencv-python-4.6.0.66 opencv-python-headless-4.10.0.84 openpyxl-3.1.5 opt-einsum-3.3.0 packaging-24.1 paddleocr-2.7.3 paddlepaddle-2.6.1 pandas-2.2.3 pathspec-0.12.1 pdf2docx-0.5.8 pdfminer.six-20231228 platformdirs-4.3.6 portalocker-2.10.1 premailer-3.10.0 protobuf-3.20.2 psutil-6.0.0 py-cpuinfo-9.0.0 pyarrow-17.0.0 pybind11-2.13.6 pyclipper-1.3.0.post5 pycocotools-2.0.8 pycparser-2.22 pycryptodome-3.20.0 pydantic-2.7.4 pydantic-core-2.18.4 pypandoc-1.13 pyparsing-3.1.4 python-dateutil-2.9.0.post0 python-docx-1.1.2 pytz-2024.2 pywin32-306 pyyaml-6.0.2 rapidfuzz-3.9.7 rarfile-4.2 regex-2024.9.11 requests-2.32.3 robust-downloader-0.0.2 s3transfer-0.10.2 safetensors-0.4.5 scikit-image-0.24.0 scikit-learn-1.5.2 scipy-1.14.1 seaborn-0.13.2 shapely-2.0.6 six-1.16.0 sniffio-1.3.1 soupsieve-2.6 struct-eqtable-0.1.0 sympy-1.13.3 tabulate-0.9.0 tbb-2021.13.1 tensorboard-2.17.1 tensorboard-data-server-0.7.2 termcolor-2.4.0 threadpoolctl-3.5.0 tifffile-2024.9.20 timm-0.9.16 tokenizers-0.19.1 tomli-2.0.1 torch-2.3.1 torchtext-0.18.0 torchvision-0.18.1 tqdm-4.66.5 transformers-4.40.0 typing-extensions-4.12.2 tzdata-2024.1 ultralytics-8.2.98 ultralytics-thop-2.0.6 unimernet-0.1.6 urllib3-2.2.3 visualdl-2.5.3 wand-0.6.13 wcwidth-0.2.13 webdataset-0.2.100 werkzeug-3.0.4 win32-setctime-1.1.0 wordninja-2.0.0 xxhash-3.5.0 yacs-0.1.8 yarl-1.11.1 (MinerU) PS C:\Users\James> cd D:\Lib\Dev\AI\MinerU\MinerU (MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU> cd .\demo\ (MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU\demo> python.exe .\magic_pdf_parse_main.py 2024-09-22 09:30:25.739 | INFO | 
magic_pdf.libs.pdf_check:detect_invalid_chars:57 - cid_count: 0, text_len: 10, cid_chars_radio: 0.0
2024-09-22 09:30:25.754 | WARNING | magic_pdf.filter.pdf_classify_by_type:classify:334 - pdf is not classified by area and text_len, by_image_area: False, by_text: False, by_avg_words: False, by_img_num: True, by_text_layout: False, by_img_narrow_strips: True, by_invalid_chars: True
2024-09-22 09:30:31.526 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:28 - cannot import name 'preserve_channel_dim' from 'albucore.utils' (D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albucore\utils.py)
Traceback (most recent call last):

  File "D:\Lib\Dev\AI\MinerU\MinerU\demo\magic_pdf_parse_main.py", line 136, in <module>
    pdf_parse_main(pdf_path)
    │              └ 'D:\Lib\Dev\AI\MinerU\上海202405-测不出\2021年上海x审计报告.pdf'
    └ <function pdf_parse_main at 0x000002179F3A5000>

  File "D:\Lib\Dev\AI\MinerU\MinerU\demo\magic_pdf_parse_main.py", line 112, in pdf_parse_main
    pipe.pipe_analyze()  # parse
    │    └ <function UNIPipe.pipe_analyze at 0x000002179F3A4D30>
    └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002179F398A30>

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\magic_pdf\pipe\UNIPipe.py", line 33, in pipe_analyze
    self.model_list = doc_analyze(self.pdf_bytes, ocr=True,
    │    │            │           │    └ b'%PDF-1.7\n4 0 obj\n<<\n/Type /Page\n/Resources\n<<\n/XObject\n<< /PAGE0001 7 0 R >>\n/ProcSet 6 0 R\n>>\n/MediaBox [ 0 0 59...
    │    │            │           └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002179F398A30>
    │    │            └ <function doc_analyze at 0x00000217E9D88790>
    │    └ []
    └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x000002179F398A30>

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 110, in doc_analyze
    custom_model = model_manager.get_model(ocr, show_log)
                   │             │         │    └ False
                   │             │         └ True
                   │             └ <function ModelSingleton.get_model at 0x00000217E9D88700>
                   └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x000002179F398AC0>

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 63, in get_model
    self._models[key] = custom_model_init(ocr=ocr, show_log=show_log)
    │    │       │      │                     │             └ False
    │    │       │      │                     └ True
    │    │       │      └ <function custom_model_init at 0x00000217E9D885E0>
    │    │       └ (True, False)
    │    └ {}
    └ <magic_pdf.model.doc_analyze_by_custom_model.ModelSingleton object at 0x000002179F398AC0>

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\magic_pdf\model\doc_analyze_by_custom_model.py", line 83, in custom_model_init
    from magic_pdf.model.pdf_extract_kit import CustomPEKModel

  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\magic_pdf\model\pdf_extract_kit.py", line 23, in <module>
    from unimernet.common.config import Config

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\unimernet\__init__.py", line 15, in <module>
    from unimernet.datasets.builders import *

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\unimernet\datasets\builders\__init__.py", line 8, in <module>
    from unimernet.datasets.builders.base_dataset_builder import load_dataset_config

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\unimernet\datasets\builders\base_dataset_builder.py", line 17, in <module>
    from unimernet.processors.base_processor import BaseProcessor

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\unimernet\processors\__init__.py", line 18, in <module>
    from unimernet.processors.formula_processor import (

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\unimernet\processors\formula_processor.py", line 3, in <module>
    import albumentations as alb

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albumentations\__init__.py", line 6, in <module>
    from .augmentations import *

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albumentations\augmentations\__init__.py", line 1, in <module>
    from .blur.functional import *

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albumentations\augmentations\blur\__init__.py", line 1, in <module>
    from .functional import *

  File "D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albumentations\augmentations\blur\functional.py", line 9, in <module>
    from albucore.utils import clipped, maybe_process_in_chunks, preserve_channel_dim
                               │        └ <function maybe_process_in_chunks at 0x00000217BAB024D0>
                               └ <function clipped at 0x00000217BAB025F0>

ImportError: cannot import name 'preserve_channel_dim' from 'albucore.utils' (D:\Lib\Dev\miniconda3\Bin\envs\MinerU\lib\site-packages\albucore\utils.py)
2024-09-22 09:30:32.122 | ERROR | magic_pdf.model.pdf_extract_kit:<module>:29 - Required dependency not installed, please install by "pip install magic-pdf[full] --extra-index-url https://myhloli.github.io/wheels/"
(MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU\demo>
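
The traceback above ends inside albumentations, which tries to import `preserve_channel_dim` from `albucore.utils` and does not find it. With albumentations 1.4.15 and albucore 0.0.17 installed side by side (per the pip log), this looks like a version mismatch between the two packages rather than a missing dependency, although that reading is an inference from the log. The following is a minimal diagnostic sketch using only the Python standard library; the helper names `installed_version` and `has_attribute` are hypothetical and are not part of magic-pdf, albumentations, or albucore.

```python
import importlib
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_attribute(module_name, attr_name):
    """Return True only if module_name imports cleanly and exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing module and an import that fails part-way.
        return False
    return hasattr(module, attr_name)


if __name__ == "__main__":
    # Distributions that appear in the failing import chain of the traceback.
    for dist in ("magic-pdf", "unimernet", "albumentations", "albucore"):
        print(f"{dist}: {installed_version(dist)}")

    # The exact name that the ImportError complains about.
    print(
        "albucore.utils.preserve_channel_dim available:",
        has_attribute("albucore.utils", "preserve_channel_dim"),
    )
```

Running this inside the same conda environment shows which versions are actually installed and whether the name is importable, which is useful context to attach to a report like this one.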

How to reproduce the bug

PS C:\Users\James> conda activate MinerU
(MinerU) PS C:\Users\James> pip install -U magic-pdf[full] --extra-index-url https://wheels.myhloli.com
(MinerU) PS C:\Users\James> cd D:\Lib\Dev\AI\MinerU\MinerU
(MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU> cd .\demo\
(MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU\demo> magic-pdf -v
magic-pdf, version 0.8.1
(MinerU) PS D:\Lib\Dev\AI\MinerU\MinerU\demo> python.exe .\magic_pdf_parse_main.py


Operating system

Windows

Python version

3.10

Software version (magic-pdf --version)

0.8.x

Device mode

cpu

xtc111 commented 2 hours ago

the same bug

myhloli commented 1 hour ago

@tqangxl @xtc111

https://github.com/opendatalab/MinerU/issues/640

xtc111 commented 41 minutes ago

@tqangxl @xtc111

640

thanks