opendatalab / PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction
https://pdf-extract-kit.readthedocs.io/zh-cn/latest/index.html
GNU Affero General Public License v3.0
5.93k stars 388 forks

Error converting PDF to Markdown #183

Open miludedeng opened 4 days ago

miludedeng commented 4 days ago

python project/pdf2markdown/scripts/run_project.py --config project/pdf2markdown/configs/pdf2markdown.yaml

⚠️ GitHub assets check failure for https://api.github.com/repos/doclayout_yolo/assets/releases/tags/v8.1.0: 404 Not Found
⚠️ GitHub assets check failure for https://api.github.com/repos/doclayout_yolo/assets/releases/latest: 404 Not Found
Traceback (most recent call last):
  File "/home/zrway/llm/PDF-Extract-Kit/project/pdf2markdown/scripts/run_project.py", line 41, in <module>
    main(args.config)
  File "/home/zrway/llm/PDF-Extract-Kit/project/pdf2markdown/scripts/run_project.py", line 21, in main
    task_instances = initialize_tasks_and_models(config)
  File "/home/zrway/llm/PDF-Extract-Kit/project/pdf2markdown/scripts/../../../pdf_extract_kit/utils/config_loader.py", line 42, in initialize_tasks_and_models
    model_instance = ModelClass(model_config)
  File "/home/zrway/llm/PDF-Extract-Kit/project/pdf2markdown/scripts/../../../pdf_extract_kit/tasks/layout_detection/models/yolo.py", line 34, in __init__
    self.model = YOLOv10(config['model_path'])
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/doclayout_yolo/models/yolov10/model.py", line 10, in __init__
    super().__init__(model=model, task=task, verbose=verbose)
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/doclayout_yolo/engine/model.py", line 144, in __init__
    self._load(model, task=task)
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/doclayout_yolo/engine/model.py", line 233, in _load
    self.model, self.ckpt = attempt_load_one_weight(weights)
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/doclayout_yolo/nn/tasks.py", line 807, in attempt_load_one_weight
    ckpt, weight = torch_safe_load(weight)  # load ckpt
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/doclayout_yolo/nn/tasks.py", line 733, in torch_safe_load
    ckpt = torch.load(file, map_location="cpu")
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/ultralytics/utils/patches.py", line 86, in torch_load
    return _torch_load(*args, **kwargs)
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/torch/serialization.py", line 997, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/torch/serialization.py", line 444, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/zrway/miniconda3/envs/pdf-extract-kit-1.0/lib/python3.10/site-packages/torch/serialization.py", line 425, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'models/Layout/YOLO/doclayout_yolo_ft.pt'

wufan-tb commented 3 days ago

Download our layout model and then try again: https://huggingface.co/opendatalab/PDF-Extract-Kit-1.0/blob/main/models/Layout/YOLO/doclayout_yolo_ft.pt
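
For anyone hitting the same FileNotFoundError, here is a minimal sketch of one way to fetch the weights into the relative path the config expects. It assumes the script is run from the repository root and that the `huggingface_hub` package is installed. The helper names `missing_weights` and `fetch_weights` are hypothetical and not part of PDF-Extract-Kit. The repo ID and file path are taken from the link and the error message above.

```python
from pathlib import Path

# Taken from the Hugging Face link and the FileNotFoundError in this thread.
REPO_ID = "opendatalab/PDF-Extract-Kit-1.0"
MODEL_FILE = "models/Layout/YOLO/doclayout_yolo_ft.pt"


def missing_weights(paths, root="."):
    """Return the relative paths that do not exist as files under root.

    Hypothetical helper, not part of PDF-Extract-Kit.
    """
    return [p for p in paths if not (Path(root) / p).is_file()]


def fetch_weights(filename=MODEL_FILE, root="."):
    """Download one file from the model repo, keeping its relative path.

    Hypothetical helper. Assumes `pip install huggingface_hub`; the import
    is local so the check above works without the package installed.
    """
    from huggingface_hub import hf_hub_download

    # With local_dir set, the file is written to <root>/<filename>.
    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=root)
```

Calling `fetch_weights(p)` for each `p` in `missing_weights([MODEL_FILE])` from the repository root should place the file at `models/Layout/YOLO/doclayout_yolo_ft.pt`. Other models referenced in `pdf2markdown.yaml` may need the same treatment, so the list passed to `missing_weights` can be extended with their paths.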