Description of the bug
2024-11-19 11:00:29.883 | ERROR | magic_pdf.user_api:parse_pdf:97 - The expanded size of the tensor (567) must match the existing size (514) at non-singleton dimension 1. Target sizes: [1, 567]. Tensor sizes: [1, 514]
Traceback (most recent call last):
File "/root/MinerU/run.py", line 193, in
pdf_parse_main(pdf_path)
│ └ '/root/PDF/error/02B20231201C_l.pdf'
└ <function pdf_parse_main at 0x7f55d36d48b0>
File "/root/MinerU/run.py", line 137, in pdf_parse_main
pipe.pipe_parse()
│ └ <function UNIPipe.pipe_parse at 0x7f57681540d0>
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7f563e17a320>
File "/root/MinerU/magic_pdf/pipe/UNIPipe.py", line 44, in pipe_parse
self.pdf_mid_data = parse_union_pdf(self.pdf_bytes, self.model_list, self.image_writer,
│ │ │ │ │ │ │ │ └ <magic_pdf.rw.DiskReaderWriter.DiskReaderWriter object at 0x7f584a677970>
│ │ │ │ │ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7f563e17a320>
│ │ │ │ │ │ └ [{'layout_dets': [{'category_id': 1, 'poly': [22.12286376953125, 2711.37548828125, 429.85791015625, 2711.37548828125, 429.857...
│ │ │ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7f563e17a320>
│ │ │ │ └ b'%PDF-1.4\r%\xe2\xe3\xcf\xd3\r\n1 0 obj\r\n<<\r\n/ModDate (D:20231201024525+08\'00\')\r\n/CreationDate (D:20231201024525+08...
│ │ │ └ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7f563e17a320>
│ │ └ <function parse_union_pdf at 0x7f576813be20>
│ └ None
└ <magic_pdf.pipe.UNIPipe.UNIPipe object at 0x7f563e17a320>
File "/root/MinerU/magic_pdf/user_api.py", line 100, in parse_union_pdf
pdf_info_dict = parse_pdf(parse_pdf_by_txt)
│ └ <function parse_pdf_by_txt at 0x7f576813bd00>
└ <function parse_union_pdf..parse_pdf at 0x7f55c4722dd0>
File "/root/MinerU/magic_pdf/pdf_parse_by_txt.py", line 15, in parse_pdf_by_txt
return pdf_parse_union(dataset,
│ └ <magic_pdf.data.dataset.PymuDocDataset object at 0x7f55bd186170>
└ <function pdf_parse_union at 0x7f576813bc70>
File "/root/MinerU/magic_pdf/pdf_parse_union_core_v2.py", line 617, in pdf_parse_union
page_info = parse_page_core(
└ <function parse_page_core at 0x7f576813bbe0>
File "/root/MinerU/magic_pdf/pdf_parse_union_core_v2.py", line 542, in parse_page_core
sorted_bboxes = sort_lines_by_model(fix_blocks, page_w, page_h, line_height)
│ │ │ │ └ 9
│ │ │ └ 1433.249755859375
│ │ └ 1026.0
│ └ [{'type': 'text', 'bbox': [7, 976, 154, 1111], 'lines': [{'bbox': [27.08985710144043, 976.544677734375, 152.90780639648438, 9...
└ <function sort_lines_by_model at 0x7f576813b880>
File "/root/MinerU/magic_pdf/pdf_parse_union_core_v2.py", line 305, in sort_lines_by_model
orders = do_predict(boxes, model)
│ │ └ LayoutLMv3ForTokenClassification(
│ │ (layoutlmv3): LayoutLMv3Model(
│ │ (embeddings): LayoutLMv3TextEmbeddings(
│ │ (word_em...
│ └ [[26, 681, 149, 688], [9, 689, 149, 696], [9, 697, 149, 704], [9, 705, 149, 712], [9, 713, 149, 719], [9, 721, 149, 727], [9,...
└ <function do_predict at 0x7f576813b5b0>
File "/root/MinerU/magic_pdf/pdf_parse_union_core_v2.py", line 172, in do_predict
logits = model(**inputs).logits.cpu().squeeze(0)
│ └
└ LayoutLMv3ForTokenClassification(
(layoutlmv3): LayoutLMv3Model(
(embeddings): LayoutLMv3TextEmbeddings(
(word_em...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
│ │ │ └
│ │ └ ()
│ └ <function Module._call_impl at 0x7f576e03ac20>
└ LayoutLMv3ForTokenClassification(
(layoutlmv3): LayoutLMv3Model(
(embeddings): LayoutLMv3TextEmbeddings(
(word_em...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
│ │ └
│ └ ()
└ <bound method LayoutLMv3ForTokenClassification.forward of LayoutLMv3ForTokenClassification(
(layoutlmv3): LayoutLMv3Model(
...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 1099, in forward
outputs = self.layoutlmv3(
└ LayoutLMv3ForTokenClassification(
(layoutlmv3): LayoutLMv3Model(
(embeddings): LayoutLMv3TextEmbeddings(
(word_em...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
│ │ │ └
│ │ └
│ └ <function Module._call_impl at 0x7f576e03ac20>
└ LayoutLMv3Model(
(embeddings): LayoutLMv3TextEmbeddings(
(word_embeddings): Embedding(50265, 1024, padding_idx=1)
(...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
│ │ └
│ └
└ <bound method LayoutLMv3Model.forward of LayoutLMv3Model(
(embeddings): LayoutLMv3TextEmbeddings(
(word_embeddings): Em...
File "/root/anaconda3/envs/mineru/lib/python3.10/site-packages/transformers/models/layoutlmv3/modeling_layoutlmv3.py", line 961, in forward
position_ids = position_ids.expand_as(input_ids)
│ │ └
│ └ <method 'expand_as' of 'torch._C.TensorBase' objects>
└
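The full error is shown above. Not every PDF triggers it, and I can't yet say for certain where the problem lies. If you have time, please take a look.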
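One possible reading of the error message, offered as an assumption rather than a confirmed diagnosis: the model seems to hold a position-ID buffer of length 514, and `expand_as` cannot stretch it to the 567-token input built for this page. That would also fit the fact that only some PDFs fail, if the token count follows the number of text lines on a page. The sketch below only replays the shape arithmetic in plain Python. `can_expand` and `sliced_position_shape` are hypothetical helper names, and the slicing step before `expand_as` is assumed, not taken from the traceback.

```python
def can_expand(src_shape, target_shape):
    """Check the expand rule: aligned from the right, each source dim
    must equal the target dim or be 1 (a singleton)."""
    if len(src_shape) > len(target_shape):
        return False
    # Align trailing dimensions; extra leading target dims are always allowed.
    offset = len(target_shape) - len(src_shape)
    return all(s == t or s == 1
               for s, t in zip(src_shape, target_shape[offset:]))


def sliced_position_shape(buffer_len, seq_len):
    """Shape of position_ids[:, :seq_len] for a [1, buffer_len] buffer.
    Assumption: the buffer is sliced before expand_as, and slicing past
    the end silently clamps to the buffer length."""
    return (1, min(buffer_len, seq_len))


# Sizes copied from the error message in this report.
BUFFER_LEN = 514   # "Tensor sizes: [1, 514]"
SEQ_LEN = 567      # "Target sizes: [1, 567]"

# 567 tokens: the clamped [1, 514] buffer cannot expand to [1, 567].
failing = can_expand(sliced_position_shape(BUFFER_LEN, SEQ_LEN), (1, SEQ_LEN))
# 400 tokens (an arbitrary shorter page): the [1, 400] slice matches.
passing = can_expand(sliced_position_shape(BUFFER_LEN, 400), (1, 400))
print(failing, passing)
```

If this reading holds, pages whose token count stays within the buffer length would parse normally, and a guard on the number of boxes passed to `do_predict` might avoid the crash. That is a guess and has not been tested against MinerU.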
How to reproduce the bug
To reproduce, simply run the magic_pdf_parse_main file.
Operating system
Linux
Python version
3.10
Software version (magic-pdf --version)
0.9.x
Device mode
cuda