VikParuchuri / marker

Convert PDF to markdown quickly with high accuracy
https://www.datalab.to
GNU General Public License v3.0
13.97k stars, 707 forks

Convert single failed in Finding reading order 0% process #212

Closed: worstkid92 closed this issue 5 days ago

worstkid92 commented 5 days ago

Environment: local GPU, CUDA, Python 3.11
Driver: NVIDIA-SMI 525.147.05, Driver Version: 525.147.05, CUDA Version: 12.0
Config: default, except the RAM setting was changed to 16GB

Traceback:

Loaded texify model to cuda with torch.float16 dtype
Detecting bboxes: 100%|█████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00,  1.80s/it]
Recognizing Text: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.17s/it]
Detecting bboxes: 100%|█████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00,  1.59s/it]
Finding reading order:   0%|                                                                    | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/anaconda3_install/envs/unsloth_env/bin/marker_single", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/convert_single.py", line 28, in main
    full_text, images, out_meta = convert_single_pdf(fname, model_lst, max_pages=args.max_pages, langs=langs, batch_multiplier=args.batch_multiplier, start_page=args.start_page)
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/marker/convert.py", line 113, in convert_single_pdf
    surya_order(doc, pages, order_model, batch_multiplier=batch_multiplier)
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/marker/layout/order.py", line 33, in surya_order
    order_results = batch_ordering(images, bboxes, order_model, processor, batch_size=int(get_batch_size() * batch_multiplier))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/surya/ordering.py", line 68, in batch_ordering
    return_dict = model(
                  ^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/surya/model/ordering/encoderdecoder.py", line 38, in forward
    encoder_outputs = self.encoder(
                      ^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/transformers/models/donut/modeling_donut_swin.py", line 965, in forward
    embedding_output, input_dimensions = self.embeddings(
                                         ^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/anaconda3_install/envs/unsloth_env/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: VariableDonutSwinEmbeddings.forward() got an unexpected keyword argument 'interpolate_pos_encoding'

Can anyone help?

VikParuchuri commented 5 days ago

Fixed by https://github.com/VikParuchuri/marker/pull/213. Will merge soon. The issue is a bug triggered by the new transformers version.
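For readers hitting the same traceback: the TypeError suggests that the caller in the newer transformers release passes an `interpolate_pos_encoding` keyword that the overriding `forward()` was not written to accept. Below is a minimal sketch of that failure mode in plain Python. The classes `Embeddings`, `VariableEmbeddings`, and `TolerantVariableEmbeddings` and the function `encoder_forward` are hypothetical stand-ins, not the real transformers or surya code, and the `**kwargs` variant is only one common way to tolerate such a change, not necessarily what #213 does.

```python
class Embeddings:
    """Hypothetical stand-in for an upstream embedding layer (newer signature)."""

    def forward(self, pixel_values, interpolate_pos_encoding=False):
        return f"embedded {pixel_values}"


class VariableEmbeddings(Embeddings):
    """Hypothetical downstream override written against the older signature."""

    def forward(self, pixel_values):
        return f"variable embedded {pixel_values}"


class TolerantVariableEmbeddings(Embeddings):
    """Same override, but it accepts and ignores keywords it does not know."""

    def forward(self, pixel_values, **kwargs):
        return f"variable embedded {pixel_values}"


def encoder_forward(embeddings, pixel_values):
    # Stand-in for the upstream caller, which now always passes the new keyword.
    return embeddings.forward(pixel_values, interpolate_pos_encoding=False)


def try_call(embeddings):
    """Run the caller and return either the result or the error text."""
    try:
        return encoder_forward(embeddings, "page-0")
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    print(try_call(VariableEmbeddings()))
    print(try_call(TolerantVariableEmbeddings()))
```

The first call fails with the same kind of "unexpected keyword argument" message as in the traceback above, while the second succeeds because the override no longer rejects the extra keyword.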

worstkid92 commented 5 days ago

Detecting bboxes: 100%|█████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.80s/it]

Sorry, but I pulled the latest code and I see that #213 has been merged to master, yet the error still occurs.

VikParuchuri commented 5 days ago

You need to reinstall the dependencies too, since the fix needs the new surya version.
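A quick way to confirm that the reinstall actually took effect is to print the versions installed in the active environment. This is a minimal sketch using only the standard library. The distribution names `marker-pdf`, `surya-ocr`, and `transformers` are assumptions about how the packages are named on the package index, and `installed_versions` is a hypothetical helper; adjust the names if your install differs.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # Not installed in this environment (or installed under another name).
            found[name] = None
    return found


if __name__ == "__main__":
    # Assumed distribution names; not confirmed by this thread.
    for name, ver in installed_versions(
        ["marker-pdf", "surya-ocr", "transformers"]
    ).items():
        print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Run it with the same interpreter that runs `marker_single` (here, the one in the `unsloth_env` environment), since a different environment may report different versions.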