qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0

Eynollah crashes while processing image #77

Closed hasnain095 closed 1 year ago

hasnain095 commented 2 years ago

While processing PDFs, we are seeing crashes that stop the Eynollah process. We traced the issue to a certain type of page that Eynollah cannot process; it simply stops working. An example image is below.

page15

I've tested the issue using the following commands:

eynollah -i /home/ubuntu/layout_detection/eynollah_batch_processing/images/000___2014___1d8517527d3613cc180cf7fefe0dac7e__4189328/page15.jpg -m /home/ubuntu/layout_detection/eynollah_batch_processing/models -o /home/ubuntu/layout_detection/eynollah_batch_processing/page_xml/2014/1d8517527d3613cc180cf7fefe0dac7e__4189328 -light -fl

And also with the -di flag:

eynollah -di /home/ubuntu/layout_detection/eynollah_batch_processing/images/000___2014___1d8517527d3613cc180cf7fefe0dac7e__4189328 -m /home/ubuntu/layout_detection/eynollah_batch_processing/models -o /home/ubuntu/layout_detection/eynollah_batch_processing/page_xml/2014/1d8517527d3613cc180cf7fefe0dac7e__4189328 -light -fl

The stack trace is below:

[Errno 2] No such file or directory: 'identify': 'identify'
09:56:57.674 INFO eynollah - Resizing and enhancing image...
09:56:57.674 INFO eynollah - Detected 230 DPI
09:57:19.008 INFO eynollah - Found 2 columns ([[1.2393231e-11 1.0000000e+00 1.3864657e-18 1.2094623e-19 1.5120450e-16
  1.8504462e-19]])
09:57:19.023 INFO eynollah - Image was enhanced.
09:57:19.047 INFO eynollah - Enhancing took 21.4s 
09:57:26.472 INFO eynollah - Image dimensions: 224x448
09:57:38.725 INFO eynollah - Image dimensions: 448x672
09:57:50.917 INFO eynollah - Image dimensions: 448x672
09:58:12.733 INFO eynollah - slope_deskew: 0.1212121212121211
09:58:22.525 INFO eynollah - detection of marginals took 0.4s
09:58:30.374 INFO eynollah - Image dimensions: 896x896
09:58:35.116 INFO eynollah - Image dimensions: 896x896
09:58:59.675 ERROR eynollah - index 2 is out of bounds for axis 0 with size 2
Traceback (most recent call last):
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 1949, in do_order_of_regions_full_layout
    order_of_texts_tot.append(int(order_by_con_head[tj1]))
IndexError: index 2 is out of bounds for axis 0 with size 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/layout_detection/venv/bin/eynollah", line 33, in <module>
    sys.exit(load_entry_point('eynollah', 'console_scripts', 'eynollah')())
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/cli.py", line 181, in main
    eynollah.run()
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 3119, in run
    order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent, contours_only_text_parent_h, boxes, textline_mask_tot)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 2339, in do_order_of_regions
    return self.do_order_of_regions_full_layout(*args, **kwargs)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 2021, in do_order_of_regions_full_layout
    order_of_texts_tot.append(int(order_by_con_head[tj1]))
IndexError: index 2 is out of bounds for axis 0 with size 2
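
For readers unfamiliar with the message: this is the error NumPy raises when an index is exactly one past the end of an array. The sketch below only reproduces the message. It does not show why eynollah computes such an index, and the array contents and the `lookup` helper are hypothetical stand-ins, not eynollah code.

```python
import numpy as np


def lookup(order, position):
    """Return order[position] as an int, or None if position is out of range.

    Hypothetical guard for illustration only; it is not part of eynollah.
    """
    if 0 <= position < len(order):
        return int(order[position])
    return None


# Hypothetical stand-in for the array in the traceback: two entries,
# so the only valid indices are 0 and 1.
order_by_con_head = np.array([0, 1])

try:
    int(order_by_con_head[2])  # one past the end, as in the traceback
except IndexError as exc:
    message = str(exc)
    print(message)

# The guarded lookup returns None instead of raising.
print(lookup(order_by_con_head, 2))
```

A guard like this would only hide the symptom. The mismatch between the number of regions and the computed reading order would still need to be fixed where it originates.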
hasnain095 commented 2 years ago

Another example

page9

Command run was:

eynollah -di /home/ubuntu/layout_detection/eynollah_batch_processing/images -m /home/ubuntu/layout_detection/eynollah_batch_processing/models -o /home/ubuntu/layout_detection/eynollah_batch_processing/page_xml -light -fl

The stack trace is below:

10:55:55.039 INFO eynollah - Resizing and enhancing image...
10:55:55.039 INFO eynollah - Detected 230 DPI
10:55:55.756 INFO eynollah - Found 2 columns ([[3.3476248e-02 9.6652371e-01 2.6331687e-11 3.3226003e-10 2.7837012e-11
  6.8898527e-09]])
10:55:55.769 INFO eynollah - Image was enhanced.
10:55:55.782 INFO eynollah - Enhancing took 0.8s 
10:55:56.022 INFO eynollah - Image dimensions: 224x448
10:55:59.063 INFO eynollah - Image dimensions: 448x672
10:56:01.909 INFO eynollah - Image dimensions: 448x672
10:56:22.325 INFO eynollah - slope_deskew: 0.1212121212121211
10:56:23.540 INFO eynollah - detection of marginals took 0.2s
10:56:24.176 INFO eynollah - Image dimensions: 896x896
10:56:26.183 INFO eynollah - Image dimensions: 896x896
10:56:39.483 ERROR eynollah - index 4 is out of bounds for axis 0 with size 4
Traceback (most recent call last):
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 1939, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/utils/xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 4 is out of bounds for axis 0 with size 4

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/layout_detection/venv/bin/eynollah", line 33, in <module>
    sys.exit(load_entry_point('eynollah', 'console_scripts', 'eynollah')())
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/layout_detection/venv/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/cli.py", line 190, in main
    eynollah.run()
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 3162, in run
    order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent, contours_only_text_parent_h, boxes, textline_mask_tot)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 2354, in do_order_of_regions
    return self.do_order_of_regions_full_layout(*args, **kwargs)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/eynollah.py", line 2011, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "/home/ubuntu/layout_detection/eynollah/qurator/eynollah/utils/xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 4 is out of bounds for axis 0 with size 4
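
Until the bug is fixed, one possible way to keep a batch going is to call eynollah once per image instead of using -di, so that a page that crashes does not stop the remaining pages. This is a minimal sketch, not a tested workflow. It reuses the flags from the commands above (-i, -m, -o, -light, -fl) and assumes that the eynollah process exits with a non-zero status when it crashes, as an uncaught Python exception normally causes. The function names and the `runner` parameter are hypothetical.

```python
import subprocess
from pathlib import Path


def build_command(image, models, out_dir):
    """Build the per-image eynollah call, using the flags from this issue."""
    return [
        "eynollah",
        "-i", str(image),
        "-m", str(models),
        "-o", str(out_dir),
        "-light",
        "-fl",
    ]


def run_batch(images, models, out_dir, runner=subprocess.run):
    """Run eynollah once per image and return the images that failed.

    `runner` is injectable so the loop can be exercised without eynollah
    installed; by default it is subprocess.run.
    """
    failed = []
    for image in images:
        result = runner(build_command(image, models, out_dir))
        # Assumption: a crash shows up as a non-zero exit status.
        if result.returncode != 0:
            failed.append(Path(image))
    return failed


# Example usage (paths are placeholders):
# images = sorted(Path("images").glob("*.jpg"))
# print(run_batch(images, "models", "page_xml"))
```

The trade-off is speed: each call starts a new process and loads the models again, so this is slower than a single -di run.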
sjscotti commented 2 years ago

I am getting a similar error from eynollah (main branch, not eynollah_light) for a single image processed in a call from an OCR-D workflow. I am running on Windows, and I had no problems processing ~1000 images before hitting one that gave this error. Here is the error...

(NOTE: I have omitted most of the output from earlier steps and show only the error and the step just before it.)
18:34:48.747 INFO eynollah - textline detection took 25.1s
18:37:00.806 INFO eynollah - slope_deskew: 0.1212121212121211
18:37:00.806 INFO eynollah - deskewing took 132.1s
18:37:00.951 INFO eynollah - detection of marginals took 0.1s
1/1 [==============================] - 2s 2s/step
[... repeated progress-bar lines omitted ...]
1/1 [==============================] - 1s 781ms/step
18:47:48.584 ERROR eynollah - index 2 is out of bounds for axis 0 with size 2
Traceback (most recent call last):
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1423, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\utils\xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 2 is out of bounds for axis 0 with size 2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Steve\anaconda3\envs\qurator\Scripts\ocrd-eynollah-segment.exe\__main__.py", line 7, in <module>
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\ocrd\decorators\__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\ocrd\processor\helpers.py", line 88, in run_processor
    processor.process()
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\processor.py", line 58, in process
    Eynollah(**eynollah_kwargs).run()
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 2551, in run
    order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent, contours_only_text_parent_h, boxes, textline_mask_tot)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1838, in do_order_of_regions
    return self.do_order_of_regions_full_layout(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1495, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\utils\xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 2 is out of bounds for axis 0 with size 2

And attached is the image that triggers this error. Any suggestions?

taqeotjqpmaelewuifmuemlkevxfdghg_wma-gateway008_1642260496363

sjscotti commented 2 years ago

Hi again! I found I can avoid this error if I set the OCR-D parameter dpi to 360, which I estimate is roughly the true resolution of this image. With this setting, the processing output reports that the image was not enhanced...

(qurator) D:\qurator>ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-IMG-CROP-SEG -P models eynollah/models_eynollah -P dpi 360
08:35:26.129 INFO eynollah - INPUT FILE P_01042 (1/1)

08:35:26.633 INFO eynollah - Resizing and enhancing image...
08:35:26.633 INFO eynollah - Detected 360 DPI
1/1 [==============================] - 4s 4s/step
1/1 [==============================] - 1s 671ms/step
08:35:34.244 INFO eynollah - Found 3 columns ([[2.3108149e-04 3.8879987e-17 9.9976033e-01 7.3455469e-10 8.5315305e-06
  1.7594233e-11]])
08:35:34.253 INFO eynollah - Image was not enhanced.
08:35:34.259 INFO eynollah - Enhancing took 7.6s

whereas before, without the parameter, it did enhance the image...

(qurator) D:\qurator>ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-IMG-CROP-SEG -P models eynollah/models_eynollah
18:32:13.295 INFO eynollah - INPUT FILE P_01042 (1/1)

18:32:13.588 INFO eynollah - Resizing and enhancing image...
18:32:13.588 INFO eynollah - Detected 230 DPI
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 1s 686ms/step
18:32:21.034 INFO eynollah - Found 3 columns ([[2.3108149e-04 3.8879987e-17 9.9976033e-01 7.3455469e-10 8.5315305e-06
  1.7594233e-11]])
1/1 [==============================] - 1s 887ms/step
[... repeated progress-bar lines omitted ...]
18:32:40.309 INFO eynollah - Image was enhanced.
18:32:40.519 INFO eynollah - Enhancing took 26.9s

Is this helpful in tracing the bug for this issue? What exactly does image enhancement do and when is it needed in eynollah?

Thanks!
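
In case it helps others choose a value for -P dpi: one rough way to estimate it is to divide the pixel width of the scan by the physical width of the original page in inches. This is only a sketch under the assumption that the physical width is known. The helper name and the example numbers are hypothetical and are not taken from the image above.

```python
def estimate_dpi(pixels, inches):
    """Estimate scan resolution as pixels per inch along one dimension."""
    if inches <= 0:
        raise ValueError("physical size must be positive")
    return pixels / inches


# Hypothetical example: a scan 3600 px wide of a page 10 inches wide.
print(estimate_dpi(3600, 10))
```

The result can then be passed as -P dpi in the OCR-D call, as in the command above.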

sjscotti commented 2 years ago

Hi again, I have a follow-up to my last comment. I have another image, a greatly magnified small section of a newspaper (1 column wide, but with dimensions of 3024 x 1564). I used the same OCR-D call, including the -P dpi 360 parameter (even though I think the actual dpi is over 1300), and got the following error, which is similar to the ones mentioned earlier in this issue...

(qurator) D:\qurator>ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-IMG-CROP-SEG -P models eynollah/models_eynollah -P dpi 360
12:37:16.560 INFO eynollah - INPUT FILE P_01370 (1/31)

12:37:16.923 INFO eynollah - Resizing and enhancing image...
12:37:16.923 INFO eynollah - Detected 360 DPI
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 1s 679ms/step
12:37:24.386 INFO eynollah - Found 1 columns ([[1.0000000e+00 0.0000000e+00 1.8512140e-36 4.6862523e-35 0.0000000e+00
  0.0000000e+00]])
12:37:24.386 INFO eynollah - Image was not enhanced.
12:37:24.402 INFO eynollah - Enhancing took 7.5s
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 38ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 34ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 1s 795ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
12:37:39.402 INFO eynollah - ratio_of_two_models: 98.16809348118764
12:37:39.665 INFO eynollah - Textregion detection took 15.3s
1/1 [==============================] - 1s 781ms/step
12:37:42.169 INFO eynollah - Graphics detection took 2.5s
1/1 [==============================] - 1s 813ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 22ms/step
12:37:47.077 INFO eynollah - textline detection took 4.9s
12:37:52.340 INFO eynollah - slope_deskew: -0.22727272727272663
12:37:52.340 INFO eynollah - deskewing took 5.3s
12:37:52.478 INFO eynollah - detection of marginals took 0.1s
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 16ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 31ms/step
1/1 [==============================] - 0s 22ms/step
1/1 [==============================] - 1s 802ms/step
12:40:13.832 ERROR eynollah - index 0 is out of bounds for axis 0 with size 0
Traceback (most recent call last):
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1423, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\utils\xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 0 is out of bounds for axis 0 with size 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Steve\anaconda3\envs\qurator\Scripts\ocrd-eynollah-segment.exe\__main__.py", line 7, in <module>
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\ocrd\decorators\__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\ocrd\processor\helpers.py", line 88, in run_processor
    processor.process()
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\processor.py", line 58, in process
    Eynollah(**eynollah_kwargs).run()
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 2553, in run
    order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent_d_ordered, contours_only_text_parent_h_d_ordered, boxes_d, textline_mask_tot_d)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1838, in do_order_of_regions
    return self.do_order_of_regions_full_layout(*args, **kwargs)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\eynollah.py", line 1495, in do_order_of_regions_full_layout
    order_of_texts, id_of_texts = order_and_id_of_texts(con_inter_box, con_inter_box_h, matrix_of_orders, indexes_sorted, index_by_kind_sorted, kind_of_texts_sorted, ref_point)
  File "C:\Users\Steve\anaconda3\envs\qurator\lib\site-packages\qurator\eynollah\utils\xml.py", line 79, in order_and_id_of_texts
    interest = indexes_sorted_1[indexes_sorted_1 == index_of_types_1[idx_textregion]]
IndexError: index 5 is out of bounds for axis 0 with size 5
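
For readers puzzling over why two tracebacks appear: "During handling of the above exception, another exception occurred" is what Python prints when a second exception is raised while the first is still being handled. Here both frames are in do_order_of_regions_full_layout (lines 1423 and 1495), which suggests a fallback path that fails as well. The sketch below is not eynollah's code; order_regions, order_with_fallback and order_with_guard are hypothetical names used only to reproduce the pattern with plain lists.

```python
def order_regions(indexes, wanted):
    """Hypothetical stand-in for an ordering helper: look up each wanted position."""
    return [indexes[i] for i in wanted]


def order_with_fallback(indexes, wanted):
    """Mimic a try/except whose fallback indexes past the end too."""
    try:
        return order_regions(indexes, wanted)
    except IndexError:
        # The fallback performs a similar lookup. If it also fails, Python
        # reports both errors, chained, as in the log above.
        return order_regions(indexes, wanted)


def order_with_guard(indexes, wanted):
    """Hypothetical alternative: skip positions that do not exist."""
    return [indexes[i] for i in wanted if 0 <= i < len(indexes)]


try:
    order_with_fallback([10, 11, 12, 13, 14], [0, 5])
except IndexError as err:
    # __context__ holds the exception that was being handled when err was raised.
    chained = err.__context__
    print(type(chained).__name__, "->", type(err).__name__)
```

Whether skipping missing positions is the right fix for eynollah is a separate question; the guard only shows where a bounds check would sit.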

So I tried adding an additional scaling flag as shown below, and that seemed to eliminate the error...

(qurator) D:\qurator>ocrd-eynollah-segment -I OCR-D-IMG -O OCR-D-IMG-CROP-SEG -P models eynollah/models_eynollah -P dpi 360  -P allow_scaling true
12:41:18.540 INFO eynollah - INPUT FILE P_01370 (1/31)

12:41:18.900 INFO eynollah - Resizing and enhancing image...
12:41:18.900 INFO eynollah - Detected 360 DPI
1/1 [==============================] - 3s 3s/step
1/1 [==============================] - 1s 678ms/step
12:41:26.502 INFO eynollah - Found 1 columns ([[1.0000000e+00 0.0000000e+00 1.8512140e-36 4.6862523e-35 0.0000000e+00
  0.0000000e+00]])
12:41:26.510 INFO eynollah - Image was not enhanced.
1/1 [==============================] - 1s 789ms/step
1/1 [==============================] - 1s 685ms/step
12:41:30.906 INFO eynollah - Found 1 columns ([[1.0000000e+00 0.0000000e+00 1.8512140e-36 4.6862523e-35 0.0000000e+00
  0.0000000e+00]])
12:41:31.010 INFO eynollah - Enhancing took 12.1s
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 28ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 1s 789ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 18ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
12:41:40.866 INFO eynollah - ratio_of_two_models: 97.8214677408546
12:41:40.993 INFO eynollah - Textregion detection took 10.0s
1/1 [==============================] - 1s 781ms/step
12:41:43.449 INFO eynollah - Graphics detection took 2.5s
1/1 [==============================] - 1s 814ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 20ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
12:41:46.847 INFO eynollah - textline detection took 3.4s
12:41:49.459 INFO eynollah - slope_deskew: -0.22727272727272663
12:41:49.459 INFO eynollah - deskewing took 2.6s
12:41:49.519 INFO eynollah - detection of marginals took 0.1s
1/1 [==============================] - 2s 2s/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 0s 24ms/step
1/1 [==============================] - 1s 815ms/step
12:44:08.610 INFO eynollah - Job done in 169.7s

I think the actual dpi of the images I am processing will vary by an order of magnitude, so a single value of 360 won't be applicable. But allow_scaling seems to stop the run from crashing. I am just trying different options to find a workaround, but it would be helpful to understand what is causing the error, and whether the dpi and allow_scaling options will prevent these kinds of errors in the future. Thanks!
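
As a stopgap while the cause is unknown, the workaround could be scripted: run the segmentation once, and only if it fails, run it again with allow_scaling enabled. This is a minimal sketch, not a fix. It uses only the flags shown in the two commands above; build_segment_command and segment_with_retry are hypothetical helper names, and it assumes the crash surfaces as a non-zero exit code.

```python
import subprocess


def build_segment_command(input_grp, output_grp, models, dpi=None, allow_scaling=False):
    """Assemble the ocrd-eynollah-segment call used in this comment."""
    cmd = ["ocrd-eynollah-segment", "-I", input_grp, "-O", output_grp,
           "-P", "models", models]
    if dpi is not None:
        cmd += ["-P", "dpi", str(dpi)]
    if allow_scaling:
        cmd += ["-P", "allow_scaling", "true"]
    return cmd


def segment_with_retry(input_grp, output_grp, models, dpi=None):
    """Run once; on failure, rerun with allow_scaling (assumed workaround)."""
    first = build_segment_command(input_grp, output_grp, models, dpi)
    if subprocess.run(first).returncode == 0:
        return first
    retry = build_segment_command(input_grp, output_grp, models, dpi,
                                  allow_scaling=True)
    # check=True raises CalledProcessError if the retry fails as well.
    subprocess.run(retry, check=True)
    return retry
```

For example, segment_with_retry("OCR-D-IMG", "OCR-D-IMG-CROP-SEG", "eynollah/models_eynollah", dpi=360) would issue the same two commands as above. A rerun into an output group that already holds partial results may need cleanup first.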

vahidrezanezhad commented 2 years ago

Dear @sjscotti, thank you so much for your effort to find the source of the problem :) I am currently working on the issue and will try to resolve it as soon as possible.

jbarth-ubhd commented 2 years ago

Similar problem:

02:33:45.972 ERROR eynollah - index 10 is out of bounds for axis 0 with size 10
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/eynollah.py", line 1445, in do_order_of_regions_full_layout
    order_of_texts_tot.append(int(order_by_con_main[tj1]))
IndexError: index 10 is out of bounds for axis 0 with size 10

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/ocrd-eynollah-segment", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "/build/core/ocrd/ocrd/decorators/__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/build/core/ocrd/ocrd/processor/helpers.py", line 88, in run_processor
    processor.process()
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/processor.py", line 58, in process
    Eynollah(**eynollah_kwargs).run()
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/eynollah.py", line 2553, in run
    order_text_new, id_of_texts_tot = self.do_order_of_regions(contours_only_text_parent_d_ordered, contours_only_text_parent_h_d_ordered, boxes_d, textline_mask_tot_d)
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/eynollah.py", line 1838, in do_order_of_regions
    return self.do_order_of_regions_full_layout(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/qurator/eynollah/eynollah.py", line 1517, in do_order_of_regions_full_layout
    order_of_texts_tot.append(int(order_by_con_main[tj1]))
IndexError: index 10 is out of bounds for axis 0 with size 10

Image original: https://digi.ub.uni-heidelberg.de/diglitData/v/boisseree1821bd1_-_000_Tafel_038cf.tif

(preview): grafik

Workflow:

ocrd workspace init  
ocrd workspace add -g P_00001 -G OCR-D-IMG -i OCR-D-IMG_00001 -m image/tiff OCR-D-IMG/00001.tif  
ocrd-olena-binarize -P k 0.10 -I OCR-D-IMG -O OCR-D-001  
ocrd-anybaseocr-crop -I OCR-D-001 -O OCR-D-002  
ocrd-olena-binarize -I OCR-D-002 -O OCR-D-003  
ocrd-cis-ocropy-deskew -P level-of-operation page -I OCR-D-003 -O OCR-D-004  
ocrd-eynollah-segment -I OCR-D-004 -O OCR-D-005 -P models $HOME/ocrd_models/eynollah/models_eynollah_renamed  
ocrd-calamari-recognize -I OCR-D-005 -O OCR-D-OCR -P checkpoint_dir "$HOME/ocrd_models/calamari/calamari_models/fraktur_historical_ligs"  

vahidrezanezhad commented 2 years ago

I have no access to the original image, so I couldn't reproduce the error. Could you share the original image? Thank you.

sjscotti commented 2 years ago

Would it also be helpful if I provided the original image for the run that I fixed by adding the allow_scaling option?

jbarth-ubhd commented 2 years ago

Had wrong file permissions... should work now.