qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0

Expected Ptr<cv::UMat> for argument 'contour' #110

Open jbarth-ubhd opened 1 year ago

jbarth-ubhd commented 1 year ago

I built a new Singularity image from Docker ocrd/all:maximum today, at approx. 14:00 CEST.

Using this image (with OCR-D): https://digi.ub.uni-heidelberg.de/diglitData/v/duerer1527--aa--deskew.png

and this command: ocrd-eynollah-segment -P models default -I OCR-D-004 -O OCR-D-005 # ...004 contains the PNG above

I get this error message:

15:05:05.439 ERROR ocrd.processor.helpers.run_processor - Failure in processor 'ocrd-eynollah-segment'
Traceback (most recent call last):
  File "/build/core/ocrd/ocrd/processor/helpers.py", line 128, in run_processor
    processor.process()
  File "/build/eynollah/qurator/eynollah/processor.py", line 59, in process
    Eynollah(**eynollah_kwargs).run()
  File "/build/eynollah/qurator/eynollah/eynollah.py", line 2277, in run
    cx_bigest, cy_biggest, _, _, _, _, _ = find_new_features_of_contours(contours_only_text_parent)
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in find_new_features_of_contours
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in <listcomp>
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
TypeError: Expected Ptr<cv::UMat> for argument 'contour'

cneud commented 1 year ago

Thanks for reporting! Possibly fixed now by @bertsky's https://github.com/qurator-spk/eynollah/pull/109 - we will make a new release asap.

jbarth-ubhd commented 1 year ago

I built a new ocrd.sif from the current ocrd/all:maximum; still the same error:

[hd_xxxxx@o05i14 aa]$ ls -l ~/ocrd.sif
lrwxrwxrwx 1 hd_xxxxx hd_hd 19 Jun 26 16:40 /home/hd/hd_hd/hd_xxxxx/ocrd.sif -> ocrd-2023-06-21.sif
[hd_xxxxx@o05i14 aa]$ ls -lrt ~/*.sif
-rwxr-xr-x 1 hd_xxxxx hd_hd 8235646976 Jun 26 11:56 /home/hd/hd_hd/hd_xxxxx/ocrd-2023-06-21.sif
lrwxrwxrwx 1 hd_xxxxx hd_hd         19 Jun 26 16:40 /home/hd/hd_hd/hd_xxxxx/ocrd.sif -> ocrd-2023-06-21.sif

16:49:46.885 INFO eynollah - detection of marginals took 0.7s
16:50:56.153 ERROR ocrd.processor.helpers.run_processor - Failure in processor 'ocrd-eynollah-segment'
Traceback (most recent call last):
  File "/build/core/ocrd/ocrd/processor/helpers.py", line 128, in run_processor
    processor.process()
  File "/build/eynollah/qurator/eynollah/processor.py", line 59, in process
    Eynollah(**eynollah_kwargs).run()
  File "/build/eynollah/qurator/eynollah/eynollah.py", line 2277, in run
    cx_bigest, cy_biggest, _, _, _, _, _ = find_new_features_of_contours(contours_only_text_parent)
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in find_new_features_of_contours
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in <listcomp>
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
TypeError: Expected Ptr<cv::UMat> for argument 'contour'
Traceback (most recent call last):
  File "/usr/local/bin/ocrd-eynollah-segment", line 33, in <module>
    sys.exit(load_entry_point('eynollah', 'console_scripts', 'ocrd-eynollah-segment')())
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/build/eynollah/qurator/eynollah/ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "/build/core/ocrd/ocrd/decorators/__init__.py", line 151, in ocrd_cli_wrap_processor
    run_processor(processorClass, mets_url=mets, workspace=workspace, **kwargs)
  File "/build/core/ocrd/ocrd/processor/helpers.py", line 131, in run_processor
    raise err
  File "/build/core/ocrd/ocrd/processor/helpers.py", line 128, in run_processor
    processor.process()
  File "/build/eynollah/qurator/eynollah/processor.py", line 59, in process
    Eynollah(**eynollah_kwargs).run()
  File "/build/eynollah/qurator/eynollah/eynollah.py", line 2277, in run
    cx_bigest, cy_biggest, _, _, _, _, _ = find_new_features_of_contours(contours_only_text_parent)
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in find_new_features_of_contours
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
  File "/build/eynollah/qurator/eynollah/utils/contour.py", line 80, in <listcomp>
    areas_main = np.array([cv2.contourArea(contours_main[j]) for j in range(len(contours_main))])
TypeError: Expected Ptr<cv::UMat> for argument 'contour'
Command exited with non-zero status 1
2464.57user 286.79system 7:56.56elapsed 577%CPU (0avgtext+0avgdata 9564560maxresident)k
339946inputs+0outputs (2999major+8036940minor)pagefaults 0swaps

cneud commented 11 months ago

I can confirm this is already fixed in main, but other changes there broke the OCR-D CLI and are currently blocking integration of the fix into OCR-D. We will try to resolve these and include the fix in a new release asap.

$ eynollah -i ./duerer1527--aa--deskew.png -o . -m ~/models/models_eynollah
23:35:14.667 INFO eynollah - Resizing and enhancing image...
23:35:14.667 INFO eynollah - Detected 230 DPI
23:35:31.629 INFO eynollah - Found 1 columns ([[1. 0. 0. 0. 0. 0.]])
23:35:53.642 INFO eynollah - Image was enhanced.
23:35:53.685 INFO eynollah - Enhancing took 39.0s
23:37:38.153 INFO eynollah - ratio_of_two_models: 99.9265450462504
23:37:38.398 INFO eynollah - Textregion detection took 104.7s
23:37:39.060 INFO eynollah - Graphics detection took 0.7s
23:38:07.196 INFO eynollah - textline detection took 28.1s
23:38:11.427 INFO eynollah - slope_deskew: -0.12°
23:38:11.427 INFO eynollah - deskewing took 4.2s
23:38:11.561 INFO eynollah - detection of marginals took 0.1s
23:38:12.081 INFO eynollah - num_col_classifier: 1
23:38:12.848 INFO eynollah - detecting boxes took 0.8s
23:38:15.595 INFO eynollah - Job done in 180.9s
23:38:15.595 INFO eynollah.writer - output filename: './duerer1527--aa--deskew.xml'

duerer1527--aa--deskew.zip

jbarth-ubhd commented 4 months ago

Problem is still there: https://digi.ub.uni-heidelberg.de/diglitData/v/christliche_kunstblaetter1862--40a--eynollah.zip (see run.sh for workflow & ocrd.log for log)

cneud commented 4 months ago

Problem is still there

Thanks, and yes, unfortunately. This is because OCR-D still uses an older version of Eynollah, which does not include the fix provided in https://github.com/qurator-spk/eynollah/pull/109/commits/867a7261de27bec7efc7e7add80f10ae18bc419a.

We must first fix https://github.com/qurator-spk/eynollah/issues/106 before making the next release that is compatible with OCR-D again.
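
For readers who hit the same TypeError elsewhere: OpenCV's Python bindings typically raise "Expected Ptr<cv::UMat> for argument 'contour'" when the value passed to cv2.contourArea cannot be converted to a matrix, for example a plain Python list or an array with an unsupported dtype. The sketch below shows one defensive way to normalize a contour first. It is only an illustration of that general cause, not a description of what the linked pull request changes; the helper name as_cv_contour is hypothetical and not part of Eynollah or OpenCV.

```python
import numpy as np


def as_cv_contour(points):
    """Coerce a sequence of (x, y) points into an (N, 1, 2) int32 array.

    Hypothetical helper (not part of Eynollah or OpenCV). Assumption:
    the input holds numeric 2-D points; float coordinates are truncated
    when cast to int32.
    """
    arr = np.asarray(points)
    # Ragged input shows up as an object array on older NumPy versions
    # (newer versions already raise ValueError inside np.asarray).
    if arr.dtype == object:
        raise ValueError("contour is ragged or contains non-numeric entries")
    if arr.size == 0 or arr.size % 2 != 0:
        raise ValueError("contour must contain a non-empty list of (x, y) pairs")
    # int32 is one of the element types OpenCV contour functions accept.
    return np.ascontiguousarray(arr, dtype=np.int32).reshape(-1, 1, 2)
```

With OpenCV installed, the normalized contours could then be passed on, e.g. `areas = [cv2.contourArea(as_cv_contour(c)) for c in contours]`.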