OCR-D / quiver-back-end

The back end of the OCR-D quality dashboard webapp.

The shrink_polygons parameter leads to the following error: #50

Open mweidling opened 1 year ago

mweidling commented 1 year ago

"tesserocr-segment -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG -P shrink_polygons true"

The parameter leads to the following error:
Error executing process > 'ocrd_tesserocr_segment_4'

Caused by:
  Process `ocrd_tesserocr_segment_4` terminated with an error exit status (1)

Command executed:

  docker run --rm -u $(id -u) -v /some/path/quiver-back-end/workflows/workspaces/16_ant_complex_slower_processors_ocr/:/ocrd-workspace -v $HOME/ocrd_models:/usr/local/share/ocrd-resources -w /ocrd-workspace -- ocrd/all:maximum ocrd-tesserocr-segment -I OCR-D-BIN-DENOISE-DESKEW -O OCR-D-SEG -P shrink_polygons true

Command exit status:
  1

Command output:
  (empty)

Command error:
  19:11:43.723 INFO processor.TesserocrSegment - INPUT FILE 0 / phys_0008
  19:11:43.774 INFO processor.TesserocrSegment - Page 'phys_0008' images will use 300 DPI from image meta-data
  19:11:43.774 INFO processor.TesserocrSegment - Processing page 'phys_0008'
  19:11:44.066 INFO ocrd.workspace.save_image_file - created file ID: OCR-D-SEG_0008.IMG-BIN, file_grp: OCR-D-SEG, path: OCR-D-SEG/OCR-D-SEG_0008.IMG-BIN.png
  Traceback (most recent call last):
    File "/usr/local/bin/ocrd-tesserocr-segment", line 8, in <module>
      sys.exit(ocrd_tesserocr_segment())
    File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1128, in __call__
      return self.main(*args, **kwargs)
    File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1053, in main
      rv = self.invoke(ctx)
    File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1395, in invoke
      return ctx.invoke(self.callback, **ctx.params)
    File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 754, in invoke
      return __callback(*args, **kwargs)
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/cli.py", line 18, in ocrd_tesserocr_segment
      return ocrd_cli_wrap_processor(TesserocrSegment, *args, **kwargs)
    File "/build/core/ocrd/ocrd/decorators/__init__.py", line 117, in ocrd_cli_wrap_processor
      run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
    File "/build/core/ocrd/ocrd/processor/helpers.py", line 107, in run_processor
      processor.process()
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/segment.py", line 69, in process
      return self.recognizer.process()
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/recognize.py", line 445, in process
      self._process_regions_in_page(tessapi.GetIterator(), page, page_coords, pcgts_mapping, dpi)
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/recognize.py", line 525, in _process_regions_in_page
      for symbol in iterate_level(it, RIL.SYMBOL, parent=RIL.BLOCK)])
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/recognize.py", line 1456, in join_polygons
      for poly in polygons]))
    File "/usr/local/lib/python3.6/site-packages/ocrd_tesserocr/recognize.py", line 1456, in <listcomp>
      for poly in polygons]))
  AttributeError: 'list' object has no attribute 'type'

I'm using the ocrd/all:maximum image that was created last week (ID 8784a41dc959); `ocrd-tesserocr-segment --version` reports Version 0.16.0, ocrd/core 2.44.0. Smuggling it in here was kind of a quick fix, admittedly.
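
For anyone reading the traceback: the last frames suggest that the list comprehension in `join_polygons` reads a `type` attribute from every element of `polygons`, and that at least one element was a plain list rather than a geometry object. The sketch below only illustrates that failure mode. `FakePolygon`, `collect_types` and `collect_types_checked` are hypothetical stand-ins, not code from ocrd_tesserocr or Shapely, and the nested list is an assumed trigger, not a confirmed cause.

```python
class FakePolygon:
    """Hypothetical stand-in for a geometry object that exposes `.type`."""

    def __init__(self, points):
        self.type = "Polygon"
        self.points = points


def collect_types(polygons):
    # Same shape as the failing comprehension: every element must have `.type`.
    return [poly.type for poly in polygons]


def collect_types_checked(polygons):
    # Defensive variant: name the offending elements instead of failing later.
    bad = [i for i, poly in enumerate(polygons) if not hasattr(poly, "type")]
    if bad:
        raise TypeError(f"elements at indices {bad} are not geometry objects")
    return [poly.type for poly in polygons]


square = FakePolygon([(0, 0), (1, 0), (1, 1), (0, 1)])

try:
    # Assumed trigger: the second element is a nested list, not a polygon.
    collect_types([square, [square]])
except AttributeError as err:
    message = str(err)
    print(message)  # 'list' object has no attribute 'type'
```

If the real cause is similar, a check like `collect_types_checked` at the call site would at least show which element has the wrong shape.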

_Originally posted by @mweidling in https://github.com/OCR-D/quiver-back-end/pull/43#discussion_r1050045210_

kba commented 1 year ago

Can we debug this in ocrd_tesserocr? What image leads to this?

mweidling commented 1 year ago

> What image leads to this?

OCR-D-IMG_0008.tif of the 16_ant_complex collection in quiver-data.
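
To debug this in ocrd_tesserocr, it may be enough to rerun the failing command on that single page. Below is a minimal sketch that rebuilds the Docker command from the log above and restricts it to page `phys_0008`. It assumes that the processor accepts the usual OCR-D `-g` (page ID) option and that the workspace already contains the `OCR-D-BIN-DENOISE-DESKEW` file group. `build_repro_command` is a hypothetical helper, and the workspace path is the placeholder from the log.

```python
import os
import shlex


def build_repro_command(workspace_dir, models_dir, page_id="phys_0008"):
    """Return the docker command from the log, limited to one page (assumes -g is supported)."""
    return [
        "docker", "run", "--rm",
        "-u", str(os.getuid()),  # run as the current user, as in the log
        "-v", f"{workspace_dir}:/ocrd-workspace",
        "-v", f"{models_dir}:/usr/local/share/ocrd-resources",
        "-w", "/ocrd-workspace",
        "--", "ocrd/all:maximum",
        "ocrd-tesserocr-segment",
        "-I", "OCR-D-BIN-DENOISE-DESKEW",
        "-O", "OCR-D-SEG",
        "-P", "shrink_polygons", "true",
        "-g", page_id,  # assumption: restricts processing to this page
    ]


cmd = build_repro_command(
    "/some/path/quiver-back-end/workflows/workspaces/16_ant_complex_slower_processors_ocr/",
    os.path.expanduser("~/ocrd_models"),
)
# Print a copy-pasteable shell line; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Running the same command with `-P shrink_polygons false` on the same page would show whether the parameter alone triggers the error.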