cisocrgroup / ocrd_cis

OCR-D python tools
MIT License

Got exception using ocrd-cis-ocropy-resegment with method 'ccomps' #93

Open stefanCCS opened 2 years ago

stefanCCS commented 2 years ago

I got the following exception (with loglevel 'trace') using ocrd-cis-ocropy-resegment with method 'ccomps':

15:16:17.857 INFO processor.OcropyResegment - Page "OCR-D-REG-DESKEW-4074_007817778_00001" uses 200.000000 DPI
15:16:17.908 DEBUG ocrd_utils.crop_image - cropping image to (1966, 595, 2151, 682)
15:16:17.926 DEBUG ocrd_utils.coords.shift_coordinates - shifting coordinates by [-1966  -595]
15:16:17.926 DEBUG ocrd.workspace.image_from_segment - segment 'TR-1' has orientation=0 skew=0.00
15:16:17.927 DEBUG ocrd.workspace.image_from_segment - Using AlternativeImage 3 {'', 'verticallinesremoved', 'binarized', 'deskewed'} for segment 'TR-1'
15:16:17.928 DEBUG ocrd.workspace.download_file - download_file <OcrdFile fileGrp=OCR-D-REG-VL ID=OCR-D-REG-VL-4074_007817778_00001_TR-1.IMG-DESKEW, mimetype=image/png, url=OCR-D-REG-VL/OCR-D-REG-VL_4074_007817778_00001_TR-1.IMG-DESKEW.png, local_filename=OCR-D-REG-VL/OCR-D-REG-VL_4074_007817778_00001_TR-1.IMG-DESKEW.png]/>  [_recursion_count=0]
15:16:17.929 DEBUG PIL.PngImagePlugin - STREAM b'IHDR' 16 13
15:16:17.929 DEBUG PIL.PngImagePlugin - STREAM b'IDAT' 41 977
15:16:17.930 DEBUG ocrd_utils.coords.shift_coordinates - shifting coordinates by [-92.5 -43.5]
15:16:17.930 DEBUG ocrd_utils.coords.rotate_coordinates - rotating coordinates by 0.00° around [92.5 43.5]
15:16:17.931 DEBUG ocrd_utils.coords.shift_coordinates - shifting coordinates by [92.5 43.5]
15:16:17.931 DEBUG ocrd_utils.coords.shift_coordinates - shifting coordinates by [0 0]
15:16:17.940 DEBUG processor.OcropyResegment - unmasking area of text region "TR-1" for "TR-1"
15:16:17.947 DEBUG processor.OcropyResegment - calculating connected component and distance transforms for "TR-1"
15:16:17.948 DEBUG processor.OcropyResegment - estimated scale: 34
Traceback (most recent call last):
  File "/home/ocrdadmin/ocrd_all/venv/bin/ocrd-cis-ocropy-resegment", line 8, in <module>
    sys.exit(ocrd_cis_ocropy_resegment())
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/ocrd_cis/ocropy/cli.py", line 38, in ocrd_cis_ocropy_resegment
    return ocrd_cli_wrap_processor(OcropyResegment, *args, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/ocrd/decorators/__init__.py", line 88, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 88, in run_processor
    processor.process()
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/ocrd_cis/ocropy/resegment.py", line 166, in process
    self._process_segment(region, region_image, region_coords, page_id, zoom, lines, ignore)
  File "/home/ocrdadmin/ocrd_all/venv/lib/python3.6/site-packages/ocrd_cis/ocropy/resegment.py", line 259, in _process_segment
    distances[i] = distances[i] / distances[i].max() * 255
FloatingPointError: invalid value encountered in true_divide

The PAGE looks like this:

  <pc:Page imageFilename="OCR-D-IMG/4074_007817778_00001.tif" imageWidth="4619" imageHeight="3312" orientation="0.">
    <pc:AlternativeImage filename="OCR-D-BIN/OCR-D-BIN_4074_007817778_00001.IMG-BIN.png" comments=",binarized" />
    <pc:TextRegion id="TR-1" orientation="0.">
      <pc:AlternativeImage filename="OCR-D-BIN-REG/OCR-D-BIN-REG-4074_007817778_00001_TR-1.IMG-BIN.png" comments=",binarized" />
      <pc:AlternativeImage filename="OCR-D-REG-DESKEW/OCR-D-REG-DESKEW-4074_007817778_00001_TR-1.IMG-DESKEW.png" comments=",binarized,deskewed" />
      <pc:AlternativeImage filename="OCR-D-REG-VL/OCR-D-REG-VL_4074_007817778_00001_TR-1.IMG-DESKEW.png" comments=",binarized,deskewed,verticallinesremoved" />
      <pc:Coords points="1966,595 1966,682 2151,682 2151,595" />
      <pc:TextLine id="TR-1_line0001">
        <pc:Coords points="1966,595 1966,682 2151,682 2151,595" />
        <pc:TextEquiv>
          <pc:Unicode>1889</pc:Unicode>
        </pc:TextEquiv>
      </pc:TextLine>

Please clarify ...

bertsky commented 3 months ago

Fixed in https://github.com/cisocrgroup/ocrd_cis/pull/87/commits/8d65708cc8ee6f42d00796d9dc1ed441b7cd7474 (part of #87).
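
For anyone hitting the same traceback: the failing line divides by `distances[i].max()`, so one plausible trigger is a distance map that is zero everywhere, which yields 0/0. The sketch below reproduces that class of error and shows one possible guard. It assumes NumPy is set to raise on invalid operations (done explicitly here with `np.errstate`); the helper name `normalize_to_255` is hypothetical, and this is not necessarily what the linked commit does.

```python
import numpy as np


def normalize_to_255(distance):
    """Scale a non-negative array so that its maximum becomes 255.

    Hypothetical helper for illustration: an all-zero input is returned
    as zeros instead of being divided by its (zero) maximum.
    """
    distance = np.asarray(distance, dtype=float)
    peak = distance.max()
    if peak <= 0:
        # Nothing to scale; avoid the 0/0 division entirely.
        return np.zeros_like(distance)
    return distance / peak * 255


# Assumption: floating-point errors are raised rather than turned into warnings.
with np.errstate(invalid="raise", divide="raise"):
    empty = np.zeros((3, 4))
    try:
        # Same pattern as the failing line in the traceback.
        empty / empty.max() * 255
        raised = False
    except FloatingPointError:
        raised = True

    safe = normalize_to_255(empty)
    scaled = normalize_to_255(np.array([[0.0, 1.0], [2.0, 4.0]]))
```

With the guard in place, an empty distance map passes through as zeros, while a non-empty one is still scaled so that its maximum is 255.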