qurator-spk / eynollah

Document Layout Analysis
Apache License 2.0

Unable to process document due to ValueError: attempt to get argmax of an empty sequence #38

Closed by mikegerber 3 years ago

mikegerber commented 3 years ago

For this workspace:

PPN729186350.zip

I get the following error:


Traceback (most recent call last):
  File "/usr/local/bin/ocrd-eynollah-segment", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ocrd/decorators/__init__.py", line 91, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ocrd/processor/helpers.py", line 72, in run_processor
    processor.process()
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/processor.py", line 57, in process
    Eynollah(**eynollah_kwargs).run()
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/eynollah.py", line 1744, in run
    contours_biggest = contours_only_text_parent[np.argmax(areas_cnt_text)]
  File "<__array_function__ internals>", line 6, in argmax
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 1186, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
    return bound(*args, **kwds)
ValueError: attempt to get argmax of an empty sequence
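
For reference, the exception can be reproduced in isolation: NumPy's argmax raises this ValueError whenever it is given an empty sequence. The snippet below is a minimal sketch, not eynollah's code. The variable names are borrowed from the traceback, and the empty lists are an assumption about what the page produced (no text contours found).

```python
import numpy as np

# Assumption: no text contours were detected, so both lists are empty.
contours_only_text_parent = []
areas_cnt_text = []

error_message = None
try:
    # Same expression as in the traceback (eynollah.py, line 1744).
    contours_biggest = contours_only_text_parent[np.argmax(areas_cnt_text)]
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```
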

Command line used:

ocrd-eynollah-segment --overwrite -I OCR-D-IMG-BIN -O OCR-D-SEG-LINE -P models /var/lib/eynollah

mikegerber commented 3 years ago

Same error for this workspace:

PPN719059895.zip

mikegerber commented 3 years ago

Same error for this workspace:

PPN1039827209.zip

kba commented 3 years ago

@vahidrezanezhad fixed this in c4b2c71, so we just need a CHANGELOG.md entry and then we are ready for a new release :-)

mikegerber commented 3 years ago

Same error for this workspace:

PPN1039827209.zip

I still get a similar error for these files:

Traceback (most recent call last):
  File "/usr/local/bin/ocrd-eynollah-segment", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/ocrd_cli.py", line 8, in main
    return ocrd_cli_wrap_processor(EynollahProcessor, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ocrd/decorators/__init__.py", line 91, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ocrd/processor/helpers.py", line 72, in run_processor
    processor.process()
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/processor.py", line 57, in process
    Eynollah(**eynollah_kwargs).run()
  File "/usr/local/lib/python3.6/dist-packages/qurator/eynollah/eynollah.py", line 1894, in run
    contours_biggest_d = contours_only_text_parent_d[np.argmax(areas_cnt_text_d)]
  File "<__array_function__ internals>", line 6, in argmax
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 1186, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out)
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
    return bound(*args, **kwds)
ValueError: attempt to get argmax of an empty sequence

Note that it occurs in a different part of the code: eynollah.py line 1894 (contours_biggest_d) instead of line 1744 (contours_biggest).
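
Since the same pattern fails in two places, one way to guard it is to check for an empty area list before calling argmax. This is only a sketch under assumptions: `biggest_contour` is a hypothetical helper, not a function in eynollah, and it is not necessarily how c4b2c71 or any later commit fixed the problem.

```python
import numpy as np


def biggest_contour(contours, areas):
    """Return the contour with the largest area, or None if there are none.

    Hypothetical helper for illustration; `contours` and `areas` are
    assumed to be parallel sequences of equal length.
    """
    if len(areas) == 0:
        # Nothing to select from; let the caller decide how to proceed.
        return None
    return contours[int(np.argmax(areas))]


# Usage: an empty page yields None instead of raising ValueError.
print(biggest_contour([], []))
print(biggest_contour(["a", "b", "c"], [1.0, 5.0, 2.0]))
```

The caller would then need to handle the None case explicitly, for example by skipping the text-region steps for that page.
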

mikegerber commented 3 years ago

The error is for the third file, but maybe try running ocrd-eynollah-segment --overwrite -I OCR-D-IMG-BIN -O OCR-D-SEG-LINE -P models /var/lib/eynollah to reproduce it.

mikegerber commented 3 years ago

Latest main including 799a7c7 makes ocrd-eynollah-segment run on all of the problematic files!
