ocrd-anybaseocr-crop fails if expected EXIF information in PNG image is missing #49

stweil commented 4 years ago

I tried to run an OCR-D workflow, and it failed:

ocrd-anybaseocr-crop -I OCR-D-IMG-PNG -O OCR-D-SEG-PAGE-anyocr -p OCR-D-SEG-PAGE-anyocr.json 2>&1 | tee OCR-D-SEG-PAGE-anyocr.log && touch -c OCR-D-SEG-PAGE-anyocr || { rm -fr OCR-D-SEG-PAGE-anyocr.json OCR-D-SEG-PAGE-anyocr; exit 1; }
22:45:58.672 INFO matplotlib.font_manager - generated new fontManager
Using TensorFlow backend.
22:46:02.518 INFO OcrdAnybaseocrCropper - No output file group for images specified, falling back to 'OCR-D-IMG-CROP'
22:46:02.530 INFO OcrdAnybaseocrCropper - INPUT FILE 0 / PHYS_0001
OUTPUT FILE  OCR-D-SEG-PAGE-anyocr
Traceback (most recent call last):
  File "/venv/bin/ocrd-anybaseocr-crop", line 8, in <module>
    sys.exit(ocrd_anybaseocr_cropping())
  File "/venv/lib/python3.7/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/venv/lib/python3.7/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/venv/lib/python3.7/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/venv/lib/python3.7/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/venv/lib/python3.7/site-packages/ocrd_anybaseocr/cli/cli.py", line 27, in ocrd_anybaseocr_cropping
    return ocrd_cli_wrap_processor(OcrdAnybaseocrCropper, *args, **kwargs)
  File "/venv/lib/python3.7/site-packages/ocrd/decorators.py", line 54, in ocrd_cli_wrap_processor
    run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
  File "/venv/lib/python3.7/site-packages/ocrd/processor/base.py", line 56, in run_processor
    processor.process()
  File "/venv/lib/python3.7/site-packages/ocrd_anybaseocr/cli/ocrd_anybaseocr_cropping.py", line 434, in process
    pcgts = page_from_file(self.workspace.download_file(input_file))
  File "/venv/lib/python3.7/site-packages/ocrd_modelfactory/__init__.py", line 75, in page_from_file
    return page_from_image(input_file)
  File "/venv/lib/python3.7/site-packages/ocrd_modelfactory/__init__.py", line 47, in page_from_image
    exif = exif_from_filename(input_file.local_filename)
  File "/venv/lib/python3.7/site-packages/ocrd_modelfactory/__init__.py", line 32, in exif_from_filename
    return OcrdExif(Image.open(image_filename))
  File "/venv/lib/python3.7/site-packages/ocrd_models/ocrd_exif.py", line 42, in __init__
    self.resolutionUnit = 'cm' if img.tag.get(296) == 3 else 'inches'
AttributeError: 'PngImageFile' object has no attribute 'tag'
make: *** [Makefile:298: OCR-D-SEG-PAGE-anyocr] Error 1
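
The last frame of the traceback shows the cause: `ocrd_exif.py` reads `img.tag.get(296)` (296 is the TIFF ResolutionUnit tag, where 3 means centimeters), but the `PngImageFile` object has no `tag` attribute. A minimal sketch of a defensive lookup follows. It is not the code from OCR-D/core; the helper name `resolution_unit` and the stand-in classes `FakeTiff` and `FakePng` are hypothetical and only illustrate the idea without requiring Pillow.

```python
# Hypothetical sketch, not the actual fix in OCR-D/core.
RESOLUTION_UNIT_TAG = 296  # TIFF ResolutionUnit tag; the value 3 means centimeters


def resolution_unit(img):
    """Return 'cm' or 'inches' without assuming that img has a .tag attribute."""
    tags = getattr(img, "tag", None)  # missing on PNG images, as the traceback shows
    if tags is None:
        return "inches"  # same default the failing line uses when the tag is not 3
    return "cm" if tags.get(RESOLUTION_UNIT_TAG) == 3 else "inches"


class FakeTiff:
    """Stand-in (assumption) for an image whose ResolutionUnit is centimeters."""
    tag = {RESOLUTION_UNIT_TAG: 3}


class FakePng:
    """Stand-in (assumption) for a PNG image, which has no .tag attribute."""


print(resolution_unit(FakeTiff()))
print(resolution_unit(FakePng()))
```

With a guard like this, an image without TIFF tags falls back to the default unit instead of raising `AttributeError`.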

bertsky commented 4 years ago

@stweil this looks like an old error, fixed in OCR-D/core@6cc6a1b872cd0ac06342e251526be64d1d432e21 (included since at least v2.2.0). Are you sure you installed the latest version?

stweil commented 4 years ago

Usually I install the latest version before reporting an issue, but this one is rather old, so I simply don't know. I'm closing the issue for now and will reopen it if the error occurs again.
