jbarth-ubhd closed this issue 3 years ago
@jbarth-ubhd thanks for the bug reports, also in ocrd_anybaseocr.
Here's the log as a string:
15:21:27.753 INFO processor.PixelClassifierSegmentation - INPUT FILE 0 / P_00001
2020-08-28 15:21:45.422048: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-08-28 15:21:45.561837: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2700000000 Hz
2020-08-28 15:21:45.568756: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x99c25a0 executing computations on platform Host. Devices:
2020-08-28 15:21:45.568799: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/bin/ocrd-pc-segmentation", line 8, in <module>
sys.exit(ocrd_pc_segmentation())
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocrd_pc_segmentation/cli.py", line 10, in ocrd_pc_segmentation
return ocrd_cli_wrap_processor(PixelClassifierSegmentation, *args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocrd/decorators.py", line 102, in ocrd_cli_wrap_processor
run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
processor.process()
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocrd_pc_segmentation/ocrd_segmentation.py", line 108, in process
gpu_allow_growth, resize_height)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocrd_pc_segmentation/ocrd_segmentation.py", line 141, in _process_page
gpu_allow_growth=gpu_allow_growth,
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocr4all_pixel_classifier/scripts/find_segments.py", line 111, in predict_masks
return predictor.predict_masks(data)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocr4all_pixel_classifier/lib/predictor.py", line 66, in predict_masks
logit, prob, pred = self.network.predict_single_data(data)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/ocr4all_pixel_classifier/lib/network.py", line 256, in predict_single_data
image_to_batch(data.binary)])[0, :, :, :]
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 1135, in predict_on_batch
return training_v2_utils.predict_on_batch(self, x)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 359, in predict_on_batch
x, extract_tensors_from_dataset=True)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 2472, in _standardize_user_data
exception_prefix='input')
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/local/sub-venv/headless-tf21/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_utils.py", line 574, in standardize_input_data
str(data_shape))
ValueError: Error when checking input: expected input_1 to have shape (None, None, 1) but got array with shape (3177, 2350, 2)
@jbarth-ubhd if you update to core 2.15.0, you'll get more readable exceptions.
core 2.15.0 is not yet in https://github.com/OCR-D/ocrd_all
Yes, I'll open a PR later today.
Sorry, I was mistaken, I forgot to fix this in 2.15.0, https://github.com/OCR-D/core/pull/583 will correct that in next release.
Hi, thank you for the report. It seems the images received as input contain an alpha channel. I just published a new version (0.2.1), which should ignore any alpha channel and no longer causes the error for me. Can you confirm that it works for you too?
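For illustration: the shape in the error message, (3177, 2350, 2), is consistent with a grayscale image that carries an extra alpha channel, while the model expects a single channel. The following is a minimal sketch of discarding that channel with NumPy. drop_alpha is a hypothetical helper, not the code used in release 0.2.1, and treating the last channel of 2- and 4-channel arrays as alpha is an assumption.

```python
import numpy as np


def drop_alpha(image: np.ndarray) -> np.ndarray:
    """Return `image` without its alpha channel (hypothetical helper).

    Assumes a height x width x channels layout in which arrays with
    2 or 4 channels carry alpha last (gray+alpha or RGB+alpha).
    """
    if image.ndim == 2:
        # Plain grayscale: add a channel axis so the result is (H, W, 1).
        return image[..., np.newaxis]
    if image.ndim == 3 and image.shape[-1] in (2, 4):
        # Drop the trailing alpha channel, keep the remaining channel(s).
        return image[..., :-1]
    # Anything else is returned unchanged.
    return image


# Small stand-in for the gray+alpha page from the error message.
page = np.zeros((4, 3, 2), dtype=np.uint8)
print(drop_alpha(page).shape)  # (4, 3, 1)
```

With a real page the same call would turn (3177, 2350, 2) into (3177, 2350, 1), which matches the (None, None, 1) input shape named in the error.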
OK. Running make all in ocrd_all, after
git pull https://github.com/ocr-d-modul-2-segmentierung/ocrd-pixelclassifier-segmentation master # within ocrd_pc_segmentation
seems to have reset the version?!
Now I've run:
cd ocrd_pc_segmentation
git pull https://... master
make install # within ocrd_pc_segmentation
# Now:
(venv) jb@pers109:/usr/local/ocrd_all> ocrd-pc-segmentation -V
Version 0.2.1, ocrd/core 2.15.0
next try...
with the same workflow as above:
...
18:15:58.362 INFO ocrd.task_sequence.run_tasks - Start processing task 'pc-segmentation -I OCR-D-N5 -O OCR-D-N6 -p '{"overwrite_regions": true, "xheight": 8, "model": "__DEFAULT__", "gpu_allow_growth": false, "resize_height": 300}''
18:22:20.636 INFO ocrd.task_sequence.run_tasks - Finished processing task 'pc-segmentation -I OCR-D-N5 -O OCR-D-N6 -p '{"overwrite_regions": true, "xheight": 8, "model": "__DEFAULT__", "gpu_allow_growth": false, "resize_height": 300}''
18:22:20.652 INFO ocrd.task_sequence.run_tasks - Start processing task 'cis-ocropy-deskew -I OCR-D-N6 -O OCR-D-N7 -p '{"level-of-operation": "region", "maxskew": 5.0}''
Traceback (most recent call last):
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/bin/ocrd", line 8, in <module>
sys.exit(cli())
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/cli/process.py", line 27, in process_cli
run_tasks(mets, log_level, page_id, tasks, overwrite)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/task_sequence.py", line 153, in run_tasks
raise Exception("%s exited with non-zero return value %s. STDOUT:
%s
STDERR:
%s" % (task.executable, returncode, out, err))
Exception: ocrd-cis-ocropy-deskew exited with non-zero return value -9. STDOUT:
b''
STDERR:
b"18:22:25.390 INFO processor.OcropyDeskew - INPUT FILE 0 / P_00001
18:22:26.110 WARNING ocrd_utils.crop_image - crop coordinates ((0, 0, 24780, 33644)) exceed image (2350x3177)
18:22:28.324 INFO processor.OcropyDeskew - About to deskew region 'region0000'
18:56:32.657 WARNING ocrolib - cannot estimate binarization thresholds (is the image empty?)
18:56:40.066 INFO processor.OcropyDeskew - Found angle for region 'region0000': 0.0
18:56:55.205 INFO ocrd.workspace - created file ID: OCR-D-N7_00001_region0000.IMG-DESKEW, file_grp: OCR-D-N7, path: OCR-D-N7/OCR-D-N7_00001_region0000.IMG-DESKEW.png
18:56:55.333 INFO processor.OcropyDeskew - created file ID: OCR-D-N7_00001, file_grp: OCR-D-N7, path: OCR-D-N7/OCR-D-N7_00001.xml
18:56:55.334 INFO processor.OcropyDeskew - INPUT FILE 1 / P_00002
18:56:56.640 WARNING ocrd_utils.crop_image - crop coordinates ((0, 0, 35650, 50934)) exceed image (2745x3909)
18:56:57.848 INFO processor.OcropyDeskew - About to deskew region 'region0000'
20:21:19.219 WARNING ocrolib - cannot estimate binarization thresholds (is the image empty?)
20:21:28.239 INFO processor.OcropyDeskew - Found angle for region 'region0000': 0.0
20:21:45.377 INFO ocrd.workspace - created file ID: OCR-D-N7_00002_region0000.IMG-DESKEW, file_grp: OCR-D-N7, path: OCR-D-N7/OCR-D-N7_00002_region0000.IMG-DESKEW.png
20:21:45.481 INFO processor.OcropyDeskew - created file ID: OCR-D-N7_00002, file_grp: OCR-D-N7, path: OCR-D-N7/OCR-D-N7_00002.xml
20:21:45.482 INFO processor.OcropyDeskew - INPUT FILE 2 / P_00003
20:21:46.795 WARNING ocrd_utils.crop_image - crop coordinates ((0, 0, 42576, 60265)) exceed image (3008x4252)
20:21:50.885 INFO processor.OcropyDeskew - About to deskew region 'region0000'
"
Command exited with non-zero status 1
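A note on the return value -9 reported for ocrd-cis-ocropy-deskew: if it follows the convention of Python's subprocess module (an assumption about how the task runner obtains it), a negative value -N means the child process was terminated by signal N. A minimal sketch of decoding such a value; describe_returncode is a hypothetical helper:

```python
import signal


def describe_returncode(returncode: int) -> str:
    """Describe a subprocess-style return code (hypothetical helper)."""
    if returncode < 0:
        # subprocess reports "terminated by signal N" as -N (POSIX only).
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal"
        return f"terminated by signal {-returncode} ({name})"
    if returncode == 0:
        return "exited successfully"
    return f"exited with status {returncode}"


print(describe_returncode(-9))
```

On Linux, signal 9 is SIGKILL. A process killed this way is often a victim of the kernel's out-of-memory killer, which would fit the oversized crop coordinates in the warnings, though the log does not confirm this.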
But none of the original images has an alpha channel.
And with this workflow:
ocrd-create-mets.xml
( /usr/bin/time ocrd process \
"olena-binarize -I OCR-D-IMG -O OCR-D-N1 -P impl wolf" \
"anybaseocr-crop -I OCR-D-N1 -O OCR-D-N2" \
"olena-binarize -I OCR-D-N2 -O OCR-D-N3 -P impl wolf" \
"cis-ocropy-denoise -I OCR-D-N3 -O OCR-D-N4 -P level-of-operation page" \
"cis-ocropy-deskew -I OCR-D-N4 -O OCR-D-N5 -P level-of-operation page" \
"pc-segmentation -I OCR-D-N5 -O OCR-D-N6" \
"cis-ocropy-deskew -I OCR-D-N6 -O OCR-D-N7 -P level-of-operation region" \
"cis-ocropy-clip -I OCR-D-N7 -O OCR-D-N8 -P level-of-operation region" \
"cis-ocropy-segment -I OCR-D-N8 -O OCR-D-N9 -P level-of-operation region" \
"segment-repair -I OCR-D-N9 -O OCR-D-N10 -P sanitize true" \
"cis-ocropy-dewarp -I OCR-D-N10 -O OCR-D-N11" \
"calamari-recognize -I OCR-D-N11 -O OCR-D-OCR -P checkpoint /usr/local/ocrd_models/calamari/calamari_models-0.3/fraktur_19th_century/*.ckpt.json"
) >cmd.log 2>&1
this error:
...
18:16:38.524 INFO ocrd.task_sequence.run_tasks - Start processing task 'pc-segmentation -I OCR-D-N5 -O OCR-D-N6 -p '{"overwrite_regions": true, "xheight": 8, "model": "__DEFAULT__", "gpu_allow_growth": false, "resize_height": 300}''
18:23:15.869 INFO ocrd.task_sequence.run_tasks - Finished processing task 'pc-segmentation -I OCR-D-N5 -O OCR-D-N6 -p '{"overwrite_regions": true, "xheight": 8, "model": "__DEFAULT__", "gpu_allow_growth": false, "resize_height": 300}''
18:23:15.909 INFO ocrd.task_sequence.run_tasks - Start processing task 'cis-ocropy-deskew -I OCR-D-N6 -O OCR-D-N7 -p '{"level-of-operation": "region", "maxskew": 5.0}''
10:37:12.369 INFO ocrd.task_sequence.run_tasks - Finished processing task 'cis-ocropy-deskew -I OCR-D-N6 -O OCR-D-N7 -p '{"level-of-operation": "region", "maxskew": 5.0}''
10:37:14.777 INFO ocrd.task_sequence.run_tasks - Start processing task 'cis-ocropy-clip -I OCR-D-N7 -O OCR-D-N8 -p '{"level-of-operation": "region", "dpi": 0, "min_fraction": 0.7}''
10:37:53.545 INFO ocrd.task_sequence.run_tasks - Finished processing task 'cis-ocropy-clip -I OCR-D-N7 -O OCR-D-N8 -p '{"level-of-operation": "region", "dpi": 0, "min_fraction": 0.7}''
10:37:53.606 INFO ocrd.task_sequence.run_tasks - Start processing task 'cis-ocropy-segment -I OCR-D-N8 -O OCR-D-N9 -p '{"level-of-operation": "region", "dpi": 0, "maxcolseps": 20, "maxseps": 20, "maximages": 10, "csminheight": 4, "hlminwidth": 10, "gap_height": 0.01, "gap_width": 1.5, "overwrite_order": true, "overwrite_separators": true, "overwrite_regions": true, "overwrite_lines": true, "spread": 2.4}''
Traceback (most recent call last):
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/bin/ocrd", line 8, in <module>
sys.exit(cli())
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/cli/process.py", line 27, in process_cli
run_tasks(mets, log_level, page_id, tasks, overwrite)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/task_sequence.py", line 153, in run_tasks
raise Exception("%s exited with non-zero return value %s. STDOUT:
%s
STDERR:
%s" % (task.executable, returncode, out, err))
Exception: ocrd-cis-ocropy-segment exited with non-zero return value 1. STDOUT:
b''
STDERR:
b'10:37:54.662 INFO processor.OcropySegment - INPUT FILE 0 / P_00001
10:37:55.156 INFO processor.OcropySegment - Page "OCR-D-N8_00001" uses 300.000000 DPI
10:37:55.263 WARNING ocrd_utils.crop_image - crop coordinates ((0, 0, 24780, 33644)) exceed image (2350x3177)
Traceback (most recent call last):
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/bin/ocrd-cis-ocropy-segment", line 8, in <module>
sys.exit(ocrd_cis_ocropy_segment())
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd_cis/ocropy/cli.py", line 53, in ocrd_cis_ocropy_segment
return ocrd_cli_wrap_processor(OcropySegment, *args, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/decorators.py", line 102, in ocrd_cli_wrap_processor
run_processor(processorClass, ocrd_tool, mets, workspace=workspace, **kwargs)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/processor/helpers.py", line 69, in run_processor
processor.process()
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd_cis/ocropy/segment.py", line 378, in process
region, page_image, page_coords, feature_selector=\'binarized\')
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/workspace.py", line 754, in image_from_segment
segment_image = self._resolve_image_as_pil(alternative_image.get_filename())
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/ocrd/workspace.py", line 280, in _resolve_image_as_pil
pil_image = Image.open(image_filename)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/PIL/Image.py", line 2916, in open
im = _open_core(fp, filename, prefix)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/PIL/Image.py", line 2903, in _open_core
_decompression_bomb_check(im.size)
File "/dwork/ocrd-schroot-ubuntu-eoan/usr/local/ocrd_all/venv/lib/python3.6/site-packages/PIL/Image.py", line 2828, in _decompression_bomb_check
"could be decompression bomb DOS attack." % (pixels, 2 * MAX_IMAGE_PIXELS)
PIL.Image.DecompressionBombError: Image size (833698320 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
'
Command exited with non-zero status 1
But 833 * 10^6 pixels? That cannot come from the TIFFs...
PS: As of 0.2.1, pc_segmentation is no longer the problem; the failures come from the processors that run after it.
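The number does line up with the crop_image warning in the same log: the reported crop box (0, 0, 24780, 33644) contains exactly 833698320 pixels, far more than the 2350x3177 page itself. A quick check of the arithmetic, with all values copied from the log above:

```python
# Values copied from the log above.
crop_box = (0, 0, 24780, 33644)   # from the crop_image warning
page_size = (2350, 3177)          # actual page image, width x height
pil_limit = 178956970             # limit quoted in DecompressionBombError

x0, y0, x1, y1 = crop_box
crop_pixels = (x1 - x0) * (y1 - y0)
page_pixels = page_size[0] * page_size[1]

print(crop_pixels)                # 833698320, the size named in the error
print(crop_pixels > pil_limit)    # True
print(page_pixels)                # 7465950
# Scale of the crop box relative to the page, per axis.
print(round(x1 / page_size[0], 2), round(y1 / page_size[1], 2))  # 10.54 10.59
```

Both dimensions are roughly 10.5 times too large, which suggests the region coordinates were scaled incorrectly somewhere upstream; the thread does not establish where.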
Has this issue been reported to the affected processor? If I understand correctly, pc_segmentation is no longer affected by this, right?
Yes, case closed.
Error message:
ValueError: Error when checking input: expected input_1 to have shape (None, None, 1) but got array with shape (3177, 2350, 2)
Images (850 MB): https://digi.ub.uni-heidelberg.de/diglitData/v/testset-5-zeitschr-ca-1870.zip
Almost complete log:
Workflow: