yatrik-cloud closed this 1 year ago
Hi, thank you for the post. Could you please share your source file? This error may be avoided by rendering the page images at a lower resolution. Please try the "-r 200" flag and let's see what happens.
Yes, great! "-r 200" is working. Thank you so much for your quick response.
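A rough sketch of why a lower resolution helps, using the pixel counts from the error reported in this thread. The 300 DPI starting value is an assumption for illustration only; the thread does not say which resolution the failing run used.

```python
# Pixel count reported in the error, and the limit it exceeded.
reported_pixels = 235_978_454
error_limit = 178_956_970

# ASSUMPTION: the failing run rendered pages at 300 DPI (not stated in the thread).
assumed_dpi = 300
new_dpi = 200  # the value passed with "-r 200"

# Width and height each scale linearly with DPI, so the area scales with its square.
scale = (new_dpi / assumed_dpi) ** 2
estimated_pixels = round(reported_pixels * scale)

print(f"estimated pixels at {new_dpi} DPI: {estimated_pixels:,}")
print("under the limit:", estimated_pixels < error_limit)
```

Under that assumption the page image shrinks to well under half its original pixel count, which would explain why the check no longer fails.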
While applying OCR to a PDF with the Docker image "leofcardoso/pdf2pdfocr:latest", this error occurred:
[2023-09-05 10:35:58.939733] [LOG] Welcome to pdf2pdfocr version 1.12.0 marapurense - https://github.com/LeoFCardoso/pdf2pdfocr
[2023-09-05 10:35:58.959460] [LOG] Input file /home/docker/Dummy_IS.pdf: type is application/pdf
[2023-09-05 10:35:59.047502] [LOG] Converting input file to images...
[2023-09-05 10:36:38.577186] [LOG] Checking blank pages
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.10/multiprocessing/pool.py", line 51, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/usr/local/bin/pdf2pdfocr.py", line 249, in do_check_img_colors_size
im = Image.open(param_image_file)
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3172, in open
im = _open_core(fp, filename, prefix, formats)
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3159, in _open_core
_decompression_bomb_check(im.size)
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3068, in _decompression_bomb_check
raise DecompressionBombError(
PIL.Image.DecompressionBombError: Image size (235978454 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/pdf2pdfocr.py", line 1530, in
pdf2ocr.ocr()
File "/usr/local/bin/pdf2pdfocr.py", line 712, in ocr
self.check_blank_pages(image_file_list)
File "/usr/local/bin/pdf2pdfocr.py", line 1010, in check_blank_pages
blank_map_values = colors_size_pool_map.get()
File "/usr/lib/python3.10/multiprocessing/pool.py", line 774, in get
raise self._value
PIL.Image.DecompressionBombError: Image size (235978454 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
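For context, the limit of 178956970 pixels in the message is twice Pillow's default Image.MAX_IMAGE_PIXELS value. The sketch below reproduces those thresholds in plain Python. The helper bomb_check is hypothetical and only mirrors the comparison Pillow makes; it is not code from Pillow or from pdf2pdfocr.

```python
# Pillow's default value for Image.MAX_IMAGE_PIXELS.
MAX_IMAGE_PIXELS = int(1024 * 1024 * 1024 // 4 // 3)


def bomb_check(pixels: int, max_pixels: int = MAX_IMAGE_PIXELS) -> str:
    """Hypothetical helper: classify a pixel count the way Pillow's check does."""
    if pixels > 2 * max_pixels:
        return "error"    # Pillow raises DecompressionBombError here
    if pixels > max_pixels:
        return "warning"  # Pillow emits DecompressionBombWarning here
    return "ok"


print("warning threshold:", MAX_IMAGE_PIXELS)
print("error threshold:", 2 * MAX_IMAGE_PIXELS)
print("image from the traceback:", bomb_check(235_978_454))
```

The page image in the traceback falls above the error threshold, so Image.open refuses to load it.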