Open DiagonalArg opened 1 year ago
Symlinking jbig2 to jbig2enc also produces errors:
$ scans2pdf -v Feigon-001-000.crop_2R.tif scans2.pdf
DEBUG:Using selector: EpollSelector
DEBUG:Running command: ['convert', '-colorspace', 'sRGB', '-profile', '/home/dev/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/argyllcms-srgb.icm', '-background', '#ffffff', '-alpha', 'remove', '-alpha', 'off', '-type', 'TrueColor', '/home/user/Lee.finished/Feigon-Images/out/Feigon-001-000.crop_2R.tif', '/var/tmp/djpdf-kb6b1v1n/image.png']
DEBUG:convert-im6.q16: profile 'icc': 'RGB ': RGB color space not permitted on grayscale PNG `/var/tmp/djpdf-kb6b1v1n/image.png' @ warning/png.c/MagickPNGWarningHandler/1668.
DEBUG:Running command: ['convert', '-fill', '#000000', '-opaque', '#000000', '-fill', '#000000', '-opaque', '#000000', '-threshold', '0', '/var/tmp/djpdf-kb6b1v1n/image.png', '/var/tmp/djpdf-3yk3pn39/image.png']
DEBUG:Running command: ['identify', '-units', 'PixelsPerInch', '-format', '%x %y', '/var/tmp/djpdf-kb6b1v1n/image.png']
DEBUG:Running command: ['convert', '-fill', '#ffffff', '-opaque', '#000000', '-resize', '50%', '/var/tmp/djpdf-kb6b1v1n/image.png', '/var/tmp/djpdf-ftx7rfk1/image.png']
DEBUG:Running command: ['identify', '-format', '%w %h', '/var/tmp/djpdf-kb6b1v1n/image.png']
DEBUG:Running command: ['convert', '-format', '%c', '/var/tmp/djpdf-3yk3pn39/image.png', 'histogram:info:-']
DEBUG:Running command: ['convert', '-format', '%c', '/var/tmp/djpdf-ftx7rfk1/image.png', 'histogram:info:-']
DEBUG:Running command: ['convert', '-fill', '#000000', '-opaque', '#000000', '-fill', '#000000', '-opaque', '#000000', '-threshold', '0', '/var/tmp/djpdf-kb6b1v1n/image.png', '/var/tmp/djpdf-amjl3335/image.png']
DEBUG:Running command: ['tesseract', '-l', 'eng', '--dpi', '600', '/var/tmp/djpdf-amjl3335/image.png', '/var/tmp/djpdf-9ap6ewya/ocr', 'hocr']
DEBUG:Tesseract Open Source OCR Engine v4.1.1 with Leptonica
INFO:Can't extract textangle from ocr_line: bbox 716 941 826 987; baseline 0 0; x_size 61; x_descenders 15.25; x_ascenders 15.25
DEBUG:Exception occurred:
Traceback (most recent call last):
File "/home/dev/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/hocr.py", line 46, in extract_text
textangle = textangle_regex.search(line.attrib["title"]).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
INFO:Can't extract textangle from ocr_line: bbox 643 1005 898 1043; baseline -0.008 -6; x_size 37; x_descenders 7; x_ascenders 8
DEBUG:Exception occurred:
Traceback (most recent call last):
File "/home/dev/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/hocr.py", line 46, in extract_text
textangle = textangle_regex.search(line.attrib["title"]).group(1)
AttributeError: 'NoneType' object has no attribute 'group'
WARNING:Lossy JBIG2 compression can alter text in a way that is not noticeable as corruption (e.g. the numbers '6' and '8' get replaced)
DEBUG:Running command: ['convert', '-alpha', 'remove', '-alpha', 'off', '-colorspace', 'gray', '-threshold', '50%', '/var/tmp/djpdf-3yk3pn39/image.png', '/var/tmp/djpdf-lykzi68v/input.0.png']
DEBUG:Running command: ['jbig2', '-p', '-s', '-t', '0.9', '/var/tmp/djpdf-lykzi68v/input.0.png']
DEBUG:Error in fopenReadStream: file not found
Unable to open "/var/tmp/djpdf-lykzi68v/input.0.png"
ERROR:Command '['jbig2', '-p', '-s', '-t', '0.9', '/var/tmp/djpdf-lykzi68v/input.0.png']' returned non-zero exit status 1
DEBUG:Exception occurred:
Traceback (most recent call last):
File "/home/dev/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/scans2pdfcli.py", line 392, in main
asyncio.run(build_pdf(pages, out_file))
File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/usr/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
return future.result()
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/scans2pdf.py", line 589, in build_pdf
return await pdf_builder.write(
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 867, in write
await asyncio.gather(
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 734, in make_page
await asyncio.gather(
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 576, in pdf_image
return await self._image.pdf_image(psem)
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 433, in pdf_image
return await self._cache.get(self._pdf_image(psem))
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/util.py", line 121, in get
self._content = await content_future
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 514, in _pdf_image
(jbig2_images, jbig2_globals), image_masks = await asyncio.gather(
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/djpdf.py", line 496, in get_jbig2_images
await run_command(cmd, psem, cwd=temp_dir)
File "/home/user/.local/pipx/venvs/djpdf/lib/python3.10/site-packages/djpdf/util.py", line 169, in run_command
raise CalledProcessError(proc.returncode, args, None)
subprocess.CalledProcessError: Command '['jbig2', '-p', '-s', '-t', '0.9', '/var/tmp/djpdf-lykzi68v/input.0.png']' returned non-zero exit status 1.
CRITICAL:Operation failed
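As a side note on the two "Can't extract textangle" messages in the log above: they come from calling .group(1) on a regex search that returned None, because those ocr_line titles carry no textangle property. They are logged and do not appear to be what aborts the run. A minimal sketch of a guarded lookup follows; the pattern and the function name are assumptions for illustration, not the actual code in djpdf's hocr.py.

```python
import re

# Assumed pattern for the hOCR "textangle" property; the real
# textangle_regex in djpdf may differ.
TEXTANGLE_RE = re.compile(r"textangle\s+(-?\d+(?:\.\d+)?)")


def extract_textangle(title: str, default: float = 0.0) -> float:
    """Return the textangle from an hOCR title string, or `default` if absent."""
    match = TEXTANGLE_RE.search(title)
    if match is None:
        # No textangle property on this line: fall back instead of raising.
        return default
    return float(match.group(1))


if __name__ == "__main__":
    # Title taken from the log above; it has no textangle property.
    print(extract_textangle("bbox 716 941 826 987; baseline 0 0; x_size 61"))
```

Falling back to a default angle avoids the AttributeError for lines where Tesseract emits no textangle at all.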
$ jbig2 -p -s -t 0.9 ~/Constitution.png
JBIG2 compression complete. pages:1 symbols:127 log2:7
$ echo $?
0
Unable to open "/var/tmp/djpdf-lykzi68v/input.0.png"
Temporary files are stored in /var/tmp/. My guess is that the Snap jbig2 doesn't have access to this folder on the host, or there is some similar issue related to sandboxing.
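One quick way to test the sandboxing guess is to check whether the jbig2 found on PATH resolves to a snap. The sketch below is only a heuristic and rests on the assumption that snap commands live under /snap/ or are symlinks to the /usr/bin/snap launcher; the function names are made up for this example.

```python
import os
import shutil


def looks_like_snap(binary_path: str) -> bool:
    """Heuristic: True if the path is under /snap/ or resolves to the snap launcher."""
    resolved = os.path.realpath(binary_path)
    candidates = (os.path.abspath(binary_path), resolved)
    return any(p.startswith("/snap/") or p == "/usr/bin/snap" for p in candidates)


def report(name: str = "jbig2") -> str:
    """Describe where `name` comes from on this machine."""
    path = shutil.which(name)
    if path is None:
        return f"{name}: not found on PATH"
    kind = "snap (sandboxed)" if looks_like_snap(path) else "regular binary"
    return f"{name}: {path} -> {os.path.realpath(path)} [{kind}]"


if __name__ == "__main__":
    print(report("jbig2"))
```

If this reports a snap, that would be consistent with jbig2 succeeding on a file in the home directory while failing to open the one under /var/tmp/.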
Looks like a nice tool, thanks. I'm running scans2pdf on the output of scantailor-advanced. Some exceptions occurred, so here is the output of running it on the first page. I can provide the image if it would be useful.
Note that Ubuntu 22.04 has a snap providing jbig2enc, while you're looking for jbig2.
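For illustration, the name mismatch could be handled without a symlink by trying several executable names in order. This is a sketch only: find_encoder is a hypothetical helper, not something djpdf provides, and it does nothing about snap confinement.

```python
import shutil
from typing import Optional, Sequence


def find_encoder(names: Sequence[str] = ("jbig2", "jbig2enc")) -> Optional[str]:
    """Return the full path of the first name found on PATH, or None."""
    for name in names:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    encoder = find_encoder()
    if encoder is None:
        print("No JBIG2 encoder found (tried jbig2, jbig2enc)")
    else:
        print(f"Using JBIG2 encoder: {encoder}")
```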