Closed tinloaf closed 1 year ago
It would help to see the output of the command with the -v verbose option enabled. The only code path I see which can lead to this error is when the original document has unreadable metadata. From that I'd assume that this only happens on some files. Is that right? It would also help to have an example document that fails with this exception.
The original file might be corrupt and have unreadable metadata, in which case the --gsFix option could be tried on it. In any case, that exception should not be raised and I have pushed out version 1.1.13 of pdfCropMargins to fix at least that part.
Hi @abarker, thanks for getting back to me. This is the output of pdf-crop-margins -v:
> pdf-crop-margins -v /tmp/in.pdf -o /tmp/out.pdf
Processing the PDF with pdfCropMargins (version 1.1.12)...
Python version: ('3', '10', '8')
System type: Linux
The input document's filename is:
/tmp/in.pdf
The output document's filename will be:
/tmp/out.pdf
The absolute pre-crops to be applied to each margin, in units of bp, are:
[0.0, 0.0, 0.0, 0.0]
The percentages of margins to retain are:
[10.0, 10.0, 10.0, 10.0]
The absolute offsets to be applied to each margin, in units of bp, are:
[0.0, 0.0, 0.0, 0.0]
The uniform order statistics to apply to each margin, in units of bp, are:
[]
For the full page size, using values from the PDF box
specified by the intersection of these boxes: ['m', 'c']
The input document has 1 pages.
No readable metadata in the document.
Caught an unexpected exception in the pdfCropMargins program.
Unexpected error: <class 'AttributeError'>
Error message : 'NoneType' object has no attribute 'producer'
File "/home/lba/.local/lib/python3.10/site-packages/pdfCropMargins/pdfCropMargins.py", line 59, in main
output_doc_pathname, exit_code, stdout_str, stderr_str = crop()
File "/home/lba/.local/lib/python3.10/site-packages/pdfCropMargins/pdfCropMargins.py", line 173, in crop
output_doc_pathname = main_crop(argv_list)
File "/home/lba/.local/lib/python3.10/site-packages/pdfCropMargins/main_pdfCropMargins.py", line 1574, in main_crop
bounding_box_list, delta_page_nums = process_pdf_file(input_doc_pathname,
File "/home/lba/.local/lib/python3.10/site-packages/pdfCropMargins/main_pdfCropMargins.py", line 1336, in process_pdf_file
metadata_info.producer)
So you seem to be right: this seems to be a problem with metadata. As far as I can tell, this happens with all PDF files created by the Rocketbook app. The --gsFix option does solve the problem, thanks for the pointer!
I'm not sure whether you still consider this an error or whether this is exactly what you intended --gsFix for. Thus I'll leave this ticket open for now; please just close it if you think this is sufficiently fixed. In case you want to investigate further, I have attached an example file: metadata_problem.pdf. I can open this file in Evince and Acrobat Reader without them complaining. I don't know enough about the PDF standard to determine whether this is a valid PDF or whether there really is some corrupted data (that Acrobat and Evince just silently ignore).
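For anyone reading along with the same traceback: the log prints "No readable metadata in the document." and then fails on metadata_info.producer, which suggests the metadata object was None at that point. Below is a minimal sketch of the kind of guard that avoids this class of error. It is not the actual change shipped in 1.1.13; DocInfo and producer_or_default are hypothetical names made up for illustration and are not part of pdfCropMargins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocInfo:
    """Hypothetical stand-in for a PDF metadata object (not a pdfCropMargins type)."""
    producer: Optional[str] = None


def producer_or_default(metadata_info: Optional[DocInfo], default: str = "unknown") -> str:
    """Return the producer string, tolerating missing or empty metadata."""
    # metadata_info may be None when the document has no readable metadata,
    # so check it before touching any attribute.
    if metadata_info is None or metadata_info.producer is None:
        return default
    return metadata_info.producer


# No metadata at all: falls back to the default instead of raising AttributeError.
print(producer_or_default(None))
# Metadata present: the stored producer string is returned unchanged.
print(producer_or_default(DocInfo(producer="Example Producer")))
```

The point of the sketch is only that the None case is handled before the attribute access; what the tool should do with a missing producer afterwards is a separate design decision.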
The new pdfCropMargins version 1.1.13 works fine on my system to crop the example file now, so I'm closing the issue. Thanks for the bug report.
I'm on pdfCropMargins version 1.1.12, with these dependency versions:
When I try to run it, I see this error:
pdf-crop-margins --version seems to be about the only thing I can run that does not raise this error.
Thanks for pdfCropMargins and please let me know if there is any more info you need.