PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

[RT 131223] corrupted PDF generated #111

Closed PhilterPaper closed 3 years ago

PhilterPaper commented 4 years ago

Mon Dec 23 14:20:51 2019 welleozean@googlemail.com - Ticket created
Subject: corrupted PDF generated
Date: Mon, 23 Dec 2019 20:23:27 +0100
To: bug-PDF-API2@rt.cpan.org
From: welle ozean welleozean@googlemail.com

On Windows 10, running the latest PDF::API2 generates corrupted files:

use strict;
use warnings;
use PDF::API2;
use PDF::API2::Annotation;
use PDF::API2::Basic::PDF::Utils;

my $pdf = PDF::API2->open('C:\\Users\\WC\\Desktop\\original.pdf');
my $page = $pdf->openpage(1);

my $sticky = $page-> annotation;
$sticky-> text( 'Text in pop-up window',
    -rect => [ 100, 500, 100, 500 ], -open => 1 );
$sticky-> { C } = PDFArray( map PDFNum( $_ ), 1, 0.65, 0 );
$pdf->saveas( 'C:\\Users\\WC\\Desktop\\target.pdf' );

For what it's worth, simply opening the file and calling saveas without any operation in between also generates a corrupted file. By corrupt I mean that the latest Adobe Reader is not able to open it (Error 14).

Mon Dec 23 19:34:32 2019 PMPERRY@cpan.org - Correspondence added

I just tried your code example, and it worked fine for me. The only change was to switch original.pdf to a local known-good PDF that I had lying around. By the latest PDF::API2, do you mean 2.036? Is your original.pdf known to be good (it loads into the reader with no error messages, and the reader does not offer to save it when you quit)? I'm using Adobe Acrobat Reader DC (I think it lives in the Cloud) 19.021.20061, which I just updated yesterday, on Windows 10.

Anyway, do you still get this corruption with a variety of other PDFs?

PhilterPaper commented 4 years ago

Tue Dec 24 09:40:13 2019 welleozean@googlemail.com - Correspondence added

These are my specs: Windows 10, Perl 5.28.1, PDF::API2 2.036, Adobe Acrobat Reader DC 19.021.20061.

All my PDFs open in Adobe Reader with no error message. I extended my tests. All my files have been edited, probably using FoxyReader. All of these files show the same issue after running my script (the original files, as I said, open with no issue). Other files downloaded from the Web for testing still open fine after running the script. At this link, you can find a file that fails: https://filebin.net/2rp3p3xua17twwe1/making_sense_of_NMT.pdf?t=ureuhq16
making_sense_of_NMT.pdf

Tue Dec 24 12:40:32 2019 PMPERRY@cpan.org - Correspondence added

Two problems:

  1. Your PDF is version 1.5, which is likely to cause problems with PDF::API2. It may have structures or data that PDF::API2 has no idea how to handle.
  2. It starts at page 291 and runs to 309 (19 pages). I can't get to any page before 291. It looks like a complete article, but I've never seen this kind of behavior before.

I tried the same code and PDF file with PDF::Builder, and it seems to work (didn't blow up, at least). PDF::Builder is a little more forgiving of post-1.4 items, but not knowing what PDF::API2 is choking on, I can't guarantee that PDF::Builder is working properly. Anyway, you might want to try PDF::Builder (it can be installed alongside PDF::API2) and see if it works for you.

PhilterPaper commented 4 years ago

Fri Jan 03 08:15:53 2020 welleozean [...] googlemail.com - Correspondence added

Thank you for your feedback.

I was able to annotate the same PDF with PDF::Builder, so for this task on similar PDFs, I will use the suggested module.

Fri Jan 03 09:08:31 2020 PMPERRY [...] cpan.org - Correspondence added

It's good to hear that you have a way forward to do your work. It still would be nice to figure out what's going wrong with PDF::API2 so it could be fixed.

Something I didn't mention before is that PDF::Builder also had extensive rewrites of the Annotation functionality, so it's possible that the difference is in the Annotation code rather than in PDF 1.5+ handling.

PhilterPaper commented 4 years ago

Wed Feb 05 17:00:28 2020 steve [...] deefs.net - Correspondence added

Not having a test case (the filebin link no longer works), I'm going to guess from your description that the original PDF has a cross-reference stream in it. PDF::API2 can read those as of 2.026, but can't yet write them. See RT #117184.

You can work around the issue by creating a new PDF and importing the pages from the original file into the new one.
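
A minimal sketch of that workaround, assuming the PDF::API2 2.036-era method names (`open`, `new`, `pages`, `importpage`, `saveas`). The helper name `copy_pages_to_new_pdf` and the file names in the usage note are hypothetical, and this has not been verified against the PDF from this ticket.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical helper: copy every page of $source_path into a brand-new
# document and save it as $target_path. Returns the number of pages copied.
sub copy_pages_to_new_pdf {
    my ($source_path, $target_path) = @_;

    my $source = PDF::API2->open($source_path);
    my $target = PDF::API2->new();

    my $count = $source->pages();
    for my $n (1 .. $count) {
        # importpage(source PDF, source page number, target page number);
        # a target page number of 0 appends the page at the end.
        $target->importpage($source, $n, 0);
    }

    # $source stays in scope until the new file has been written.
    $target->saveas($target_path);
    return $count;
}
```

Usage would be along the lines of `copy_pages_to_new_pdf('original.pdf', 'target.pdf')`; any annotation can then be added to the new document rather than to the original. Because the new file is written from scratch, it should not depend on PDF::API2 updating the original's cross-reference stream. Document-level items such as outlines may not carry over with the pages.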

Wed Feb 05 17:00:29 2020 steve [...] deefs.net - Status changed from 'open' to 'stalled'

PhilterPaper commented 3 years ago

I'm going to go ahead and close (reject) this one, as there is no sign of trouble with PDF::Builder. There is nothing really happening on PDF::API2, either.