PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

GH 87 - adding an annotation destroys the PDF file #87

Closed danisowa closed 5 years ago

danisowa commented 6 years ago

Hello, the following code "destroys" my PDF documents.

I am getting errors when trying to print the created PDF and when trying to save the created PDF (for fast web view).

I have narrowed the problem down to these lines:

    $annotation = $page->annotation();
    $annotation->url("http://www.google.de", %option);

$position_y = 20;
for $cu_page_number (1 .. $pdf->pages())
{
    $page = $pdf->openpage($cu_page_number);
    %option = (
        -border => [0,0,0],
        -rect   => [50,
                    $position_y + 40,
                    10,
                    $position_y,
                   ],
    );
    $annotation = $page->annotation();
    $annotation->url("http://www.google.de", %option);
}
$pdf->saveas('/tmp/test.pdf');

PhilterPaper commented 6 years ago

I'll try to reproduce it later today. Are you using the latest PDF::Builder release (3.009), or an earlier one? When you say it "destroys" the PDF, you're talking about a bad output PDF (/tmp/test.pdf) and not damage to the original input PDF (under a different name, I presume)? Or are you saving it back into the original?

PhilterPaper commented 6 years ago

Here is a complete example, using PDF::Builder 3.009, that appears to work OK. Each of the 12 pages (the 012_pages example is the input) gets a live region (marked with "Google") that opens www.google.de successfully. How does your complete program differ? I am writing out a different file -- I don't think you can overwrite the input file with saveas(), if that's what you did. I haven't tried a save().

use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->open('012_pages.pdf', -compress => 'none');
                  # the default compression 'flate' also works
print "there are ".($pdf->pages())." pages in the input file\n";

# =============================
my $position_y = 20;  # add "my" to all locals
for my $cu_page_number (1 .. $pdf->pages()) {
    print "modifying page $cu_page_number\n";
    my $page = $pdf->openpage($cu_page_number);
    my %option=(
        '-border' => [0,0,0],    # -border doesn't seem to work anywhere
        '-rect'   => [50,
                      $position_y + 40,
                      10,
                      $position_y,
                     ],
    );
    # url() does not permit an icon, so write "Google" there
    my $txt = $page->text();
    my $font = $pdf->corefont("Helvetica");
    $txt->font($font, 20);
    $txt->translate(10, $position_y);
    $txt->text("Google");

    my $annotation = $page->annotation();
    $annotation->url("http://www.google.de", %option);
}
$pdf->saveas('test.pdf');   # was /tmp/test.pdf
# =============================

Maybe there's something in your original PDF (source) that doesn't like annotations or even (in general) being updated?

danisowa commented 6 years ago

Hello, thanks for your investigation. I've tried your code and it is not working for some of my PDFs. Not all PDFs are destroyed. To answer your questions: yes, I'm opening a PDF file from another location, and I should not overwrite the original PDF; for this reason I'm using the saveas method. I have also tried copying the original and then using the update method, and I see the same behaviour. Special characters are destroyed, and it is not possible to save the file with Acrobat (using File --> Save).

I will attach an example PDF later today, along with a "before and after" screenshot.

danisowa commented 6 years ago

Here is a sample document and a before-and-after screenshot: input.pdf, 2018-06-20_06h51_19

PhilterPaper commented 6 years ago

I've been playing with it this evening, and I found a few things of interest. First of all, I don't think the annotation calls have anything to do with it. I removed your annotation lines (and my "Google" print) so that the output should basically be just the input, and it still has the corruption.
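
The stripped-down test described above amounts to a plain open-and-save round trip with no annotation calls at all. A minimal sketch of that test follows, assuming PDF::Builder is installed; the helper name roundtrip and the output file name are hypothetical, and only the open() and saveas() calls already used in this thread are relied on.

```perl
use strict;
use warnings;

# Hypothetical helper: open a PDF and write it straight back out under a
# new name, with no annotation or text added. Returns the output size in bytes.
sub roundtrip {
    my ($in, $out) = @_;
    require PDF::Builder;                 # loaded only when actually called
    my $pdf = PDF::Builder->open($in);    # same open() call as in the example above
    $pdf->saveas($out);                   # write to a different file than the input
    return -s $out;
}

# Usage: perl roundtrip.pl input.pdf roundtrip.pdf
if (@ARGV == 2) {
    my $size = roundtrip(@ARGV[0, 1]);
    print "wrote $ARGV[1] ($size bytes)\n";
}
```

If the output of this round trip already shows the damage, the annotation code can be ruled out as the cause.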

Anyway, if it's possible to start with a PDF 1.4 level document, we could see if the annotation works on that (it does for me). That would suggest that there's something in your input.pdf document that is confusing PDF::Builder, and code to handle it might need to be added. Some PDF writers just slap "1.7" on the header, even if they're producing lower levels, but I suspect there really is something greater than 1.4 here.
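
To see which version a given input claims, the header line can be read directly. The following is a minimal pure-Perl sketch; the helper name pdf_header_version is hypothetical, and it assumes the header appears within the first 1024 bytes of the file. It reports only what the header says: as noted above, some writers stamp a version that does not match the content, and a document catalog may carry its own /Version entry that this check does not look at.

```perl
use strict;
use warnings;

# Hypothetical helper: return the version from a header line such as
# "%PDF-1.7", or undef if no header is found in the first 1024 bytes.
sub pdf_header_version {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "cannot open $path: $!";
    my $head = '';
    read($fh, $head, 1024);               # the header sits at the start of the file
    close($fh);
    return $head =~ /%PDF-(\d+\.\d+)/ ? $1 : undef;
}

# Usage: perl pdfversion.pl input.pdf
if (@ARGV) {
    my $version = pdf_header_version($ARGV[0]);
    print defined($version)
        ? "header claims PDF $version\n"
        : "no PDF header found\n";
}
```

A result above 1.4 does not prove that newer features are present, but it marks the file as a candidate for the problem discussed here.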

PhilterPaper commented 6 years ago

Hi, Did you have a chance to try starting with a PDF 1.4 document, and perhaps seeing if specific 1.5+ elements break PDF::Builder? That would at least give a starting point for any further work in this area. As I said, I don't think it's the annotation itself, but something in the original PDF 1.7 document that isn't being handled well.

PhilterPaper commented 5 years ago

I believe that the issue here is the importation of PDF > 1.4 files, and not annotation. As the issue originator has not responded in over 6 months, I will go ahead and close this issue. I can use the sample input for work on handling PDF > 1.4 files (see #27 ([RT 106020] Bug with recognizing PDF files via open_scalar), #39 ([RT 117210] Error: "can't call method "realise" on an undefined value" while open a pdf-file), #65 ([RT 121832] Invalid PDF file in test suite), #66 ([RT 121911] Adding pages to existing documents doesn't work if page tree is anything but very simple (+ maybe fixed)), and #90 (Problem extracting pages from PDF v. 1.6 documents)).