ssimms / pdfapi2

Create, modify, and examine PDF files in Perl

Fix the opening and import of some PDF files #28

Open xazzzz opened 4 years ago

xazzzz commented 4 years ago

I have a bunch of PDF files that each contain only a single drawing in a MediaBox. These PDFs were generated with SpaceClaim and Catia. Other PDF software opens them, but pdfapi2 does not.

Trying to open them first produces this error: Can't call method "val" on an undefined value at /usr/share/perl5/PDF/API2.pm line 909. So I added a defined check in API2.pm.

Then, when I try to import the first page, I get: Can't locate object method "find_prop" via package "PDF::API2::Basic::PDF::Dict" at /usr/share/perl5/PDF/API2/Basic/PDF/Pages.pm line 272.

So I added an exact copy of find_prop from Pages.pm to Dict.pm, and that solved my problem.
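For readers unfamiliar with inherited page properties, here is a minimal sketch of the kind of lookup involved: check the node itself, then walk up its /Parent chain. It uses plain hashes rather than PDF::API2 objects, and find_inherited is a hypothetical name chosen for illustration; it is not the find_prop code from Pages.pm.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a page tree, built from plain hashes.
# Neither node carries a /Type entry, as in the file described here.
my $pages = { Count => 1, Resources => { ProcSet => ['PDF'] } };
my $page  = { MediaBox => [0, 0, 1224, 792], Parent => $pages };
$pages->{Kids} = [$page];

# Look up a property on the node itself, then on each ancestor via /Parent.
# Returns undef when no node in the chain defines it.
sub find_inherited {
    my ($node, $prop) = @_;
    while (defined $node) {
        return $node->{$prop} if defined $node->{$prop};
        $node = $node->{Parent};
    }
    return;
}

my $box = find_inherited($page, 'MediaBox');   # found on the page itself
my $res = find_inherited($page, 'Resources');  # found on the parent
my $rot = find_inherited($page, 'Rotate');     # not defined anywhere
```

The point of the sketch is that the lookup only needs hash access and a /Parent link, so it works the same whether or not a node was classified as a page or a page-tree node.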

The file looks like:

%PDF-1.4
%<95><95><95>æ
1 0 obj
<</CreationDate(D:20171218132646-05'00') /Creator(SpaceClaim) /Producer(SpaceClaim)>>
endobj
2 0 obj
<</Pages 3 0 R>>
endobj
3 0 obj
<</Count 1 /Kids[4 0 R]>>
endobj
4 0 obj
<</MediaBox[0 0 1224 792] /Parent 3 0 R /Contents 5 0 R /Group <</CS/DeviceRGB /S/Transparency /I false /K false>> /Resources <</ProcSet [/PDF/Text/ImageB/ImageC/ImageI] /ExtGState <</GS0 6 0 R /GS1 7 0 R>> /XObject <</I0 9 0 R>>>>>>
endobj
5 0 obj
<</Length 2665694>>
stream

coveralls commented 4 years ago

Coverage decreased (-0.04%) to 57.54% when pulling c813375071deab8a297cc8a6d1493f747be0a5fc on xazzzz:master into 4cb8fced63f3efe73046af1539cba3f2fa11400f on ssimms:master.

PhilterPaper commented 4 years ago

What version of PDF::API2 is this using? I think this may have been fixed a couple releases ago (2.037), if it's the same bug as RT 130722 or RT 131147. It sounds quite familiar. Can you confirm that it's neither of those bugs? The first is still open, but the second was patched.

xazzzz commented 4 years ago

It's definitely not RT 131147, since I've tried it with 2.037. It might look like RT 130722, but only for the second part, because to get to that point I had to apply the first fix. Since the page root is a page, when it looks at the kids of the root, none of them defines 'Type'.

PhilterPaper commented 4 years ago

Another thought... is this possibly the same as PR #21?

xazzzz commented 4 years ago

I had two problems. The first fix, in API2.pm, allowed me to start processing the file, but then I hit the missing find_prop error from Pages.pm, which led me to the second change.

PR #21 looks a lot like the second part of my fix, but I still need the first one to open the file.

draxil commented 4 years ago

The other issue (Can't call method "val" on an undefined value at /usr/share/perl5/PDF/API2.pm line 909.) does bear a similarity to PR #21, in that PDF::API2 is a little less tolerant of "bad PDFs" than other tools.

A lot of the discussion in trying to accept PR #21 centered around whether becoming more tolerant of bad structure would be desirable or not. Looking at this other issue it does seem like a similar patch making the code a little more defensive could be added, but the same debate would probably ensue?

PhilterPaper commented 4 years ago

A lot of the discussion in trying to accept PR #21 centered around whether becoming more tolerant of bad structure would be desirable or not.

My philosophy in PDF::Builder has been to fix up bad structure (read-in PDF file) wherever it can be unambiguously done, but issue a warning message (i.e., never silently patch it). A user needs to know that they have a defective PDF on their hands, for whatever reason, even if it's produced by a well-established commercial product that happens to not follow the PDF standards.
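As a rough illustration of that warn-and-repair approach, applied to the missing 'Type' entries discussed in this thread: the sketch below uses plain hashes, and type_or_repair is a hypothetical helper, not code from PDF::Builder or PDF::API2. The inference rule (a node with /Kids is a page-tree node, a node with /Contents is a page) is an assumption made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return a node's /Type. When the entry is missing,
# infer it, record it on the node, and warn instead of patching silently.
sub type_or_repair {
    my ($node, $label) = @_;
    return $node->{Type} if defined $node->{Type};

    # Assumed rule: /Kids marks a page-tree node, /Contents a leaf page.
    my $guess = exists $node->{Kids}     ? 'Pages'
              : exists $node->{Contents} ? 'Page'
              :                            undef;
    die "$label: missing /Type and cannot infer it\n" unless defined $guess;

    warn "$label: missing /Type, assuming /$guess\n";
    $node->{Type} = $guess;
    return $guess;
}

# Stand-ins for objects 3 and 4 of the sample file; neither has /Type.
my $root = { Count => 1, Kids => [] };
my $leaf = { MediaBox => [0, 0, 1224, 792], Contents => 'stream' };

# Collect the warnings so the caller can see what was repaired.
my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    type_or_repair($root, 'object 3');
    type_or_repair($leaf, 'object 4');
}
print @warnings;
```

When the structure is ambiguous the helper dies rather than guessing, which matches the "only where it can be unambiguously done" condition.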

draxil commented 4 years ago

Makes a lot of sense!
