google-code-export / pdfium

Automatically exported from code.google.com/p/pdfium

COS Stream parsing sometimes fails if /Length value is wrong #57

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
== What steps will reproduce the problem?
1. Render Hello_broken.pdf using pdfium. Note that no text is shown.
2. Open Hello_broken.pdf using Adobe. Note that the text "Hello, world" is shown. 
You may get a dialog warning that the PDF is corrupt.
(Hello.pdf is a non-corrupt version of the same PDF.)

== What is the expected output? What do you see instead?
- Expect pdfium to recover when the /Length value is wrong, as it already does 
in some cases, and show the text as Adobe does.

== What version of the product are you using? On what operating system?
pdfium on any branch after 2085, including master branch. OS doesn't matter.

== Please provide any additional information below.

The bug is in core/src/fpdfapi/fpdf_parser/fpdf_parser_parser.cpp
in CPDF_SyntaxParser::ReadStream

Large values of /Length cause the stream parsing code to give up immediately 
and return NULL, whereas missing or otherwise incorrect values of /Length are 
recovered from.
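
To illustrate the recovery being asked for, here is a minimal, self-contained sketch. It is not pdfium's code and not the patch referenced in this report. `ResolveStreamLength` and `kNoLength` are hypothetical names, and the whole file is assumed to be in one in-memory buffer. The idea is to trust /Length only when it stays inside the file and is followed by `endstream`, and otherwise to scan for the keyword, so an oversized value gets the same treatment as a missing one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical sentinel: "no /Length entry" on input, "no end found" on output.
const std::size_t kNoLength = std::string::npos;

// Hypothetical helper (not a pdfium function). `data_start` is the offset of
// the first byte after the "stream" keyword and its end-of-line marker.
// Returns the number of stream-data bytes, or kNoLength if none can be found.
std::size_t ResolveStreamLength(const std::string& file,
                                std::size_t data_start,
                                std::size_t declared_len) {
  const std::string kEndStream = "endstream";
  if (data_start > file.size()) return kNoLength;

  // Trust the declared length only if it fits in the file and "endstream"
  // follows it, optionally preceded by one end-of-line marker.
  if (declared_len != kNoLength && declared_len <= file.size() - data_start) {
    std::size_t pos = data_start + declared_len;
    if (pos < file.size() && file[pos] == '\r') ++pos;
    if (pos < file.size() && file[pos] == '\n') ++pos;
    if (file.compare(pos, kEndStream.size(), kEndStream) == 0)
      return declared_len;
  }

  // Fallback for missing, too-small, and too-large values alike:
  // scan forward for the keyword instead of giving up.
  std::size_t end = file.find(kEndStream, data_start);
  if (end == std::string::npos) return kNoLength;
  assert(end >= data_start);

  // Drop the single end-of-line marker that precedes "endstream".
  if (end > data_start && file[end - 1] == '\n') --end;
  if (end > data_start && file[end - 1] == '\r') --end;
  return end - data_start;
}
```

A plain keyword scan can stop early if the stream data itself contains the bytes `endstream`, so a real parser would likely want extra checks, such as requiring `endobj` to follow.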

Google employees can see two additional pieces of information:
-  Change 78141191 shows a patch I made to a copy of pdfium I maintain, which 
fixes the problem. It tries to keep the change minor; you may or may not want 
to duplicate it.
-  Bug 17492449 contains another PDF file (a real-world example) that is 
affected by this bug: a font file isn't loaded, which causes the text to render 
as gibberish.

Original issue reported on code.google.com by ol...@google.com on 21 Oct 2014 at 3:25


GoogleCodeExporter commented 9 years ago

Original comment by pal...@chromium.org on 21 Oct 2014 at 9:49

GoogleCodeExporter commented 9 years ago
@olsen - prior to landing our proposed patch, I tried opening Hello_broken.pdf 
with the unpatched code, and I see Hello World on the page.  Could you update 
the example so that it fails?  If so, making an automated test for this becomes 
possible via the text page API and counting characters.

Original comment by tsepez@chromium.org on 27 Jan 2015 at 7:52

GoogleCodeExporter commented 9 years ago
This may have already been corrected at 
https://codereview.chromium.org/743263002

Original comment by tsepez@chromium.org on 27 Jan 2015 at 7:57

GoogleCodeExporter commented 9 years ago
Indeed.  Parallel effort tracked in chromium tracker.  Sorry about the 
confusion.

Original comment by tsepez@chromium.org on 27 Jan 2015 at 8:01