wwxxyx / pdfium

Automatically exported from code.google.com/p/pdfium

Potential crash in fpdf_text_int.cpp:CPDF_TextPage::ProcessMarkedContent #67

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
I've been unable to pull the page out of the PDF that repros this, and I can't 
attach the PDF to this bug as it is private data. I trimmed the page stream 
down to the following smallest content that repros the issue.

1. Create a new PDF
2. Set the page stream to have this content:
/Part <<
/MCID 0
>> 
BDC /OC BMC 0 1 0.66 0.35 k /GS0 gs 29.173 719.172 624.001 42.001 re f 0 0 0 0 
k 17.173 17.173 564 600 re f EMC 

3. Load it into PDFium

What is the expected output? What do you see instead?
Expected: No crash when loading the text page.
Actual: Crash when loading the text page.

What version of the product are you using? On what operating system?
Mac OS X, Linux and Windows

Please provide any additional information below.

In the named function there is this loop:
    for (n = 0; n < nContentMark; n++) {
        CPDF_ContentMarkItem& item = pMarkData->GetItem(n);
        CFX_ByteString tagStr = (CFX_ByteString)item.GetName();
        pDict = (CPDF_Dictionary*)item.GetParam();
        CPDF_String* temp = (CPDF_String*)pDict->GetElement(FX_BSTRC("ActualText"));
        if (temp) {
            actText = temp->GetUnicodeText();
        }
    }

If the type of the CPDF_ContentMarkItem is None, then the dictionary you get 
from item.GetParam() is NULL, so the pDict->GetElement() call on the next line 
dereferences a null pointer and crashes.

I believe this will fix the issue:

    for (n = 0; n < nContentMark; n++) {
        CPDF_ContentMarkItem& item = pMarkData->GetItem(n);
+        if (item.GetParamType() == CPDF_ContentMarkItem::ParamType::None) continue;
        CFX_ByteString tagStr = (CFX_ByteString)item.GetName();
        pDict = (CPDF_Dictionary*)item.GetParam();
        CPDF_String* temp = (CPDF_String*)pDict->GetElement(FX_BSTRC("ActualText"));
        if (temp) {
            actText = temp->GetUnicodeText();
        }
    }
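
For anyone who wants to see the guard in isolation, here is a minimal standalone 
sketch of the same pattern. It does not use the PDFium headers: MarkItem, 
ParamType, Dict and FindActualText are hypothetical stand-ins I made up for 
illustration, and the dictionary is just a std::map. I am assuming a mark opened 
with BMC (tag only, no property list) is what ends up with the None type.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the PDFium types; NOT the real API.
enum class ParamType { None, PropertiesDict, DirectDict };

using Dict = std::map<std::string, std::string>;

struct MarkItem {
    std::string name;   // tag of the marked-content sequence, e.g. "OC"
    ParamType type;     // None when the mark has no property list
    const Dict* param;  // null when type == ParamType::None
};

// Returns the last ActualText found among the marks, or "" if there is none.
std::string FindActualText(const std::vector<MarkItem>& marks) {
    std::string actText;
    for (const MarkItem& item : marks) {
        // The guard: skip marks that carry no dictionary instead of
        // dereferencing a null pointer.
        if (item.type == ParamType::None || item.param == nullptr) continue;
        auto it = item.param->find("ActualText");
        if (it != item.param->end()) actText = it->second;
    }
    return actText;
}
```

The extra null check on param is belt and braces: it keeps the loop safe even if 
a mark reports a non-None type but was never given a dictionary.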

Original issue reported on code.google.com by darkdesc...@gmail.com on 29 Oct 2014 at 7:34