PHPOffice / PHPExcel

ARCHIVED
Other
11.46k stars 4.19k forks source link

Excel 2007 Reader freezes because of conditional formatting #575

Closed thomvaill closed 8 years ago

thomvaill commented 9 years ago

PHPExcel version: 1.8 PHP version: 5.3.10-1ubuntu3 with Suhosin-Patch

Code causing the issue:

$reader = new PHPExcel_Reader_Excel2007();
$source = $reader->load('test.xslx');

The Excel file causing the issue contains the following conditional formatting: conditionals-bug-example

Details: When I load this Excel file (in non-read-only mode), the script takes 100% CPU and the memory load increases radically. The load() method of the Reader never ends. I tried to debug a bit, and at first sight, it seems that the slowness comes frome here :

// PHPExcel/Reader/Excel2007.php
// Class PHPExcel_Reader_Excel2007
// Method load()
// Lines 959 - 987

foreach ($conditionals as $ref => $cfRules) {
    ksort($cfRules);
    $conditionalStyles = array();
    foreach ($cfRules as $cfRule) {
        $objConditional = new PHPExcel_Style_Conditional();
        $objConditional->setConditionType((string)$cfRule["type"]);
        $objConditional->setOperatorType((string)$cfRule["operator"]);

        if ((string)$cfRule["text"] != '') {
            $objConditional->setText((string)$cfRule["text"]);
        }

        if (count($cfRule->formula) > 1) {
            foreach ($cfRule->formula as $formula) {
                $objConditional->addCondition((string)$formula);
            }
        } else {
            $objConditional->addCondition((string)$cfRule->formula);
        }
        $objConditional->setStyle(clone $dxfs[intval($cfRule["dxfId"])]);
        $conditionalStyles[] = $objConditional;
    }

    // Extract all cell references in $ref
    $aReferences = PHPExcel_Cell::extractAllCellReferencesInRange($ref);
    foreach ($aReferences as $reference) {
        $docSheet->getStyle($reference)->setConditionalStyles($conditionalStyles);
    }
}

In particular this line : $aReferences = PHPExcel_Cell::extractAllCellReferencesInRange($ref); is really slow. Its first execution is fine, but the second never ends (when called with $ref = 'A1:J1048576').

Has anyone already experienced this issue? Is it a PHPExcel bug, or does it come from my conditionals formatting, which are over the whole sheet?

MarkBaker commented 9 years ago

If you consider that the line $aReferences = PHPExcel_Cell::extractAllCellReferencesInRange($ref); is building an array of 10 (A-J) * 1048576 (1-1048576) cell addresses (a total of 10,485,760 entries), then it is going to take a while and use a lot of memory doing so.

While combining the lines:

    $aReferences = PHPExcel_Cell::extractAllCellReferencesInRange($ref);
    foreach ($aReferences as $reference) {
        $docSheet->getStyle($reference)->setConditionalStyles($conditionalStyles);
    }

as

    foreach (PHPExcel_Cell::extractAllCellReferencesInRange($ref) as $reference) {
        $docSheet->getStyle($reference)->setConditionalStyles($conditionalStyles);
    }

would improve things a bit, it's not going to be fast building that list and then iterating over that many cells, applying the conditionals to each in turn

Being able to use a Generator for PHPExcel_Cell::extractAllCellReferencesInRange() would help further, but that requires a minimum PHP version of 5.5.0

Vitexus commented 9 years ago

17 rows and 18 columns xslx file causes:

Fatal error: Allowed memory size of 4244635648 bytes exhausted (tried to allocate 234881040 bytes) in (...)/classes/PHPExcel/Cell.php on line 909

$returnValue contain 14680064 items and array_unique can't handle it.

dmeijboom commented 8 years ago

If you only want to read the data you can use the following snippet (for the excel reader):

$objReader = PHPExcel_IOFactory::createReader("Excel2007");
$objReader->setReadDataOnly(true);

This skips the formatting options and loads a lot faster! Since it skips formatting it doesn't freeze anymore.

rentalhost commented 8 years ago

It's a big problem here.

Why it happen/is need?

rentalhost commented 8 years ago

I found a solution. Just replace (file Classes/PHPExcel/Reader/Excel2007.php, line 985):

$aReferences = PHPExcel_Cell::extractAllCellReferencesInRange($ref);
foreach ($aReferences as $reference) {
    $docSheet->getStyle($ref)->setConditionalStyles($reference);
}

By:

$docSheet->getStyle($ref)->setConditionalStyles($conditionalStyles);

I don't know why the current method is used, because in this case, the conditional styles should be copied only. In current version, if you have a selected area with 100 cells, for instance, by with an unique conditional formatting, the result file will have 100 different definitions to this same conditional formatting.