PHPOffice / PhpSpreadsheet

A pure PHP library for reading and writing spreadsheet files
https://phpspreadsheet.readthedocs.io
MIT License
13.33k stars 3.46k forks source link

Calculation Error - Default to 0 when supplying empty IF result? #2146

Closed amenk closed 9 months ago

amenk commented 3 years ago

This is:

- [x] a bug report
- [x] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

We have quite complex excel sheet which causes an error in calculation. We were able to track down the problem to the example below.

It seems to have something to do with leaving some IF branches empty - but =IF(0<9,1,) works. Especially the calculation works in Libre Office and Excel, so it would be nice if there is a fix.

What is the expected behavior?

Return of 1

What is the current behavior?

PHP Fatal error:  Uncaught PhpOffice\PhpSpreadsheet\Calculation\Exception: Worksheet!A1 -> internal error in projects/mcve/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Cell/Cell.php:275
Stack trace:
#0 projects/mcve/mcve.php(7): PhpOffice\PhpSpreadsheet\Cell\Cell->getCalculatedValue()
#1 {main}
  thrown in projects/mcve/vendor/phpoffice/phpspreadsheet/src/PhpSpreadsheet/Cell/Cell.php on line 275

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
$spreadsheet->getActiveSheet()->setCellValue('A1','=IF(0<9,1,)+IF(0<9,,IF(0=0,,IF(0<9,1,0)))');
echo $spreadsheet->getActiveSheet()->getCell('A1')->getCalculatedValue();

It does seem to work with the expression

 $spreadsheet->getActiveSheet()->setCellValue('A1','=IF(0<9,1,)+IF(0<9,0,IF(0=0,0,IF(0<9,1,0)))');

Even more minmal failing example (okay, that means the one before was not minimal :-))

$spreadsheet->getActiveSheet()->setCellValue('A1','=IF(0<9,,IF(0=0,,1))');

Which versions of PhpSpreadsheet and PHP are affected?

1.18.0

MarkBaker commented 3 years ago

There's already a PR in the pipeline to change the behaviour of the calculation engine for passing empty values.

However, there is also a known issue with empty arguments in the lazy evaluation of IF() functions Issue #2002... if that is causing the problem for you, then you can disable lazy evaluation before executing any calculation, or saving the spreadsheet using:

    $calculationEngine = Calculation::getInstance($spreadSheet);
    $calculationEngine->disableBranchPruning();

until we have a resolution for the lazy evaluation issue.

amenk commented 3 years ago

Thanks @MarkBaker for the feedback & this awesome package.

I can confirm that the disableBranchPruning() works on the above example.

In the real excel sheet (were the formula is even more compelx) I am getting #VALUE! results when trying it to fix that way. Anyways, I was able to fix the excel sheet it self by just inserting the missing "0" values in the IF() and all works fine (I first was afraid there are tons of formulas to fix, but it seems to be only one)

I guess it's best to wait and test after the PR is merged.

amenk commented 2 years ago

We ran into issues where disableBranchPruning() lead to wrong results, we think those are related to IF(x;y;) statements, but are not sure yet.

MarkBaker commented 2 years ago

If disableBranchPruning() isn't working, then it suggests a much more fundamental problem with empty values in if statements; that's going to require some investigation, and some specific examples to show where it fails, and I'm not sure how much time I have for that. I'm currently on sick leave from work, but when I can I'm wrestling with implementing "structured references" in the calculation engine to provide full support for Excel tables

amenk commented 2 years ago

@MarkBaker I was a bit unclear. disableBranchPruning() stops Exceptions from happening when we have those partial-IF clauses, but then some values in a very complex sheet are not updated properly. If I don't use disableBranchPruning, it throws an error.

But: We just fixed the IF clauses and wrote out the default value.Now I don't even need disableBranchPruning anymore and I do get correct results. I also get correct results with disabled branch pruning, but I assume not that it is better to leave if enabled, so to detect potential problems more early.

So for now we have a viable workaround.

MarkBaker commented 2 years ago

It's generally better to leave branch pruning enabled for performance reasons; but it can't determine whether an if clause has a null value or is missing a value, so it doesn't know whether it should populate that value or not. When branch pruning is disabled, it can correctly identify whether the argument is missing or not, and populate it with the default. Those defaults for if are:

amenk commented 2 years ago

Okay, so I understand that miscalculation we were facing is probably something else.