Closed JulioPablo closed 3 years ago
I will take a look. The handling of empty cells and boolean values by Excel is maddening, wildly inconsistent from function to function. I have been concentrating on parameter values of null and booleans in my refactoring of MathTrig and DateTime, but have not looked at null strings. Based on this report, I imagine that those might currently be rejected by PhpSpreadsheet but accepted by Excel. If so, I could get to that after the MathTrig refactoring is complete - the code changes wouldn't be that hard, but adding the test cases ...
I've just verified and created a series of failing unit tests to confirm whether the fix will work or not
I think Excel has some more inconsistencies to worry about. Fill a cell with '=15+""', and you get #VALUE!, not 15. This seems like it ought to give the same result as the reported problem, but it doesn't. At least the result is consistent for '=15+FALSE' (15).
Fill a cell with '=15+""', and you get #VALUE!
I have just confirmed this as well. Guess it only considers the empty string a 0 if it comes from another cell and not literally defined in the formula.
As if this weren't already enough of a pain - LibreOffice and Gnumeric yield #VALUE! whether the null string is used from a cell or directly in a formula (hooray for sensible results!). OpenOffice treats it as 0 in both cases (lesser hooray for consistent results!). I am not aware of other cases where LibreOffice and OpenOffice disagree.
How exactly did you get null string into your cell? At least some of the following ways on an Excel spreadsheet seem to work as I would have expected, even when referenced as cells (i.e. VALUE):
=""
This shows as <c r="B10" t="str"><f>""</f><v/></c>
in the XML fileLEFT(someothercell, 0)
This shows as <c r="A2" t="str"><f>LEFT(A1,0)</f><v/></c>
in XML<c r="B1" s="1"/>
in XML file<c r="A3" t="s"><v>1</v></c>
For all of these cells except your original, using them in a formula causes VALUE in Excel. Only your original defies expectations. Gnumeric actually seems to agree with Excel for all of these, except possibly the last, which it appears not to interpret correctly. LibreOffice agrees with Excel, except that it treats the copy-paste special exactly as the original.
I'm actually not sure, this comes from a report from another department on the organization I work at. I'm supposed to import it. But I'm not sure how they managed to add that empty string, I do know the report is made manually using MS Excel, it isn't generated by another application.
Sorry I can't provide any more useful details on the origin of that null string.
When I use PhpSpreadsheet to read your spreadsheet, it shows the value of cell B1 as null, not a null-string. That would certainly explain some of the results I'm seeing. It also shows the correct calculated value for C1, namely 13.46. So I'm not convinced that this is a PhpSpreadsheet calculation problem. There definitely could be a problem with how it's reading in the value in B1. I am still trying to figure out what it's XML value really says about it.
That's interesting, when I read it with PhpSpreadsheet I get ""
for B1
Okay, let's compare code:
$reader = new ReaderXlsx();
$file = 'issue1977.xlsx';
$spreadsheet = $reader->load($file);
echo "Loaded $file\n";
$sheet = $spreadsheet->getActiveSheet();
$val = $sheet->getCell('B1')->getValue();
var_dump($val);
$val2 = $sheet->getCell('C1')->getCalculatedValue();
var_dump($val2);
Loaded issue1977.xlsx
C:\git\issue1977.php:16:
NULL
C:\git\issue1977.php:18:
double(13.46)
The XML says cell B1 has a style (s="1"), but it doesn't have a value. That is consistent with its being null, not a null string. Is it possible that this has changed between the real file you're trying to import and the truncated version you posted?
I'm pretty sure the cell value is the same between the original and truncated file.
But you're right about your behavior, I managed to replicate it on my end
I was using the PhpOffice\PhpSpreadsheet\Cell\StringValueBinder
value binder :/
I do believe the behavior is still erroneous, because B2 was returning null and not a string.
But definitely would've been ideal if I had mentioned the value binder to being with...
Thanks for the update. It is a relief to know that the problem isn't quite so widespread as it appeared to be at first. It's late here, and I'm not particularly familiar with the value binder. I may have a chance to look again tomorrow with this fresh information in mind, but it may not happen till the weekend.
Summarizing the discussion that has gone before, this problem seems not to be with Calculation, but rather with StringValueBinder. I can see at least 3 possible changes that might be made to PhpSpreadsheet to address this:
I will have to defer to @MarkBaker to decide which, if any, of those possibilities should be implemented. I do not believe that any other changes to PhpSpreadsheet are warranted based on this report. In the meantime, you have 2 possible workarounds:
if ($value === null) {
$cell->setValueExplicit(null, DataType::TYPE_NULL);
return true;
}
This investigation also seems to have uncovered bugs in Gnumeric (handling copied-and-pasted-special cell), and OpenOffice (which handles null string as 0 unlike any of the others, and, in particular, unlike LibreOffice). If I can figure out how to file bug reports for those, I will do so. (I suppose PhpSpreadsheet could emulate the OpenOffice behavior by adding a new compatibility mode for Calculation; that seems like it would require a lot of effort for very little gain.)
Is it okay to close this ticket now?
Well, up to you guys, you're the collaborators.
I guess it is technically not fixed yet? As far as I know, anyway. So up to you how you'd like to handle it.
I have made a couple of suggestions on how you might get your program working (default binder or roll-your-own) whether or not we make any changes. If either of those suggestions works for you, then I will close the ticket. If neither works, tell me why, and I'll try to find an approach that does work for you.
Hey, yeah I actually had already got it working on my project by cloning some of PHPSpreadsheet's code. So on that end I'm good and I deeply appreciate the willingness to help me find a approach that works 👍
I just opened the bug report because I thought it might be of interest for you folks, but again, I leave the actual issue handling up to you, as far as I'm concerned I've got this up and running on my end, not sure if you guys are interested in just documenting the work arounds & findings discussed in this thread or applying a fix at the library level and so on.
PR #2138 was just merged. It documents the workings of StringValueBinder, and also adds some new settable properties which make it easy to meet your expectations, e.g. by specifying that nulls should be left as-is.
This is:
What is the expected behavior?
The cell holding an empty string value
''
\""
is treated as a0
What is the current behavior?
The cell holding the empty string is considered invalid and when the formula cell value is attempted to be calculated
#VALUE!
is returned.What are the steps to reproduce?
If you want to reproduce this, try to get the calculated values of the cells by calling
getCalculatedValue()
. You can also open it on your favorite calc program and check the behavior there and how it does not match. Formula_Test.xlsxFor clarity:
The first row does not work in PHPSpreadsheet The second row does work in PHPSpreadsheet
Both rows work in Excel/Calc
Proposed solution
The culprit of this behavior is this function
https://github.com/PHPOffice/PhpSpreadsheet/blob/master/src/PhpSpreadsheet/Calculation/Calculation.php#L4767-L4803
My proposal is modifying the following
if
statementhttps://github.com/PHPOffice/PhpSpreadsheet/blob/master/src/PhpSpreadsheet/Calculation/Calculation.php#L4777-L4799
To look as follows
More in detail, It would check if the string value is empty and if it is, set
0
as the operand value, this is done in the following lineFurthermore, the following empty string check wouldn't be necessary.
https://github.com/PHPOffice/PhpSpreadsheet/blob/17af13281b7f10294a5480704b684c5a793a5a0c/src/PhpSpreadsheet/Calculation/Calculation.php#L4786
and would simply be replaced for
@MarkBaker pinging you again, I saw you're mentioned in the blame of this file as well and wanted your input on this proposed solution. I'm also not sure if this would suffice for the behavior to match Excel's on all cases, or if this only cover Numeric Binary Operations and other situations would have to also be taken into account, so just wanted to raise that point as well.
Which versions of PhpSpreadsheet and PHP are affected? I tested this on 1.16 (via Laravel Excel) but the problematic code, which I linked here is clearly still present on the latest version.