Open cmanley opened 10 years ago
Thumbs up for this "fix"!!!
This fix makes it work on my Mac too. I have the same issue with the original (21000+ rows) XSLX — saved from MS Office on a Windows machine — as I have with my 300 rows copy, saved as XSLX with LibreOffice on my Mac.
XMLReader::open(): Empty string supplied as input
So, on my Mac this fix solves XSLX-parsing...
I would like this fix included in this repo so that I can use this repo in a Laravel project I'm working on. Cloning this repo and inserting this fix isn't an option that I would prefer, I would like to receive future fixes etc. without having to maintain it synced.
Is this repo still active? (last commit seven months ago if I recall correctly?) Is there plans to fix existing bugs?
I see this seems to be fixed in 0.6.0-pre-alpha — I'm I correct? Is there any hopes for this fix coming into the master branch in any near future?
I was using cmanley:s "fix" as a ground for making both standard XLSX and LibreOffice saved ones both work with this library.
I actually did my own interpretation of when to use sheetId and when to use the rId-varaiable (attributes with prefix boolean set).
The logic that I presumed working when I elaborated with it was that LibreOffice saved XLSX:s has rId higher than sheetId, so in that case we select the lowest one (sheetId), if not we used the standard rId, so min($rId, $sheetId) did the trick...
I found it working with both LibreOffice saved XSLX:s and original Windows/MS Office/Excel saved ones when I did like this:
public function Sheets()
{
if ($this -> Sheets === false)
{
$this -> Sheets = array();
foreach ($this -> WorkbookXML -> sheets -> sheet as $Index => $Sheet)
{
$AttributesWithPrefix = $Sheet -> attributes('r', true);
$Attributes = $Sheet -> attributes();
$rId = 0;
$sheetId = 0;
foreach ($AttributesWithPrefix as $Name => $Value)
{
if ($Name == 'id')
{
$rId = (int)str_replace('rId', '', (string)$Value);
break;
}
}
foreach ($Attributes as $Name => $Value)
{
if ($Name == 'sheetId') {
$sheetId = (int)$Value;
break;
}
}
$this -> Sheets[min($rId, $sheetId)] = (string)$Sheet['name'];
}
ksort($this -> Sheets);
}
return array_values($this -> Sheets);
}
This is a working symbios for me — works on my machine... ;)
@jonascarlbaum you saved my day!! Why not send a PR with this change?
Great @sangar82
I see you created this pull request. Good. Hope they'll merge it in, if it proves solid in all cases. It works in production in the system I was working on at the time.
However, in all the Excel-files we parsed we were only interested in the first sheet, so I can't promise it is a success on a many sheets Excel-file. Guess it has to be tested...
Please check my fork out and let me know if all issues are solved: https://github.com/cmanley/spreadsheet-reader If they are solved, then I'll create a pull request. I've added test files and a phpunit test to my commit that you can also execute standalone.
This is the diff (whitespace turned off): https://github.com/cmanley/spreadsheet-reader/commit/c6696cffb9f7eecd19cf39063a2e25014294a7c4?w=1
I completely rewrote the sheets extraction part because the original code was doing it wrong, making assumptions about file name case and ids, and it wasn't using the xl/_rels/workbook.xml.rels file to locate the sheet file entry.
This is the commit message / list of changes:
Just wanted to comment that I was having issues with an XLSX file created with Libreoffice / Openoffice and the fork that @cmanley provided worked perfectly so if people are coming across issues with XLSX files I recommend trying this out and, ideally I'd love to see this code merged into the repo. Kudos to @cmanley ! 👍
There are still problems with XLSX files generated by Crystal Reports though that my fork hasn't solved. I haven't done more work on it because I tried the latest version of https://github.com/box/spout and that seems to work perfectly in all cases, and it's also fast.
Good to know! Unfortunately our app is a legacy web application that needs PHP 5.3 and Spout required 5.4+ :(. We are in the process of upgrading it to PHP 5.6 but in the meantime this library + your fix seems to work fine.
@cmanley is more fast, a lot more. I switched to spout because this project is dead
Any chance that the project will be updated now? It all seems dead to me.
Can someone approve the requested changes?
@cmanley thanks for the fix .
@cmanley Thank you for your "FIX"
It works here too on 7.2, Linux Mint.. so thanks.
Thank you for this! I was pulling my hair because of this bug.
This is my test script:
These are the warnings I get, and no row is printed:
My XLSX file contains 1 sheet with only the first 2 fields filled in. Perhaps it's related to LibreOffice, since I created the XLSX file with LibreOffice.
I debugged the problem using a lot of error_logging. It would make life easier if the code was written in a more paranoid style using exceptions. The problem was that SpreadsheetReader_XLSX was trying to open the non-existant sheet2.xml instead of sheet1.xml in the temporary directory.
I don't know much about the XLSX structure and why r:id was being parsed to get the sheet number, but this is my quick fix to SpreadsheetReader_XLSX.php that works for me (the 'fix' block is my fix, the next block is the original):