Open ppodgorskicrido opened 2 years ago
IMO htmlspecialchars should be removed: XMLWriter
According to XML guidelines &
should be escaped by using &
(also <
), but there is no need to escape "
and '
in text.
I do not understand the problem you are reporting. When you open the file in Excel, does the string in the cell not have the correct contents? Also, what harm is caused by the missing "count" attribute?
The problem is that Excel won't open a xlsx file, which contains "
or '
in xl/sharedStrings.xml.
What happend in my case? My app produce xlsx file (using this library). File won't be opened by Excel, so Excel shows that file can be repaired. After the repair, changes are in xl/sharedStrings.xml as below:
- <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" uniqueCount="7372">
+ <sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" count="10925" uniqueCount="7372">
and a lot of changes like:
- <t xml:space="preserve">some text "quotationstext" more text</t>
+ <t xml:space="preserve">some text "quotationstext" more text</t>
- <t xml:space="preserve">some text qout's more text</t>
+ <t xml:space="preserve">some text qout's more text</t>
I don't know what "count" stands for and I hope you can help with that, but the problem with quotations are IMO pretty obvious.
My version of Excel does not have any problem opening a file I created as follows (based largely on the test case in your PR), and the contents of A1 are exactly as expected:
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Writer\Xlsx;
$spreadsheet = new Spreadsheet();
$sheet = $spreadsheet->getActiveSheet();
$sheet->setTitle('sheet1');
$text = "Text contains: single quote: ' ampersand: & double quote:" . ' " ' . 'less than: < greater than: >';
$sheet->setCellValueByColumnAndRow(1, 1, $text);
$sheet->setCellValue('A1', $text);
$writer = new Xlsx($spreadsheet);
$filename = 'issue.2884.xlsx';
$writer->save($filename);
echo "saved $filename\n";
If the same code results in an unreadable file for you, please upload it so that I can look into how it differs from my result.
Ok, you're rigth. I check both on Windows and Ubuntu and generated file is correct. I will send a file soon.
I striped out all unnecessary rows from the excel. This one row is causing problems
You have a problem because there is too much data in the cell. There are limits on the maximum number of characters that a cell can contain - see https://support.microsoft.com/en-us/office/excel-specifications-and-limits-1672b34d-7043-467e-8e27-269d656771c3?ui=en-us&rs=en-us&ad=us#ID0EBABAAA=Excel_2016-2013 You have two cells with a huge amount of data, although I think only J2 exceeds the limit. If I go into the xml and delete some lines from that cell, Excel is able to read the file.
Ok, thank you. I will make new PR tomorrow.
See discussion in PR #2913, which is now closed. The problem that is being reported is real, but it appears to be a problem with Excel, not PhpSpreadsheet.
This is:
I report this bug, because Excel (Office 365) shows that is need to repair xlsx file.
What is the expected behavior?
In xlsx file, file xl/sharedString.xml:
"
should be replaced by"
'
should be replaced by'
What is the current behavior?
In xlsx file, xl/sharedStrings.xml file:
"
and'
is not converted to"
and'
What are the steps to reproduce?
Xlsx file was created from database (UTF-8). Text was extracted from HTML. Text in databes contain
"
and'
, but somehow it is converted to"
and'
.What features do you think are causing the issue
Does an issue affect all spreadsheet file formats? If not, which formats are affected?
It could affect xls file.
Which versions of PhpSpreadsheet and PHP are affected?
PhpSpreadsheet: 1.23 PHP: 8.1.6 Ubuntu: 20.04