Closed olivbrau closed 2 months ago
I cannot reproduce this problem locally. Can you upload a copy of the xlsx file that caused the error?
ExcelFileWithCRLF.xlsx Sure. Can you try with this one ? (don't open-save the file with excel, because excel automatically delete all the CRLF)
Hello,
Thanks a lot for the fix.
However, I'm wondering if this fix also the 2nd bug I mentioned : in the sharedStrings.xml, if there are CRLF after <si>
tag, the shared string read is false (there are no exception thrown however) : the string contains CRLF, because readUpTo()
reads every characters between <si>
and </si>
, including the CRLF before <t>
and after </t>
ex. if the string is written like this (like in the excel file I uploaded) :
`
I may not have noticed the second issue you mentioned, I will take the time to look into it. But probably not soon because currently I have some things to do outside of the coding world.
This should be fully fixed since #419 was merged,don't you think? @olivbrau
Hi, I'm trying to open an xslx file. I get this error : Current state not START_ELEMENT, END_ELEMENT or ENTITY_REFERENCE This error lies in
OPCPackage.extractFormat()
: In the while loop, the instructionif ("numFmt".equals(reader.getLocalName())) {
creates en error because the current token type of the reader is CHARACTERS, and thengetLocalName()
create an exception. It is because my xlsx file has new lines (CRL+LF) in the styles.xml and so, when we are in the loop that comes after getting the<cellXFS>
,insideCellXfs
is true, but we can't callgetLocalName()
since there are CHARACTERS (the new line). I hope I'm clear in spite of my bad english.To get this error, take a valid xlsx file, and put CR+LF after each tag in the styles.xml Excel doesn't put CRLF in the xml it creates, but my xlsx file comes from another tools which put theses CRLF. I think that xlsx format doesn't forbid this, so fastexcel should consider this possibility.
Another side effect of adding CRLF after each tag, is when fastexcel reads sharedStrings.xml : instead of keeping the text in
<t>
tags, it retrieves all the text between<si>
tags, so including all the CRLF. As a consequence, the strings returned are not the good one (but it doesn't crash the reading, compare to the styles.xml problem explained above).