Closed horsley closed 2 years ago
Origin xml
<row r="3" spans="4:8">
<c r="D3" s="26" t="s"><v>2189</v></c>
<c r="E3" s="26" t="str"><f t="shared" si="0"/><v>sometext</v></c>
<c r="G3" s="26" t="s"><v>2186</v></c>
<c r="H3" s="26" t="s"><v>2190</v></c>
</row>
After truncateSheetXMLValueOnly
<row r="3" spans="4:8">
<c r="D3" s="26" t="s"><v>2189</v></c>
<v>sometext</v></c>
<c r="G3" s="26" t="s"><v>2186</v></c>
<c r="H3" s="26" t="s"><v>2190</v></c>
</row>
It seems that the bug was wrong regex in https://github.com/tealeg/xlsx/blob/840e9ee9b9adac9706540392005df460771b08cd/lib.go#L1154
It's matched with the tag f
regex can be change to
(?s)<c [^>]*?/[>|c>]
btw, func truncateSheetXMLValueOnly
was quite low efficiency because replace iteration by row and cell.
the logic was detecting rows or cells without value (v tag), why we just using regex to matched them, and replace all
It tasks 20 second on my mac to truncate a xml with 100,0000 rows
Got an error,
xml.Decoder.Decode: XML syntax error on line 2: element <row> closed by </c>
a file with dimension
<dimension ref="A1:I1038378"/>
many rows with empty cells.
I'll try to fix that