stephen-hardy / xlsx.js

XLSX.js is a JavaScript library for converting the data in base64 XLSX files into JavaScript objects - and back! Please note that this library is licensed under the Microsoft Office Extensible File License - a license NOT approved by the OSI. While this license is based off of the MS-PL, which is OSI-approved, there are significant differences.
http://blog.innovatejs.com/?tag=xlsx-js
Other
575 stars 122 forks source link

Wrong split of sharedString in <t xml:space="preserve">foo</t> #4

Closed ghost closed 11 years ago

ghost commented 11 years ago

I have a xlsx file containing the following XML in xl/sharedStrings.xml:

<si>
    <t xml:space="preserve">foo</t>
</si>

I have no idea for what the "preserve" tag is good for, but the sharedStrings are not split correctly:

s = zip.files['xl/sharedStrings.xml'].data.split('<t>');

I changed the code using a regex for the split:

//{ Process sharedStrings
    sharedStrings = [];
    s = zip.files['xl/sharedStrings.xml'].data.split(/<t.*?>/g);
    i = s.length;
    while(--i) { sharedStrings[i - 1] = s[i].substring(0, s[i].indexOf('</t>')); } // Do not process i === 0, because s[0] is the text before first t element
//}

Note that I also added a </t> to find the end of the string. This should work according to the following examples:

'<t xml:space="preserve">foo</t>'.split(/<t.*?>/g)
[ '', 'foo</t>' ]

'<t>foo</t>'.split(/<t.*?>/g)
[ '', 'foo</t>' ]

I changed this in my code and that was the fix for me. This seems to be an issue in both the 1.x and 2.0 version of xlsx.js.

stephen-hardy commented 11 years ago

Hello maitscha,

Thank you for telling us about the issue, and for contributing your fix! The "preserve" tag tells the application to preserve white space on either end of the string. It would mean the difference between " foo " and "foo". However, the tag can be present on an element representing a string without white space, without any harmful effect.

This fix will be incorporated into the next version of XLSX.js, which should be published soon.

Thanks, Stephen

stephen-hardy commented 11 years ago

Michl-me incorporated this fix into a pull request, which has now been merged. Thanks!

isharamahesh commented 9 years ago

Hi, i checked this option with your new files. I cannot find this option.