tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+
Other
1.45k stars 406 forks source link

Exception thrown when loading an XLSX file containing defined names without `localSheetId` #685

Open doomlaur opened 1 year ago

doomlaur commented 1 year ago

I tested XLNT with some test files and noticed issues with the following file I downloaded from the internet: EBA Validation Rules March 2020 deactivation.xlsx

I got the following exception: xl/workbook.xml:2:2581: error: attribute 'localSheetId' expected

In xl\workbook.xml the following piece of XML causes issues:

<definedNames>
    <definedName name="_xlnm._FilterDatabase" localSheetId="9" hidden="1">v2.1!$A$2:$Z$2</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="8" hidden="1">v2.2!$A$1:$T$2679</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="7" hidden="1">'v2.3 (2.3.1)'!$A$1:$V$2864</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="6" hidden="1">'v2.3.2 (Finrep Ind Only)'!$A$2:$Z$2</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="5" hidden="1">'v2.4 (2.4.1)'!$A$1:$V$3169</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="4" hidden="1">v2.5!$A$2:$Z$3193</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="3" hidden="1">v2.6!$A$1:$W$3436</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="2" hidden="1">v2.7.0.1!$A$1:$Y$4528</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="1" hidden="1">'v2.8 (2.8.1)'!$A$1:$Z$5681</definedName>
    <definedName name="_xlnm._FilterDatabase" localSheetId="0" hidden="1">'v2.9 (2.9.1)'!$A$1:$Z$7130</definedName>
    <definedName name="LATESTRULESFORREF">#REF!</definedName>
    <definedName name="POTENTIALCHANGES">#REF!</definedName>
    <definedName name="POTENTIALFORMULAE">#REF!</definedName>
</definedNames>

The part that causes issue is that XLNT requires localSheetId to exist, while according to the official ECMA-376 that XML attribute is optional:

<xsd:attribute name="localSheetId" type="xsd:unsignedInt" use="optional"/>

To be precise, the following code in xlsx_consumer::read_office_document (which has been introduced in this pull request) causes issues:

else if (current_workbook_element == qn("workbook", "definedNames")) // CT_DefinedNames 0-1
{
    while (in_element(qn("workbook", "definedNames")))
    {
        expect_start_element(qn("spreadsheetml", "definedName"), xml::content::mixed);

        defined_name name;
        name.name = parser().attribute("name");
        name.sheet_id = parser().attribute<std::size_t>("localSheetId"); // !!! HERE is the issue !!!
        name.hidden = false;
        if (parser().attribute_present("hidden"))
        {
            name.hidden = is_true(parser().attribute("hidden"));
        }
        parser().attribute_map(); // skip remaining attributes
        name.value = read_text();
        defined_names_.push_back(name);

        expect_end_element(qn("spreadsheetml", "definedName"));
    }
}

Could you please fix this bug before releasing XLNT 1.6? Thank you for developing this great library! 😃

doomlaur commented 1 year ago

This issue would be fixed by merging #686

doomlaur commented 4 months ago

Just for the record: because I did not need support for the definedNames introduced in Pull Request #650, I was able to fix my issues some time ago by simply compiling XLNT from the last commit made before the pull request has been merged. To be precise, I compiled XLNT from commit f1107a5b0f32857e40d17ab4989009ad92377c1e made on August 13, 2022 at 22:21:48. So I simply did a checkout in Git exactly to that commit and compiled XLNT from it, excluding the commits after it. In case you don't need definedNames, you can do the same - otherwise, Pull Request #686 should fix the issues (but, to be honest, I did not test it myself yet).