tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+
Other
1.49k stars 418 forks source link

Unknown attribute 'ref' in `parser::pop_element()` for certain files #514

Closed WvA1979 closed 2 years ago

WvA1979 commented 4 years ago

I am using XLNT to read an excel file and am experiencing an error upon just opening the file with load (xlnt::workbook load) ;

The error message is : xl/worksheets/sheet1.xml:2:2048: error: unexpected attribute 'ref'

which occurs when popping an element ;

The error is the result of the formula in cell Z2 (attached file T1b) ; However whenever I just manually reexecute (select the cell, press enter) the formula (resulting in file T1), the error message disappears and the file reads just fine ;

T1b.xlsx T1.xlsx

Of course I can not manually go through each formula in the real excel files I would like to parse, so any workaround would be welcome ;

PS:

Presumably this may be related to Issue #436 and/or #433 -- I did compile the latest version of xlnt, so I guess it is an error akin to the earlier ones, but not entirely identical

WvA1979 commented 4 years ago

The difference in the XML files behind it is :

T1 :
<c r="Z2"><f>H2*B2+O2*I2+V2*P2+10000*X2</f><v>144122.5</v></c>
T1b :
<c r="Z2"><f t="shared" ref="Z2" si="0">H2*B2+O2*I2+V2*P2+10000*X2</f><v>144122.5</v></c>
WvA1979 commented 4 years ago

I managed to bypass by changing xlsx_consumer.cpp lines to (added code with **)

xlnt::detail::Cell parse_cell(xlnt::row_t row_arg, xml::parser *parser)
{
    xlnt::detail::Cell c;
    for (auto &attr : parser->attribute_map())
    {
        if (string_equal(attr.first.name(), "r"))
        {
            c.ref = xlnt::detail::Cell_Reference(row_arg, attr.second.value);
        }
        else if (string_equal(attr.first.name(), "t"))
        {
            c.type = type_from_string(attr.second.value);
        }
        else if (string_equal(attr.first.name(), "s"))
        {
            c.style_index = static_cast<int>(strtol(attr.second.value.c_str(), nullptr, 10));
        }
        else if (string_equal(attr.first.name(), "ph"))
        {
            c.is_phonetic = is_true(attr.second.value);
        }
        else if (string_equal(attr.first.name(), "cm"))
        {
            c.cell_metatdata_idx = static_cast<int>(strtol(attr.second.value.c_str(), nullptr, 10));
        }
    }
    int level = 1; // nesting level
        // 1 == <c>
        // 2 == <v>/<f>
        // 3 == <is><t>
        // exit loop at </c>
    while (level > 0)
    {
        xml::parser::event_type e = parser->next();
        switch (e)
        {
        case xml::parser::start_element: {
            ++level;
            break;
        }
        case xml::parser::end_element: {
            --level;
            break;
        }
        case xml::parser::characters: {
            // only want the characters inside one of the nested tags
            // without this a lot of formatting whitespace can get added
            if (level == 2)
            {
                // <v> -> numeric values
                if (string_equal(parser->name(), "v"))
                {
                    c.value += std::move(parser->value());
                }
                // <f> formula
                else if (string_equal(parser->name(), "f"))
                {
                    c.formula_string += std::move(parser->value());
                    **// kick out any further attributes
                    for (const auto &attr : parser->attribute_map()){
                        (void)attr;                     
                    }**
                }
            }
            else if (level == 3)
            {
                // <is><t> -> inline string
                if (string_equal(parser->name(), "t"))
                {
                    c.value += std::move(parser->value());
                }
            }
            break;
        }
        case xml::parser::start_namespace_decl:
        case xml::parser::end_namespace_decl:
        case xml::parser::start_attribute:
        case xml::parser::end_attribute:
        case xml::parser::eof:
        default: {
            throw xlnt::exception("unexcpected XML parsing event");
        }
        }
    }
    return c;
}

With this I fall back on the "dragged formula" issue, wherein the xml code contains "formulas" of the kind (see T1c) T1c.xlsx

nschlia commented 3 years ago

The modification mentioned by WvA1979 does not fix the problem on my side. I found a report which exactly describes my problem here:

https://www.gitmemory.com/issue/tfussell/xlnt/436/592692138

It seems when you take a table that perfectly loads with XLNT, then duplicate a row or column by selecting more cells, the resulting spreadsheet will cause the exception to be raised. There's no other way to fix it but coping all other cells to a fresh, empty spread sheet. Then filling in the new fields manually or copying then one by one.

WvA1979 commented 3 years ago

Hi nschlia,

I had two issues. I managed to fix one, not the other. XLNT seems to work best with a formula free Excel file.

nschlia commented 3 years ago

Hi nschlia,

I had two issues. I managed to fix one, not the other. XLNT seems to work best with a formula free Excel file.

Formulas alone do not hurt, only when you duplicate them as described. Copying them with CTRL-C and V also works. The problem is you cannot disallow users to do the drag down trick to fill new rows in :)

nschlia commented 3 years ago

I have cloned the project and added a workaround, i.e, the offending attributes (and only those) are simply ignored. Probably these need no further handling and that could remain the final solution. Thats up to @tfussell. Maybe he wants to take my code as a starter to fix this issue. Anyway, the problem files can be loaded with these additions.

doomlaur commented 2 years ago

I can confirm that this issue is present in xlnt v1.5.0 (latest stable release). The fix from @nschlia worked very well, thank you! However, compiling xlnt from the latest commit (currently 3a279fc from 2021-08-22) fixed this issue, so it seems that no fix is needed anymore.

tfussell commented 2 years ago

Closing as a duplicate of https://github.com/tfussell/xlnt/issues/436. This should be resolved.