tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+
Other
1.43k stars 402 forks source link

Multiple exceptions when `count` does not match the number of elements #735

Open doomlaur opened 1 month ago

doomlaur commented 1 month ago

I noticed that xlsx_consumer.cpp throws exceptions in multiple cases when count does not match the number of elements. This causes issues because:

  1. According to ECMA-376, in each of these cases, the "count" attribute is optional:
    <xsd:attribute name="count" type="xsd:unsignedInt" use="optional"/>
  2. I noticed issues when trying to open files exported by SAP software, where the exporter sets count="1" but the actual count is 2. Of course this is wrong and is actually an issue of the SAP software - however, Microsoft Excel is still able to open the file correctly. For users of our software, it looks like our software is broken because it cannot open the file (while Excel can), since users do not know that the SAP software is actually the broken one :sweat_smile: On the other hand, while it is not nice to accept obviously broken files, there is no actual disadvantage for XLNT to simply ignore the fact that the count does not match.

My proposed solution:

  1. Check whether the "count" attribute exists - if it does, read it.
  2. If the count is known, call std::vector::reserve to allocate enough memory right away (small performance improvement, because why not?).
  3. Ignore exceptions due to invalid count by default. However, if THROW_ON_INVALID_XML is defined and the count is known, throw an exception, just like before. In other words, if a very strict XML parser is desired, exceptions can still be enabled.

I will send a pull request containing the proposed changes.

doomlaur commented 1 month ago

Pull Request #736 fixes the issues I had.

mankaixin commented 1 month ago

a new question in this restoration,When you insert data, if it's a number, no problem, if it's a string then there's a problem, it's going to be a different character than what you inserted. I found that the tag number corresponding to sheeting.xml is incorrect.

mankaixin commented 1 month ago

if excel is size don't match ,i write some word to excel ,but excel word is error .i found error is here

//default allow_duplicates false
std::size_t workbook::add_shared_string(const rich_text &shared, bool allow_duplicates)
{
    register_workbook_part(relationship_type::shared_string_table);

    if (!allow_duplicates)
    {
        auto it = d_->shared_strings_ids_.find(shared);

        if (it != d_->shared_strings_ids_.end())
        {
            return it->second;
        }
    }

    auto sz = d_->shared_strings_ids_.size();

    d_->shared_strings_ids_[shared] = sz;
    d_->shared_strings_values_.push_back(shared);
    //modify by 
    return d_->shared_strings_values_.size() - 1;
    //return sz;
}

in size don't match we don't return sz isn't

doomlaur commented 1 month ago

@mankaixin Could you please share an XLSX file where I can test your issue?

mankaixin commented 1 month ago

@mankaixin Could you please share an XLSX file where I can test your issue?

test2.xlsx I write some data to sheet PadName in cells c2 through c4

I wanted to write the effect as shown below image

, but the result is as shown below image

doomlaur commented 1 month ago

@mankaixin Thanks for the example. The issue I described above is about reading data, while the issue you described is about writing data. Since your issue is a different one than mine, so please open a new GitHub issue. Furthermore, you could also create a Pull Request containing your fix.

I quickly looked at the code and I agree - if allow_duplicates is true and duplicates are added, the unordered_map shared_strings_ids_ and the vector shared_strings_values_ will get out of sync and the returned index will be wrong, so your proposed fix should fix the issue - although I must admit that I did not check whether there are any other similar issues related to duplicate strings. But this is something we can discuss in a new GitHub issue.