dhatim / fastexcel

Generate and read big Excel files quickly

Fix resolving of implicit number format codes #362

Closed ezand closed 10 months ago

ezand commented 10 months ago

This PR attempts to address issue https://github.com/dhatim/fastexcel/issues/328.

The OpenXML spec defines some implicit number format codes that are not explicitly saved in the file. As a result, the current code cannot resolve the format for certain number format ids: https://learn.microsoft.com/en-us/dotnet/api/documentformat.openxml.spreadsheet.numberingformat?view=openxml-2.8.1

This means a styles.xml file can contain something like this without having numFmt elements at all:

<cellXfs count="4">
    <xf numFmtId="22" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/>
    <xf numFmtId="14" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/>
</cellXfs>

In these cases we now check the implicit ids and are able to resolve the format codes.
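To illustrate the idea, here is a minimal sketch of such a lookup. It is not the code from this PR: the class name `ImplicitNumberFormats`, the `resolve` method, and the `explicitFormats` parameter are hypothetical, and only a subset of the built-in ids from the spec is listed. Explicit `numFmt` entries from styles.xml are consulted first, and the implicit table is used only as a fallback.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of fastexcel: resolves a numFmtId to a format code.
final class ImplicitNumberFormats {

    // Subset of the built-in (implicit) number formats defined by the OpenXML spec.
    private static final Map<String, String> IMPLICIT = new HashMap<>();

    static {
        IMPLICIT.put("0", "General");
        IMPLICIT.put("1", "0");
        IMPLICIT.put("2", "0.00");
        IMPLICIT.put("9", "0%");
        IMPLICIT.put("10", "0.00%");
        IMPLICIT.put("14", "mm-dd-yy");
        IMPLICIT.put("20", "h:mm");
        IMPLICIT.put("21", "h:mm:ss");
        IMPLICIT.put("22", "m/d/yy h:mm");
        IMPLICIT.put("49", "@");
    }

    private ImplicitNumberFormats() {
    }

    // explicitFormats is assumed to hold the numFmtId -> formatCode pairs
    // parsed from the <numFmt> elements of styles.xml (possibly empty).
    static Optional<String> resolve(String numFmtId, Map<String, String> explicitFormats) {
        String explicit = explicitFormats.get(numFmtId);
        if (explicit != null) {
            return Optional.of(explicit);
        }
        // Fall back to the implicit table; empty if the id is unknown.
        return Optional.ofNullable(IMPLICIT.get(numFmtId));
    }
}
```

With the styles.xml fragment above and no `numFmt` elements, `resolve("22", ...)` would yield `m/d/yy h:mm` and `resolve("14", ...)` would yield `mm-dd-yy`.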

I know this library tries to be very selective about what formatting to include for performance reasons, but IMO this is an important piece of information to have when reading an Excel file. The simple map lookup shouldn't add much performance overhead, and it is only performed when withCellFormat=true is specified in ReadingOptions.

I haven't quite familiarised myself with all the coding standards and conventions in this library, so I'm not sure whether the map constant should be moved out to a separate class. Please let me know 😊

ochedru commented 10 months ago

Nice PR! Thank you.