tafia / calamine

A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
MIT License

Ignoring Cell Formatting #424

Open TomPridham opened 2 months ago

TomPridham commented 2 months ago

is there a way to skip processing cells based on their excel-defined format (e.g. Number)? we have users who upload excel files that are displayed one way in excel, but the values get converted to floats when they are processed through calamine. this screenshot shows the issue. the csv was generated by saving the xlsx file as a csv. the last two columns have the Number format applied and get converted to floats before getting converted back into strings.

[Screenshot 2024-04-15 at 3 45 21 PM]

ideally, those values would get preserved as strings to avoid things like differences in float precision adding a bunch of extra digits, or trailing or leading 0s getting truncated. is there a way to do that currently? i made a small repo showing the problem using the deserialize function to specify that the output should always be String:

https://github.com/TomPridham/calamine-float
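To make the symptom concrete, here is a minimal sketch using only the Rust standard library. It does not call calamine; `round_trip_through_f64` is a hypothetical helper that assumes the displayed text is parsed into an `f64` and then printed back, which is roughly the conversion described above.

```rust
/// Hypothetical helper: parses the text a user sees into an f64 and
/// converts it straight back to a String, as a stand-in for the
/// float -> string conversion described in the issue.
fn round_trip_through_f64(displayed: &str) -> Option<String> {
    // Any formatting-only detail (padding zeros) is lost at this step.
    let value: f64 = displayed.trim().parse().ok()?;
    // Display for f64 prints the shortest text that parses back to `value`.
    Some(value.to_string())
}

/// Shows how float arithmetic can add digits that were never displayed.
fn sum_as_string(a: f64, b: f64) -> String {
    (a + b).to_string()
}
```

With this sketch, `round_trip_through_f64("1.50")` yields `"1.5"` and `round_trip_through_f64("007")` yields `"7"`, while `sum_as_string(0.1, 0.2)` yields `"0.30000000000000004"`. The zeros and the short form only exist in the display layer, so they cannot be recovered from the float alone.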

tafia commented 2 months ago

The value stored in the xlsx file really is:

<c r="E2" s="2"><v>7.0000000000000009</v></c>

Formatting is not properly supported in calamine unfortunately.
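A small standard-library sketch of the gap between the stored value and what Excel displays. The helper names are hypothetical, and the two-decimal precision is an assumption for illustration: the thread does not say which number format the style `s="2"` resolves to.

```rust
/// Hypothetical helper: parses the raw text of a <v> element as f64.
fn parse_raw_cell(raw: &str) -> Option<f64> {
    raw.trim().parse().ok()
}

/// Hypothetical helper: renders a value with a fixed number of decimals,
/// approximating what a "Number" display format does.
/// Assumption: the caller already knows the decimal count; calamine does
/// not supply it in this sketch.
fn display_with_decimals(value: f64, decimals: usize) -> String {
    format!("{:.*}", decimals, value)
}
```

Parsing `"7.0000000000000009"` gives a float slightly above 7, so printing it directly shows the extra digits, whereas `display_with_decimals(value, 2)` gives `"7.00"`. That matches what the thread describes: the file stores the raw number, and the tidy value exists only once formatting is applied.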

TomPridham commented 2 months ago

oh hmm. i didn't realize that excel would export the CSV based on what was displayed rather than what the actual value was. i guess i shouldn't be surprised excel does something unexpected. thanks for looking at this.

is formatting just something that hasn't been implemented yet or is there an argument against supporting it? if it hasn't been implemented, is this where you would recommend looking for prior art? https://git.sheetjs.com/sheetjs/sheetjs/src/branch/master/packages/ssf/ssf.js
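As a rough idea of what format support involves, here is a deliberately tiny sketch that handles only the codes `0` and `0.00`-style patterns. `apply_simple_format` is a hypothetical function, not part of calamine or SSF, and a real implementation would also need thousands separators, dates, percentages, sections, and more.

```rust
/// Hypothetical helper: applies a very small subset of Excel-style number
/// format codes. Supported: "0" (no decimals) and "0.0", "0.00", ...
/// (fixed decimals). Anything else returns None.
fn apply_simple_format(value: f64, code: &str) -> Option<String> {
    let decimals = match code.split_once('.') {
        // "0": integer display, no fractional digits.
        None if code == "0" => 0,
        // "0.00…": one fractional digit per '0' after the dot.
        Some(("0", frac)) if !frac.is_empty() && frac.chars().all(|c| c == '0') => frac.len(),
        // Every other code is out of scope for this sketch.
        _ => return None,
    };
    // Note: Rust's rounding of ties may differ from Excel's.
    Some(format!("{:.*}", decimals, value))
}
```

For example, `apply_simple_format(1.5, "0.00")` returns `Some("1.50")`, while an unsupported code such as `"#,##0"` returns `None` so the caller can fall back to the raw value.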

tafia commented 2 months ago

This hasn't been implemented yet because I don't have the bandwidth for that. It is highly requested though.