tafia / calamine

A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
MIT License

Xls error: Unrecognized error: 0x3 #343

Closed: 7Towers closed this issue 10 months ago

7Towers commented 11 months ago

When attempting to open a workbook, I'm getting the title error.

Xls error: Unrecognized error: 0x3

Edit: Can someone give me some tips for debugging this? The sheet contains financial details and customer names, so it's not easily shared. I may be able to create some dummy data, but I'd like to start by troubleshooting on my end, with the help of you pros. I was able to strip the sheet of all data, and I'm still having trouble opening it with this library. A sample sheet can be found here.

The sheet is saved as an .xls. If I open the sheet in Excel and re-save as xlsx, then attempt to open the workbook in calamine, I still receive the same error.

Nothing fancy on the opening code:

    let mut workbook = match open_workbook_auto(path) {
        Ok(wb) => wb,
        Err(e) => {
            eprintln!("Error opening workbook: {}", e);
            return Err(e);
        }
    };

That code yields

Error opening workbook: Xls error: Unrecognized error: 0x3

When I attempt to open these sheets using umya spreadsheets (another Rust spreadsheet lib), I get a slightly more descriptive error:

ZipError: invalid Zip archive: Could not find central directory end
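An invalid-ZIP error from an xlsx reader usually means the bytes on disk are not a ZIP archive at all. A quick, library-independent check is to look at the first bytes of the file: ZIP-based formats (.xlsx, .ods) normally begin with `PK\x03\x04`, while legacy .xls files use the OLE2 compound file signature `D0 CF 11 E0 A1 B1 1A E1`. Below is a minimal sketch using only the standard library. The `Container`, `sniff_bytes`, and `sniff_file` names are made up for illustration and are not part of calamine or umya.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Container formats that can be told apart by their leading magic bytes.
/// Hypothetical helper type, not part of any spreadsheet crate.
#[derive(Debug, PartialEq, Eq)]
pub enum Container {
    /// OLE2 / Compound File Binary, the container used by legacy .xls files.
    Cfb,
    /// ZIP archive, the container used by .xlsx and .ods files.
    Zip,
    /// Anything else, including input shorter than a signature.
    Unknown,
}

/// Classify a buffer holding the first bytes of a file.
pub fn sniff_bytes(head: &[u8]) -> Container {
    const CFB_MAGIC: [u8; 8] = [0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1];
    const ZIP_MAGIC: [u8; 4] = [0x50, 0x4B, 0x03, 0x04]; // "PK\x03\x04"

    if head.starts_with(&CFB_MAGIC) {
        Container::Cfb
    } else if head.starts_with(&ZIP_MAGIC) {
        Container::Zip
    } else {
        Container::Unknown
    }
}

/// Read at most the first 8 bytes of `path` and classify them.
pub fn sniff_file<P: AsRef<Path>>(path: P) -> io::Result<Container> {
    let mut head = Vec::with_capacity(8);
    File::open(path)?.take(8).read_to_end(&mut head)?;
    Ok(sniff_bytes(&head))
}
```

If a file named .xlsx comes back as `Cfb`, the extension and the actual container disagree, which would be consistent with a ZIP-based reader failing before it finds a central directory.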

I am able to open the sheet in Excel, and with other frameworks like xlsx js (SheetJS), but that's a bit slower.

There are many refs in this workbook to other workbooks in the same directory, as well as heavy use of macros and other content that I'm not familiar with, yet.

dimastbk commented 11 months ago

Hi!

Error opening workbook: Xls error: Unrecognized error: 0x3

It seems calamine can't handle "0x03 Blank string value." in FormulaValue.

Could you attach a sample in GitHub? I can't download the original from Google Sheets.

7Towers commented 11 months ago

Thanks @dimastbk.

Here's the file. Edit: that file won't trigger the error, sorry.

7Towers commented 11 months ago

Update:

Stepped through the calamine source in a fork. I triggered the error at: https://github.com/tafia/calamine/blob/3a5966f8bae69b368ddcc1c6abe01cb2166ec4c2/src/xls.rs#L1308

I can load my file if I put this hack in:

fn parse_formula_value(r: &[u8]) -> Result<Option<DataType>, XlsError> {
    match r {
        &[0x00, .., 0xFF, 0xFF] => Ok(None), // String, value should be in next record
        &[0x01, _, b, .., 0xFF, 0xFF] => Ok(Some(DataType::Bool(b != 0))),
        &[0x02, _, e, .., 0xFF, 0xFF] => parse_err(e).map(Some),
        &[0x3, _, .., 0xFF, 0xFF] => {
            println!("Avoid 0x03 formula value");
            Ok(None)
        }, // empty
        &[e, .., 0xFF, 0xFF] => Err(XlsError::Unrecognized {
            typ: "error",
            val: e,
        }),
        _ => Ok(Some(DataType::Float(read_f64(r)))),
    }
}

Specifically, the

    &[0x3, _, .., 0xFF, 0xFF] => {
        println!("Avoid 0x03 formula value");
        Ok(None)
    }, // empty

I don't know much about xls parsing, but the MS spec you referenced says that 0x03 is a blank string value and must be ignored.

[Image: screenshot of the referenced spec]

If that's the case here, perhaps safely ignoring it is logical? I'll let you confirm. If it's as simple as that, I'm happy to do a PR. Otherwise, I'll let you advise on a smarter way to handle it.
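For anyone following along without the calamine source at hand, here is a self-contained sketch of the decoding rule discussed above, with the 0x03 case handled quietly instead of printing. It assumes the value arrives as exactly 8 bytes: when the last two bytes are `0xFF 0xFF`, the first byte is a type tag and the third byte carries the boolean or error code; otherwise all 8 bytes are a little-endian `f64`. `CellValue` and `FormulaValueError` are hypothetical stand-ins for calamine's `DataType` and `XlsError`, and the error code is left raw rather than decoded through `parse_err`. This is not the actual patch.

```rust
/// Stand-in for calamine's cell value type; hypothetical, for illustration only.
#[derive(Debug, PartialEq)]
pub enum CellValue {
    Float(f64),
    Bool(bool),
    /// Raw error code byte, left undecoded in this sketch.
    Error(u8),
}

/// Stand-in error type; hypothetical, for illustration only.
#[derive(Debug, PartialEq)]
pub enum FormulaValueError {
    /// Fewer than 8 bytes were supplied; carries the actual length.
    TooShort(usize),
    /// The type tag was not one of 0x00, 0x01, 0x02, 0x03.
    UnrecognizedType(u8),
}

/// Decode the first 8 bytes of `r` as a formula result.
///
/// `Ok(None)` means "no value in this record": either a string whose
/// contents follow in a later record (0x00) or a blank string (0x03).
pub fn parse_formula_value(r: &[u8]) -> Result<Option<CellValue>, FormulaValueError> {
    if r.len() < 8 {
        return Err(FormulaValueError::TooShort(r.len()));
    }
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&r[..8]);

    match bytes {
        // String result: the text is expected in a following record.
        [0x00, .., 0xFF, 0xFF] => Ok(None),
        // Boolean result: third byte is 0 or 1.
        [0x01, _, b, .., 0xFF, 0xFF] => Ok(Some(CellValue::Bool(b != 0))),
        // Error result: third byte is the error code.
        [0x02, _, e, .., 0xFF, 0xFF] => Ok(Some(CellValue::Error(e))),
        // Blank string result: nothing to read, so report no value.
        [0x03, .., 0xFF, 0xFF] => Ok(None),
        // Marker bytes present but the type tag is unknown.
        [t, .., 0xFF, 0xFF] => Err(FormulaValueError::UnrecognizedType(t)),
        // No marker: the 8 bytes are a little-endian IEEE 754 double.
        _ => Ok(Some(CellValue::Float(f64::from_le_bytes(bytes)))),
    }
}
```

Returning `Ok(None)` for 0x03 mirrors the existing 0x00 arm, so callers that already skip `None` would need no change. Whether a blank string should instead surface as an empty string value is a design choice left to the maintainers.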

tafia commented 11 months ago

Seems like a fix to me. Please open a PR. Thanks

jqnatividad commented 11 months ago

@7Towers I can't reproduce this error with master using your sample file.

Is this still an issue?

7Towers commented 11 months ago

@jqnatividad no, that file will not repro the issue. I edited that comment some days ago to clarify. It is an issue with the files my customer has given me, but I can't share their spreadsheets. I'll make a PR to merge the fix I've been using.

7Towers commented 11 months ago

Made a PR for this: #348