MathNya / umya-spreadsheet

A pure rust library for reading and writing spreadsheet files
MIT License
239 stars 41 forks source link

Reading Cells W/ Shared Strings #128

Closed IronicGambler closed 9 months ago

IronicGambler commented 10 months ago

Hi, this is a fantastic crate and I want to say thank you to all of the contributors.

I'm having one small issue when it comes to reading cells with a formula that references a shared string.

To reproduce the bug, i created this workbook: wb_with_shared_strings.xlsx its just a .xlsx file with two sheets. The first sheet is called "Sheet To Read From", the cells in range A1:A6 contain a formula which reference the second sheet ("Cells Referenced By Formulas"). The values within range A1:A6 of the "Cells Referenced By Formulas" sheet are ' Column A '- + -- 1 | Column Header 2 | 11 3 | 22 4 | ABCdef 5 | ABCdef 6 | ABCdef

and in the first sheet, (the "Sheet To Read From"), below are the formulas which reference the above range: ' Column A '- + -- 1 | Column Header 2 | ='Cells Referenced By Formulas'!A2 3 | ='Cells Referenced By Formulas'!A3 4 | ='Cells Referenced By Formulas'!A4 5 | ='Cells Referenced By Formulas'!A5 6 | ='Cells Referenced By Formulas'!A6

When I try to read this file, the numeric values work no problem, but the strings can't be read. I believe this is due to the cell contents being a shared string, I went through the source code, and was not able to figure the exact source of the problem (my apologies). But, if you use the .xlsx file i linked above with this code you will see the problem:

#[cfg(test)]
mod tests {
    use umya_spreadsheet;
    use umya_spreadsheet::Spreadsheet;

    const SHEET_NAME: &str = "Sheet To Read From";
    const A_RANDOM_STRING: &str = "ABCdef";

    #[test]
    fn read_wb_with_shared_strings() {
        let workbook: Spreadsheet = umya_spreadsheet::reader::xlsx::read("./wb_with_shared_strings.xlsx").unwrap();
        let sheet_to_read = workbook.get_sheet_by_name(SHEET_NAME).unwrap();

        // these numeric values work no problem
        // note: i used the `get_formatted_values` here, but same applies with `get_value` type fns
        let cell_a2 = sheet_to_read.get_formatted_value("A2");
        let cell_a3 = sheet_to_read.get_formatted_value("A3");
        assert_eq!(cell_a2, "11"); 
        assert_eq!(cell_a3, "22");

        // but the cells with shared strings are read as null values
        let cell_a4 = sheet_to_read.get_cell("A4").unwrap().get_raw_value();
        // to show this, this should fail, because cellA4's value should be "ABCdef"
        assert_eq!(cell_a4, &umya_spreadsheet::CellRawValue::Null);  // <-- this test does pass, but shouldn't

        // both cell A4 and A5's value are "ABCdef", so this test should pass, but instead fails
        let cell_a5 = sheet_to_read.get_formatted_value("A5");
        assert_eq!(cell_a5, "ABCdef"); // <-- this test will fail
    }
}

It is worth noting, the below test does work (in the second test I create a workbook with 1 sheet, save it, and read it)

    #[test]
    fn from_a_new_file_in_same_sheet() {

        // create a new workbook and populate it
        let mut workbook = umya_spreadsheet::new_file();
        let _ = workbook.new_sheet(SHEET_NAME);
        for row_num in 1..6 {
            let cell_addr = format!("A{row_num}");

            workbook.get_sheet_by_name_mut(SHEET_NAME).unwrap().get_cell_mut(cell_addr).set_value(A_RANDOM_STRING);
        }

        // save the file
        let fpath = "./a_new_wb.xlsx";
        let _ = umya_spreadsheet::writer::xlsx::write(&workbook, fpath);

        // reopen the file and read it's contents
        let newly_created_wb = umya_spreadsheet::reader::xlsx::read(fpath).unwrap();
        let sheet_to_read = newly_created_wb.get_sheet_by_name(SHEET_NAME).unwrap();

        let cell_a2 = sheet_to_read.get_formatted_value("A2");
        assert_eq!(cell_a2, A_RANDOM_STRING);
        let cell_a3 = sheet_to_read.get_formatted_value("A3");
        assert_eq!(cell_a3, A_RANDOM_STRING);
    }   

Please let me know if you have any questions, I'd be happy to provide more details!

MathNya commented 10 months ago

@IronicGambler Thank you for your report. We will investigate and make modifications to the issue at hand. If any questions arise, we will contact you. Thank you very much for your cooperation.

IronicGambler commented 10 months ago

Hey @MathNya thank you for getting back to me! I greatly appreciate all of your work on this amazing library (as well as everyone who has contributed)!

I wanted to provide an example that is entirely generated within the library (i.e. i create a wb from rust, save it, and then read it), in order to get the bug to replicate what I had to do was put the shared strings (in this case ABCdef) on one sheet.

Sheet Name: 'SheetWithValues' A
1 123.45
2 987.65
3 ABCdef
4 ABCdef
5 ABCdef

Above is the sheet that contains the hardcoded values. These hardcoded values are referenced on another sheet, as seen below

Sheet Name: 'Read This Sheet' A
1 =SheetWithValues!A1
2 =SheetWithValues!A2
3 =SheetWithValues!A3
4 =SheetWithValues!A4
5 =SheetWithValues!A5

In order to reproduce the exact same bug, what I had to do was run

  1. Create a function to write to a .xlsx file with the two sheets as described above
  2. Run that function (see: wrtite_wb_with_two_sheets) and then open the file up in Excel and save it
  3. Then run the Rust test write_and_read_shared_strings_on_different_sheets

I am not sure why I had to open up the file in excel, save it, and then run the test, but that was needed to get the exact same error that I had in the first post.


#[cfg(test)]
mod tests {
    use umya_spreadsheet;

    // here are some constants used throghout
    pub const name_of_formulas_sheet: &str = "Read This Sheet";
    pub const name_of_values_sheet: &str = "SheetWithValues";
    pub const a_random_string: &str = "ABCdef";
    pub const fpath: &str = "./shared_strings_two_sheets_write_read.xlsx";

    /// NOTE: please run this function, open up the generated .xlsx file in Excel, save it, and then run 
    /// the test at the bottom of this code block
    /// This function is here not to test the writing functionality of the library, instead i wanted to have a
    /// generate the file within Rust so you could see how the file was made
    fn write_wb_with_two_sheets() {

        // create a new workbook and add one sheet to it
        let mut workbook = umya_spreadsheet::new_file();
        let _ = workbook.new_sheet(name_of_values_sheet);
        let _ = workbook.new_sheet(name_of_formulas_sheet);

        // populate A1:A5 of the sheet with values
        let cell_a1_addr = "A1";
        workbook.get_sheet_by_name_mut(name_of_values_sheet).unwrap().get_cell_mut(cell_a1_addr).set_value_number(123.45);
        let cell_a2_addr = "A2";
        workbook.get_sheet_by_name_mut(name_of_values_sheet).unwrap().get_cell_mut(cell_a2_addr).set_value_number(987.65);
        for row_num in 3..6 {
            let cell_addr = format!("A{row_num}");
            workbook.get_sheet_by_name_mut(name_of_values_sheet).unwrap().get_cell_mut(cell_addr).set_value(a_random_string);
        }

        // populate A1:A5 of a different sheet with formulas that reference the sheet with values
        for row_num in 1..6 {
            let cell_addr = format!("A{row_num}");
            let cell_formula = format!("={name_of_values_sheet}!A{row_num}");
            workbook.get_sheet_by_name_mut(name_of_formulas_sheet).unwrap().get_cell_mut(cell_addr).set_formula(cell_formula);
        }

        // save the file
        let _ = umya_spreadsheet::writer::xlsx::write(&workbook, fpath);
    }

    #[test]
    fn write_and_read_shared_strings_on_different_sheets() {

        // reopen the file and read it's contents
        let newly_created_wb = umya_spreadsheet::reader::xlsx::read(fpath).unwrap();
        let sheet_to_read: &umya_spreadsheet::Worksheet = newly_created_wb.get_sheet_by_name(name_of_formulas_sheet).unwrap();

        let cell_a2 = sheet_to_read.get_value("A2");
        println!("Cell A2: {:?}", cell_a2);
        assert_eq!(cell_a2, String::from("987.65"));

        let cell_a3 = sheet_to_read.get_formatted_value("A3");
        println!("Cell A3: {:?}", cell_a3); // <-- should be 'ABCdef' instead reads in "" - this is the error i'm having
        assert_eq!(cell_a3, a_random_string);

        let cell_a4 = sheet_to_read.get_formatted_value("A3");
        println!("Cell A4: {:?}", cell_a4);
        assert_eq!(cell_a4, a_random_string);
    }
}

Running this test with

cargo test -- --nocapture

to print out the results yields

running 1 test
Cell A2: "987.65"
Cell A3: ""
thread 'tests::write_and_read_shared_strings_on_different_sheets' panicked at 'assertion failed: `(left == right)`
  left: `""`,
 right: `"ABCdef"`', src\lib.rs:175:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test tests::write_and_read_shared_strings_on_different_sheets ... FAILED

failures:

failures:
    tests::write_and_read_shared_strings_on_different_sheets

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.03s

NOTE: I had to run the function to generate the .xlsx file (see: wrtite_wb_with_two_sheets), open it up in excel, save it, and then run this test to produce this output.

Worth noting, this test does pass:

#[cfg(test)]
mod tests {
    use umya_spreadsheet;

    const SHEET_NAME: &str = "Sheet To Read From";
    const A_RANDOM_STRING: &str = "ABCdef";
    #[test]
    fn write_and_read_shared_strings_same_sheet() {

        // create a new workbook and add one sheet to it
        let mut workbook = umya_spreadsheet::new_file();
        let _ = workbook.new_sheet(SHEET_NAME);

        /* |   |  A   |   B  |
         * |---|------|------|
         * | 1 |123.45|=A1   |
         * | 2 |987.65|=A2   |
         * | 3 |ABCdef|=A3   |
         * | 4 |ABCdef|=A4   |
         * | 5 |ABCdef|=A5   |
         */

        // populate column A
        let cell_a1_addr = "A1";
        workbook.get_sheet_by_name_mut(SHEET_NAME).unwrap().get_cell_mut(cell_a1_addr).set_value_number(123.45);
        let cell_a2_addr = "A2";
        workbook.get_sheet_by_name_mut(SHEET_NAME).unwrap().get_cell_mut(cell_a2_addr).set_value_number(987.65);
        for row_num in 3..6 {
            let cell_addr = format!("A{row_num}");
            println!("On Cell: {cell_addr}");
            workbook.get_sheet_by_name_mut(SHEET_NAME).unwrap().get_cell_mut(cell_addr).set_value(A_RANDOM_STRING);
        }

        // populate column B
        for row_num in 1..5 {
            let cell_addr = format!("B{row_num}");
            let cell_formula = format!("=A{row_num}");
            workbook.get_sheet_by_name_mut(SHEET_NAME).unwrap().get_cell_mut(cell_addr).set_formula(cell_formula);
        }

        // save the file
        let one_sheet_fpath = "./shared_strings_write_read.xlsx";
        let _ = umya_spreadsheet::writer::xlsx::write(&workbook, fpath);

        // reopen the file and read it's contents
        let newly_created_wb = umya_spreadsheet::reader::xlsx::read(one_sheet_fpath).unwrap();
        let sheet_to_read = newly_created_wb.get_sheet_by_name(SHEET_NAME).unwrap();

        let cell_a2 = sheet_to_read.get_value("A2");
        println!("Cell A2: {:?}", cell_a2);
        assert_eq!(cell_a2, String::from("987.65"));

        let cell_a3 = sheet_to_read.get_formatted_value("A3");
        println!("Cell A3: {:?}", cell_a3);
        assert_eq!(cell_a3, A_RANDOM_STRING);
        let cell_a4 = sheet_to_read.get_formatted_value("A3");
        println!("Cell A4: {:?}", cell_a4);
        assert_eq!(cell_a4, A_RANDOM_STRING);
    }
}

I believe, this because the shared strings that are referenced by the formula are within the same sheet as the formulas themselves.

MathNya commented 10 months ago

@IronicGambler Thank you for the additional report. I think I have a rough idea of the cause. I will check it out this weekend!

MathNya commented 10 months ago

@IronicGambler The program has been modified. Please check it.

IronicGambler commented 10 months ago

Hi,

Thank you, i went over it and the library worked- thank you for all of your help, you are the best!