tafia / calamine

A pure Rust Excel/OpenDocument SpreadSheets file reader: rust on metal sheets
MIT License

Reading rich text formatted cells #440

Open FlixCoder opened 4 weeks ago

FlixCoder commented 4 weeks ago

Thank you for the great library! :)

Somewhat related to #424, #404, and #427, but ultimately quite different: I want to read Excel sheets whose cells have formatting applied to only part of the text rather than to the whole cell. So it is not the cell style that interests me, but the rich text value inside the cell.

Example in shared strings (cell value displayed as: "1, 1, 2"):

    <si>
        <r>
            <t xml:space="preserve">1, </t>
        </r>
        <r>
            <rPr>
                <u />
                <sz val="8" />
                <rFont val="Arial" />
                <family val="2" />
            </rPr>
            <t>1</t>
        </r>
        <r>
            <rPr>
                <sz val="8" />
                <rFont val="Arial" />
                <family val="2" />
            </rPr>
            <t>, 2</t>
        </r>
    </si>

I am in the unfortunate situation that the formatting decides the meaning of the content 😅

Prior art / other implementations: https://docs.rs/edit-xlsx/0.4.4/edit_xlsx/struct.RichText.html (source: https://docs.rs/edit-xlsx/0.4.4/src/edit_xlsx/api/cell/rich_text.rs.html#5-7), as well as https://docs.rs/umya-spreadsheet/latest/umya_spreadsheet/structs/struct.RichText.html

Depending on the scope and agreement on the implementation, I might be interested in contributing this soon. I suppose it would need another data type for cell values and a similar RichText implementation? I am not sure exactly how the parsing works or what this looks like in ODS.
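
To make the idea of "another data type" a bit more concrete, here is a minimal sketch of what such a rich text value could look like. None of this exists in calamine today: `RichText`, `Run`, `RunProperties`, and `example_cell` are hypothetical names chosen for illustration, and only the four `<rPr>` children from the example above are modelled. How it would plug into the existing cell value type is left open.

```rust
/// Font properties of a single run, mirroring a few children of `<rPr>`.
/// Hypothetical type, not part of calamine's API.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct RunProperties {
    pub underline: bool,      // `<u />`
    pub size: Option<f64>,    // `<sz val="..." />`
    pub font: Option<String>, // `<rFont val="..." />`
    pub family: Option<u8>,   // `<family val="..." />`
}

/// One `<r>` element: a piece of text plus its optional formatting.
#[derive(Debug, Clone, PartialEq)]
pub struct Run {
    pub text: String,
    /// `None` when the run carries no `<rPr>` element.
    pub props: Option<RunProperties>,
}

/// One `<si>` shared string made of several runs (hypothetical type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct RichText {
    pub runs: Vec<Run>,
}

impl RichText {
    /// Concatenates all runs into the plain string shown in the cell.
    pub fn plain_text(&self) -> String {
        self.runs.iter().map(|r| r.text.as_str()).collect()
    }

    /// Yields the text of underlined runs only.
    pub fn underlined(&self) -> impl Iterator<Item = &str> + '_ {
        self.runs
            .iter()
            .filter(|r| r.props.as_ref().map_or(false, |p| p.underline))
            .map(|r| r.text.as_str())
    }
}

/// Builds the shared string from the XML example above by hand.
pub fn example_cell() -> RichText {
    // Properties shared by the second and third run in the example.
    let arial = RunProperties {
        underline: false,
        size: Some(8.0),
        font: Some("Arial".to_string()),
        family: Some(2),
    };
    RichText {
        runs: vec![
            Run { text: "1, ".to_string(), props: None },
            Run {
                text: "1".to_string(),
                props: Some(RunProperties { underline: true, ..arial.clone() }),
            },
            Run { text: ", 2".to_string(), props: Some(arial) },
        ],
    }
}
```

With a shape like this, `example_cell().plain_text()` gives back the displayed value, while `underlined()` picks out just the run whose formatting carries the extra meaning. Keeping `props` optional preserves the difference between a run without `<rPr>` and a run with default properties.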

tafia commented 4 weeks ago

Thanks for opening the issue and stating it so clearly. I'll be more than happy to review any PR :)