sestegra / spreadsheet_decoder

Spreadsheet Decoder is a library for decoding spreadsheets for ODS and XLSX files.
MIT License

Cell value contains phonetic information. #43

Closed okaxaki closed 1 year ago

okaxaki commented 1 year ago

I'm afraid this may be an issue only for East Asian languages, but SpreadsheetDecoder seems to merge phonetic information (the <rPh> elements in xl/sharedStrings.xml) into the cell value. Phonetic information is an auxiliary hint, so I think it should not be observable as part of the cell value.

Currently, XlsDecoder._parseSharedString is:

  void _parseSharedString(XmlElement node) {
    var list = [];
    node.findAllElements('t').forEach((child) {
      list.add(_parseValue(child));
    });
    _sharedStrings.add(list.join(''));
  }

Shouldn't it be the following?

  void _parseSharedString(XmlElement node) {
    var list = [];
    node.findAllElements('t').forEach((child) {
      // Skip <t> elements nested in a phonetic run (<rPh>) so that
      // readings are not appended to the cell value.
      if (child.parentElement.name.local != 'rPh') {
        list.add(_parseValue(child));
      }
    });
    _sharedStrings.add(list.join(''));
  }
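
To make the effect concrete, here is a minimal, self-contained sketch of the same filter using package:xml directly. It is not the library's actual code: `sharedStringText` is a hypothetical helper, `innerText` stands in for the library's private `_parseValue`, the sample `<si>` item is made up for illustration, and the `?.` assumes a null-safe version of package:xml.

```dart
import 'package:xml/xml.dart';

/// Hypothetical helper: joins the text of every <t> under a shared-string
/// item (<si>), skipping any <t> whose direct parent is a phonetic run (<rPh>).
String sharedStringText(XmlElement si) {
  return si
      .findAllElements('t')
      // parentElement is nullable in null-safe package:xml, hence `?.`.
      .where((t) => t.parentElement?.name.local != 'rPh')
      // innerText is a stand-in for the library's _parseValue.
      .map((t) => t.innerText)
      .join();
}

void main() {
  // Made-up sample: one shared string with a phonetic reading attached.
  const xml = '''
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <si>
    <t>東京</t>
    <rPh sb="0" eb="2"><t>トウキョウ</t></rPh>
    <phoneticPr fontId="1"/>
  </si>
</sst>''';

  final doc = XmlDocument.parse(xml);
  final si = doc.findAllElements('si').first;
  print(sharedStringText(si)); // 東京
}
```

Without the `where` filter, the same input would yield the base text followed by the reading (東京トウキョウ), which is the behaviour reported above.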

I have attached an example Excel file with phonetic information: with_phonetic.xlsx.zip

sestegra commented 1 year ago

Thank you for sharing. I'm busy with other projects right now. Feel free to create a PR with related unit tests.

okaxaki commented 1 year ago

Thanks. I have created PR #44.

sestegra commented 1 year ago

Fixed in the 2.2.0 release.