roo-rb / roo

Roo provides an interface to spreadsheets of several sorts.
MIT License

Extra content (<html> tag) is added to a cell value that looks like an HTML tag #514

Open cenxky opened 4 years ago

cenxky commented 4 years ago

I found that the roo gem always adds extra content to a cell value when the cells are read in a loop.

Steps to reproduce

  1. Create an xlsx file and set the value of cell A1 to <b>Hello Roo</b>
  2. Parse the file with Roo::Excelx.new(file_path, packed: false, file_warning: :ignore) and puts the cell values inside an each block
  3. The output value is <html><b>Hello Roo</b></html> rather than the original text <b>Hello Roo</b>
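The steps above can be sketched roughly as follows. This is only an illustration, not the reporter's actual script: the helper name cell_values is hypothetical, and the spreadsheet path is assumed to be passed as the first command-line argument. The Roo-specific part runs only when a path is given, so the helper can also be tried with a plain array of rows.

```ruby
# Reproduction sketch. Assumes the roo gem is installed and that ARGV[0]
# points at an .xlsx file whose A1 cell holds the literal text
# <b>Hello Roo</b>.

# Hypothetical helper: collects every cell value as a string.
# `sheet` only needs to respond to `each` and yield rows (arrays of cells),
# which is how Roo::Excelx is iterated in the steps above.
def cell_values(sheet)
  values = []
  sheet.each do |row|
    row.each { |cell| values << cell.to_s }
  end
  values
end

if ARGV[0]
  require "roo"

  # Same constructor call as in step 2.
  xlsx = Roo::Excelx.new(ARGV[0], packed: false, file_warning: :ignore)
  cell_values(xlsx).each { |value| puts value }
end
```

On an affected version, the printed value for A1 would be expected to carry the <html> wrapper described in step 3.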

Issue

This looks odd. So far I have only seen it add <html>, but I am not sure whether it also adds anything else that makes the value read differ from the original value. Please have a look, and if you need any help, I will be happy to provide it. Thanks!

System configuration

Roo version: 2.8.2

Ruby version: ruby 2.5.3p105 (2018-10-18 revision 65156) [x86_64-darwin17]

Kiwi-x-Kiwi commented 4 years ago

We're experiencing a similar issue with .xlsx files: for cells that have bold formatting, we get the following output: <html><b> </b>[cell text]</html>.

In the above case, the cell text is not bold but the formatting on the cell itself is bold. There don't seem to be any problems when all the text in the cell is bold.

This problem only started occurring after we upgraded from Roo 1.11.1.

Roo version: 2.8.0

Ruby version: ruby 2.5.5

mario-amazing commented 1 year ago

You can fix it with Roo::Excelx.new(path, disable_html_wrapper: true). See https://github.com/roo-rb/roo/pull/392
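As a rough sketch of that suggestion, the snippet below opens the workbook with disable_html_wrapper: true and then checks whether any string cell still looks wrapped. The helper name wrapped_cells and the use of ARGV[0] for the path are assumptions made for illustration. The check is a simple pattern match, so it would also flag a cell whose genuine text happens to start with <html> and end with </html>.

```ruby
# Workaround sketch. Assumes the roo gem is installed, in a version that
# supports the disable_html_wrapper option, and that ARGV[0] is the path
# to the .xlsx file.

# Hypothetical helper: returns the string cells that still look like
# "<html>...</html>". Non-string cells (nil, numbers) never match.
def wrapped_cells(sheet)
  found = []
  sheet.each do |row|
    found.concat(row.grep(/\A<html>.*<\/html>\z/m))
  end
  found
end

if ARGV[0]
  require "roo"

  xlsx = Roo::Excelx.new(ARGV[0], disable_html_wrapper: true)
  leftovers = wrapped_cells(xlsx)
  puts leftovers.empty? ? "no wrapped cells found" : leftovers.inspect
end
```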