pythonicrubyist / creek

Ruby library for parsing large Excel files.
http://rubygems.org/gems/creek
MIT License
386 stars 109 forks

Data corruption while reading formula cells. #116

Open dannypurcell opened 1 year ago

dannypurcell commented 1 year ago

Previously mentioned in https://github.com/pythonicrubyist/creek/issues/89

Although Creek is not intended to do any calculations or handling of formulas in cells, it is unfortunately corrupting the data in that type of cell as it's being read.

Cells with type Formula and content such as =HYPERLINK('https://some/link/or/other') get turned into number types with 0 as the content. This makes it unusable even just for copying or merging files.

Expected behavior is just that the type and content of the cells are not changed on reading. Even if that means the data for formula cells has to be represented as binary data blobs in Ruby, that would be preferable to having the data changed to 0.

Example: Given an input file with content

number,name,link
1,test item,=HYPERLINK('https://some/link/or/other')

and reader code

workbook = Creek::Book.new(test_item_list_path)
sheet = workbook.sheets[0]
sheet.rows.each do |row|
  # do something with the data
end

On the second iteration, the value for row should be

{"A2"=>"1", "B2"=>"test item", "C2"=>"=HYPERLINK('https://some/link/or/other')"}

but actually is

{"A2"=>"1", "B2"=>"test item", "C2"=>"0"}

DmitriyFirsov commented 1 year ago

I think my suggestion in #118 might help with this problem. You can make your own value converter for your case.
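
For illustration only, here is a rough sketch of what such a converter could do. It does not use Creek's API and is not the #118 proposal. It relies on how SpreadsheetML normally stores a formula cell: the formula text sits in an `<f>` element and the cached result in a `<v>` element, which is presumably where the `0` comes from. The sample XML, `cell_content` and `read_row` are all hypothetical names made up for this example, the namespace declaration is left out for brevity, and the formula text simply mirrors the example from the report.

```ruby
require 'rexml/document'

# Hypothetical sheet XML modelled on the example in this issue.
# <f> holds the formula text, <v> holds the cached result.
SHEET_XML = <<~XML
  <worksheet>
    <sheetData>
      <row r="2">
        <c r="A2"><v>1</v></c>
        <c r="B2" t="inlineStr"><is><t>test item</t></is></c>
        <c r="C2"><f>HYPERLINK('https://some/link/or/other')</f><v>0</v></c>
      </row>
    </sheetData>
  </worksheet>
XML

# Return the first direct child element with the given name, or nil.
def child_named(element, name)
  element.elements.find { |e| e.name == name }
end

# Hypothetical converter (not part of Creek): prefer the formula text,
# prefixed with "=", and fall back to the stored value otherwise.
def cell_content(cell)
  formula = child_named(cell, 'f')
  return "=#{formula.text}" if formula && formula.text

  value = child_named(cell, 'v')
  return value.text if value

  # Inline strings are stored as <is><t>...</t></is>.
  inline = child_named(cell, 'is')
  text = inline && child_named(inline, 't')
  text ? text.text : nil
end

# Build a hash of cell reference => content for every <c> in the XML.
def read_row(xml)
  doc = REXML::Document.new(xml)
  row = {}
  doc.root.each_recursive do |element|
    row[element.attributes['r']] = cell_content(element) if element.name == 'c'
  end
  row
end
```

With this sketch, `read_row(SHEET_XML)["C2"]` gives back the formula string rather than the cached `0`. A real workbook would usually also need shared-string lookups (`t="s"`), which are skipped here.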