PydPiper / pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library
https://pylightxl.readthedocs.io
MIT License
290 stars 47 forks source link

Not reading some cell formulas #84

Open onatyap opened 1 year ago

onatyap commented 1 year ago

Pylightxl Version: 1.61 Python Version: 3.11

Thank you for pylightxl, the performance improvement was huge from previous libraries I used :).

Summary of Bug/Feature: Not reading some cell formulas I have an MWE and traced it to the XML parsing part of readxl_scrape(). Somehow created XML doesn't has the formula for the D2 cell in this Excel, consisting of only =RAND() and + formulas; therefore, the database object created by pylightxl also doesn't have it.

import pylightxl as xl
db = xl.readxl('model2.xlsx')
db.ws(ws='Dashboard').address(address='D2', output='f') # '='

model2.xlsx

Here is the related XML part:

<c r="C2" s="2"><f t="shared" ref="C2:G2" ca="1" si="0">C19+C54</f><v>1.3094467693727863</v></c>
<c r="D2" s="2"><f t="shared" ca="1" si="0"/><v>0.19712046100983793</v></c>

Output from openpyxl:

import openpyxl as openxl
wb = openxl.load_workbook('model2.xlsx')
print(wb['Dashboard']['D2'].value) # =D19+D54

After some more inspection, Excel keeps formulas filled into adjacent cells by dragging only as a range in the cell used initially to fill others. The example above, C2 was used to fill formulas from D to G, hence the ref="C2:G2". Even though it may introduce some overhead and look like an edge case with an easy workaround (retyping formulas), I think it is better to be safe.

PydPiper commented 1 year ago

@onatyap thank you for considering using pylightxl and submitting this issue. There always seems to be a nuance as to how spreadsheets are written! I will take a look at what we can do about this for the next release!