PydPiper / pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library
https://pylightxl.readthedocs.io
MIT License
290 stars 47 forks source link

I can not read file #63

Open Aitor13 opened 2 years ago

Aitor13 commented 2 years ago

Pylightxl Version: 1.58 Python Version: 3.9.2

Summary of Bug/Feature: I can not pass the path of file to read with "xl.readxl(path)"

Traceback: TypeError Traceback (most recent call last)

in 3 path = pathlib.Path(r'C:\Users\Downloads\Report .xlsx') 4 print(path) ----> 5 db = xl.readxl(spath) 6 print(db) c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\site-packages\pylightxl\pylightxl.py in readxl(fn, ws) 117 # {'ws': ws1: {'ws': str, 'rId': str, 'order': str, 'fn_ws': str}, ... 118 # 'nr': {nr1: {'nr': str, 'ws': str, 'address': str}, ...} --> 119 wb_rels = readxl_get_workbook(fn) 120 121 for nr_dict in wb_rels['nr'].values(): c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\site-packages\pylightxl\pylightxl.py in readxl_get_workbook(fn) 232 # the output of openpyxl can sometimes not write the schema for "r" relationship 233 rId = tag_sheet.get('id') --> 234 sheetId = int(re.sub('[^0-9]', '', rId)) 235 wbrels = readxl_get_workbookxmlrels(fn) 236 rv['ws'][name] = {'ws': name, 'rId': rId, 'order': sheetId, 'fn_ws': wbrels[rId]} c:\users\aitor.fernandez\appdata\local\programs\python\python39\lib\re.py in sub(pattern, repl, string, count, flags) 208 a callable, it's passed the Match object and must return 209 a replacement string to be used.""" --> 210 return _compile(pattern, flags).sub(repl, string, count) 211 212 def subn(pattern, repl, string, count=0, flags=0): TypeError: expected string or bytes-like object **Suggestion for fix**:
steven-tjong commented 2 years ago

I also have a similar problem now. Did you happen to edit the file with openpyxl before?

PydPiper commented 2 years ago

Hey everyone, sorry for the delay on addressing this its finals week. This should be resolved next week. Thank you for your patience

Aitor13 commented 2 years ago

I also have a similar problem now. Did you happen to edit the file with openpyxl before?

Yes, with openpyxl no problem

Aitor13 commented 2 years ago

Hey everyone, sorry for the delay on addressing this its finals week. This should be resolved next week. Thank you for your patience

Thank you!

steven-tjong commented 2 years ago

Is there a way for us to help btw?

PydPiper commented 2 years ago

Is there a way for us to help btw?

yes, of course! If you guys figure out the bug and send in a merge request, those are usually easier and faster for me to incorporate

PydPiper commented 2 years ago

@Aitor13 I was not able to replicate the error. I altered an existing excel file with openpyxl as it appears that is what you have done and added a named range since the trace appears to stop there but the file was able to be read back in to pylightxl. Can you send a simple excel file where this error exists?

steven-tjong commented 2 years ago

I added a formula that is only available for microsoft excel 365 to encounter that problem btw, formulas like this:

worksheet['C5'].value = '=UNIQUE(B5#)'

which gives me the following error if opened with pylightxl:

Traceback (most recent call last):
  File "/Users/aa/Documents/excel_debug.py", line 13, in main
    db = xl.readxl(f)
  File "/Users/aa/venv/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 119, in readxl
    wb_rels = readxl_get_workbook(fn)
  File "/Users/aa/venv/lib/python3.8/site-packages/pylightxl/pylightxl.py", line 234, in readxl_get_workbook
    sheetId = int(re.sub('[^0-9]', '', rId))
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/re.py", line 210, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

Process finished with exit code 1

More info:

  1. The syntax # only available in excel 365 and openpyxl works with excel 2010
  2. But more importantly, normal excel will also fail to open the file (having formula from 365 but saved as 2010): Normal excel will delete the cell 'C5' instead

Maybe the error is then out of the scope of this library?

PydPiper commented 2 years ago

@steven-tjong hey i took a deeper look at the new office 365 "spill array" which is what the # is in reference to and without getting 365 myself i wouldnt be able to dig further into it. i am not sure what the structure of the xml is for such reference. However we can see if i can dig into it if you send a office 365 file with a simple unique(B5#) used in it.

steven-tjong commented 2 years ago

Really sorry for the late reply, I am also somehow unable to have it working on two of my laptops anymore...

image

PydPiper commented 2 years ago

@steven-tjong right, thank you for posting. Yes currently pylightxl does not support the new office 365 features such as the one you showed in the figure (spill arrays). I will keep this issue open until pylightxl addressed the new 365 features, but for the mean time i will add a disclaimer/limitation to pylightxl that it does not support new 365 features like spill arrays.