PydPiper / pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library
https://pylightxl.readthedocs.io
MIT License
290 stars 47 forks source link

OverflowError when parsing date #91

Open cc-jdurbin opened 1 year ago

cc-jdurbin commented 1 year ago

Pylightxl Version: 1.61 Python Version: 3.10.9

Summary of Bug/Feature:

readxl_scrape fails on line 536 when cell_val is 8445481.

Traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in rxl
  File "[path_to_pylightxl]\pylightxl\pylightxl\pylightxl.py", line 152, in readxl
    data = readxl_scrape(fn, fn_ws, sharedString, styles, comments)
  File "[path_to_pylightxl]\pylightxl\pylightxl\pylightxl.py", line 536, in readxl_scrape
    dt = EXCEL_STARTDATE + timedelta(cell_val)
OverflowError: date value out of range

Suggestion for fix:

An option to let readxl_scrape quietly fail and insert a blank value would work for my specific use case. Alternatively, an option to not parse date values at all would work. Developers can take care of fancier date parsing after the call to readxl.

cc-jdurbin commented 1 year ago

Update: I was able to resolve this by slightly modifying readxl_scrape:

                if cell_val > 2958465:
                    dt = EXCEL_STARTDATE + timedelta(2958465)
                else:
                    dt = EXCEL_STARTDATE + timedelta(cell_val)

Where 2958465 is the maximum value that doesn't cause an OverflowError. I don't know enough to understand if this is a robust solution, but it works for my use case.