PydPiper / pylightxl

A lightweight, zero-dependency, minimal-functionality Excel reader/writer Python library
https://pylightxl.readthedocs.io
MIT License
290 stars 47 forks

Error during date parsing #87

Open ypankovych opened 1 year ago

ypankovych commented 1 year ago

Pylightxl Version: 1.60
Python Version: 3.7

Summary of Bug/Feature: Error parsing dates from .xlsx file

Traceback:

Error: could not convert string to float: '2022-01-01'

Here's the file: timekeeper_template_w_div (23) (2).xlsx

Debug info I've got:

cell_type: d
cell_style: 10
styles[cell_style]: 14
cell_val: '2022-01-01'

Suggestion for fix: Add an extra parameter to disable type inference (in case a developer wants to cast types manually)
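
For context on the debug info above: in the SpreadsheetML format a cell type of d marks a cell whose value is stored as ISO 8601 text, which is why passing '2022-01-01' to float() fails. The following is a minimal sketch of how such a cell could be handled. convert_cell is a hypothetical helper written for illustration, not pylightxl's actual code, and it covers only the d and n types.

```python
from datetime import date, datetime


def convert_cell(cell_type, cell_val):
    """Hypothetical helper (not part of pylightxl): cast a raw cell string."""
    if cell_type == 'd':
        # Type 'd' cells hold ISO 8601 text, so float(cell_val) would raise
        # ValueError. Try a plain date first, then a full date-time.
        try:
            return date.fromisoformat(cell_val)
        except ValueError:
            return datetime.fromisoformat(cell_val)
    if cell_type == 'n':
        # Plain numeric cell.
        return float(cell_val)
    # Anything else is returned untouched in this sketch.
    return cell_val


print(convert_cell('d', '2022-01-01'))  # 2022-01-01
```

Both fromisoformat methods were added in Python 3.7, so this sketch would run on the version reported above.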

ErickMesquita commented 9 months ago

Hey Mr. Pankovych,

I just read and tested your code, thanks for contributing to open source! (Good job cleaning the code, by the way)

I could not make it work with the test data I tried.

Here's the test code I used (on Python 3.8 and 3.9):

import pylightxl as xl
db = xl.readxl(fn='file.xlsx', infer_types=False)
ws_name = db.ws_names[0]
print(db.ws(ws=ws_name).address(address='A1'))
# Out: '0'

Every string is being cast to some apparently random integer with loss of data. Numerical columns, such as the one in your provided xlsx, seem to work though.

I could not fully understand what you mean by:

(in case a developer wants to cast types manually)

How should the user cast types manually? Would they need to specify cell_types by hand?

ypankovych commented 2 months ago

@ErickMesquita that's expected. Normally it runs some extra logic to show the data from the Excel file as a string for you:

    if cell_type == 's':
        # commonString
        cell_val = shared_string[int(cell_val)]

And we don't run that piece if infer_types is set to False. If you don't infer the types, it's all on you to cast it however you want.
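
To illustrate why A1 came back as '0' in the test above: for a cell of type s, the raw value is an index into the workbook's shared-strings table, not the text itself. Below is a rough sketch of what casting by hand could look like. resolve_cell, shared_strings, and raw_cells are hypothetical names with made-up sample data; whether the library exposes the cell types or the shared-strings table to the caller is not stated in this thread.

```python
def resolve_cell(cell_type, cell_val, shared_strings):
    """Hypothetical helper: turn a raw cell value into the text the user sees."""
    if cell_type == 's':
        # Shared-string cell: the raw value is an index into the table.
        return shared_strings[int(cell_val)]
    # Other types are left as raw strings for the caller to cast.
    return cell_val


# Made-up sample data standing in for a parsed workbook.
shared_strings = ['Name', 'Division']
raw_cells = [('s', '0'), ('s', '1'), ('n', '42')]

values = [resolve_cell(t, v, shared_strings) for t, v in raw_cells]
print(values)  # ['Name', 'Division', '42']
```

Without the lookup step, the first two cells would surface as '0' and '1', which matches the "apparently random integer" behaviour reported earlier in the thread.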