PydPiper / pylightxl

A light weight, zero dependency, minimal functionality excel read/writer python library
https://pylightxl.readthedocs.io
MIT License
300 stars 47 forks source link

cell formatting #2

Closed mszbot closed 4 years ago

mszbot commented 4 years ago

Does this library read cell formats? I.e. "percentage", "text" etc?

PydPiper commented 4 years ago

Hi mszbot, thanks for taking a look at pylightxl. Currently this library only reads cell values. For example if your cell has: 10% in it or =10% both of these cells will evaluate to 0.1 within pylightxl and it will be type converted into float for you. Part of keeping this library small/simple and to the point is to not go beyond the cell data itself, therefore no cell colors, sizes, conditional formats are scraped by pylightxl. Hope this helps. Let me know if you need help getting started

mszbot commented 4 years ago

So it reads the underlying values only which are all floats, gotcha.

What's the difference between this lib and xlrd?

PydPiper commented 4 years ago

pylightxl read all cellvalues: integers, floats, text, and formulas (however as mentioned above formulas are evaluated, and only their value is stored by pylightxl).

xlrd is a great tool it's aim is similar with zero dependencies, however xlrd currently has no active developers and I don't know where the project is going to go in the future. pylightxl is very light weight and will be tailored toward user needs for quick and easy cell data read/write

mszbot commented 4 years ago

HI @PydPiper

I was watching this gif and I noted that when you read a cell the value returned using this library is as it appears in the cell. I though this library would return the underlying float? I was expecting to see 0.10 not 10

image

PydPiper commented 4 years ago
Hi mszbot, Thank you for your interest in pylightxl. The cell values are all of type INT, therefore pylightxl returned INT as well. Give it a go by reading in a cell of type INT and of type FLOAT yourself and hope everything works out as planned 😊 Let me know if you see something other than expected there Happy pythoning   From: mszbotSent: Wednesday, May 6, 2020 4:22 AMTo: PydPiper/pylightxlCc: PydPiper; MentionSubject: Re: [PydPiper/pylightxl] cell formatting (#2) HI @PydPiperI was watching this gif and I noted that when you read a cell the value returned using this library is as it appears in the cell. I though this library would return the underlying float? I was expecting to see 0.10 not 10—You are receiving this because you were mentioned.Reply to this email directly, view it on GitHub, or unsubscribe. 
jkyeung commented 4 years ago

I was watching this gif and I noted that when you read a cell the value returned using this library is as it appears in the cell. I though this library would return the underlying float? I was expecting to see 0.10 not 10

@mszbot - Well, I'm going to guess you meant 10.0 (rather than 0.10).

In the old BIFF files (binary .xls files), it's true that all numeric values were stored as floats (specifically, double-precision). However, with the advent of .xlsx, all the data is stored textually in XML documents. So a cell which contains 10.1 would show up in the XML with a value like this: <v>10.1</v> while a value of 10 (or 10.0) would look like this: <v>10</v>.

While xlrd (and probably Excel itself) converts all numeric values into floats, pylightxl converts the XML value into int if it can.

mszbot commented 4 years ago

I'm a xlrd user hence the 0.10 assumption.

All makes sense, I think I need to install this lib and test it myself now.

Thanks for your help.

jkyeung commented 4 years ago

I'm a xlrd user hence the 0.10 assumption.

But xlrd would give you 10.0. I do not understand where you get 0.10 from. I understand that different locales have different conventions for decimal points and thousands separators, and maybe left-to-right versus right-to-left plays into this somehow; so to be clear, we are talking about the number ten, as in, the number of fingers a typical human has, including thumbs. That would appear in the Python REPL as the integer 10 or the floating point number 10.0. If the cell had been 10%, then both xlrd and pylightxl would report it as 0.1, because that is what would have been found in the file.