wireservice / agate-sql

agate-sql adds SQL read/write support to agate.
https://agate-sql.readthedocs.io
MIT License
18 stars 15 forks source link

Make changes to type testing in table.py #5

Closed aklaver closed 9 years ago

aklaver commented 9 years ago

Changes to type testing using the python_type attribute of a column to compare to native Python types and select correct agate type. I tried to run the tests, but I am having an as yet unresolved issue with pyasn1.

onyxfish commented 9 years ago

Hey Adrian,

So I'm staring at this and wondering if it's possible I didn't discover python_type in my hours of fiddling with this. It's either that or I tried it this way and ran into an issue. Regardless, if this works, it's a huge improvement! Thanks for taking a run at it!

One issue I think you're going to run into is float conversion. agate.Number intentionally breaks on floats, because they can't always be accurately converted to Decimal. So in that case we'll need to pre-convert them. (I'd certainly be open to modifying this agate behavior if you've got any ideas.)

aklaver commented 9 years ago

On 10/29/2015 04:21 PM, Christopher Groskopf wrote:

Hey Adrian,

So I'm staring at this and wondering if it's possible I didn't discover |python_type| in my hours of fiddling with this. It's either that or I tried it this way and ran into an issue. Regardless, if this works, it's a huge improvement! Thanks for taking a run at it!

One issue I think you're going to run into is |float| conversion. |agate.Number| intentionally breaks on floats https://github.com/onyxfish/agate/blob/master/agate/data_types/number.py#L66, because they can't always be accurately converted to |Decimal|. So in that case we'll need to pre-convert them. (I'd certainly be open to modifying this agate behavior if you've got any ideas.)

How did you deal with it before?

Given :

In [1]: from sqlalchemy.types import *

In [2]: real_type = REAL()

In [3]: real_type.python_type Out[3]: float

In [4]: float_type = FLOAT()

In [5]: float_type.python_type Out[5]: float

Then:

if sql_type in [BIGINT, DECIMAL, FLOAT, INTEGER, NUMERIC, REAL, SMALLINT]: column_types.append(agate.Number())

would have put floats in an agate.Number() field

— Reply to this email directly or view it on GitHub https://github.com/onyxfish/agate-sql/pull/5#issuecomment-152354991.

Adrian Klaver adrian.klaver@aklaver.com

onyxfish commented 9 years ago

That is an excellent question. To be 100% honest I do not know why it works. My best guess is that I simply failed to test thoroughly. (The unit tests for this module are exceptionally thin.)

aklaver commented 9 years ago

On 10/29/2015 05:11 PM, Christopher Groskopf wrote:

That is an excellent question. To be 100% honest I do not know why it works. My best guess is that I simply failed to test thoroughly. (The unit tests for this module are exceptionally thin.)

Part of the issue may be that Sqlite tends to be forgiving when it comes to types:

http://www.sqlite.org/datatype3.html

2.0 Type Affinity

Each column in an SQLite 3 database is assigned one of the following type affinities:

 TEXT
 NUMERIC
 INTEGER
 REAL
 BLOB

A column with NUMERIC affinity may contain values using all five storage classes. When text data is inserted into a NUMERIC column, the storage class of the text is converted to INTEGER or REAL (in order of preference) if such conversion is lossless and reversible. For conversions between TEXT and REAL storage classes, SQLite considers the conversion to be lossless and reversible if the first 15 significant decimal digits of the number are preserved. If the lossless conversion of TEXT to INTEGER or REAL is not possible then the value is stored using the TEXT storage class. No attempt is made to convert NULL or BLOB values.

A string might look like a floating-point literal with a decimal point and/or exponent notation but as long as the value can be expressed as an integer, the NUMERIC affinity will convert it into an integer. Hence, the string '3.0e+5' is stored in a column with NUMERIC affinity as the integer 300000, not as the floating point value 300000.0.

A column that uses INTEGER affinity behaves the same as a column with NUMERIC affinity. The difference between INTEGER and NUMERIC affinity is only evident in a CAST expression.

A column with REAL affinity behaves like a column with NUMERIC affinity except that it forces integer values into floating point representation. (As an internal optimization, small floating point values with no fractional component and stored in columns with REAL affinity are written to disk as integers in order to take up less space and are automatically converted back into floating point as the value is read out. This optimization is completely invisible at the SQL level and can only be detected by examining the raw bits of the database file.)

— Reply to this email directly or view it on GitHub https://github.com/onyxfish/agate-sql/pull/5#issuecomment-152362108.

Adrian Klaver adrian.klaver@aklaver.com

onyxfish commented 9 years ago

That does explain the tests passing. Locally I've been testing against Postgres as well, though I'm not certain I've tested any floats. (My focus until now has been on agate core.)

aklaver commented 9 years ago

On 10/29/2015 05:27 PM, Christopher Groskopf wrote:

That does explain the tests passing. Locally I've been testing against Postgres as well, though I'm not certain I've tested any floats. (My focus until now has been on agate core.)

The below looks interesting(asdecimal part) , just not sure yet where to do the conversion yet.

http://docs.sqlalchemy.org/en/rel_1_0/core/type_basics.html

class sqlalchemy.types.Float(precision=None, asdecimal=False, decimal_return_scale=None, **kwargs)

.....

— Reply to this email directly or view it on GitHub https://github.com/onyxfish/agate-sql/pull/5#issuecomment-152364256.

Adrian Klaver adrian.klaver@aklaver.com

onyxfish commented 9 years ago

Good find! If that works universally that would make things tremendously easier.

aklaver commented 9 years ago

Added code to deal with floats. So in Postgres:

test=> create table agate_test (id int, fl_fld float); CREATE TABLE

test=> insert into agate_test values(1, 2.35); INSERT 0 1 test=> insert into agate_test values(2, 1.333); INSERT 0 1 test=> insert into agate_test values(3, 5.698); INSERT 0 1

new_table = agate.Table.from_sql('postgresql://aklaver:@localhost:5432/test', 'agate_test')

In [15]: new_table.rows[0][1] Out[15]: Decimal('2.3500000000')

In [16]: new_table.rows[1][1] Out[16]: Decimal('1.3330000000')

In [17]: new_table.rows[2][1] Out[17]: Decimal('5.6980000000')

aklaver commented 9 years ago

Just saw test fail. Forgot about Python 3. How are you dealing with 2/3 conversions?

onyxfish commented 9 years ago

I'm not 100% sure I know what you are asking, but we do want 2/3 compatibility for this module. You're welcome to add a dependency on six (agate already depends on it). The string check should probably use six.string_types.

onyxfish commented 9 years ago

Oh and there is a tox config already in the repo for testing against the 6 versions of Python agate supports.

aklaver commented 9 years ago

On 10/30/2015 09:19 AM, Christopher Groskopf wrote:

I'm not 100% sure I know what you are asking, but we do want 2/3 compatibility for this module. You're welcome to add a dependency on six https://pythonhosted.org/six/ (agate already depends on it). The string check should probably use |six.string_types|.

Thanks, that is what I am looking for. I have just started rewriting my own code to be 2/3 compatible and the thing I have found is that there is a multitude of ways to do that. I did not want to introduce another method to this code.

— Reply to this email directly or view it on GitHub https://github.com/onyxfish/agate-sql/pull/5#issuecomment-152572788.

Adrian Klaver adrian.klaver@aklaver.com

onyxfish commented 9 years ago

Much appreciated! agate is my third or fourth library that's 2/3 compatible so lemme know if you run into any issues.

aklaver commented 9 years ago

Add code to deal with string differences between Python 2 and 3. If code is being used under Python 2 convert str and unicode types into basestring. This allows test against six.string_types to work in the if/elif/else type checking section. Tested locally against Python 2 and 3.

onyxfish commented 9 years ago

Actually, there is a cleaner way to handle this:

elif py_type is six.string_types:

Note is, not in.All string types are subclasses of six.string_types regardless of which version of Python your in. So you don't need the if six.PY2 block at all.

onyxfish commented 9 years ago

Adrian, I went ahead and did some very minor cleanup and merged this in. Thanks again for hacking on it! I've added you to the AUTHORS file. Feel free to pick up any other tickets that interest you!

(I was wrong in my previous comment, since its a type and not an instance you have to use issubclass, but you can still avoid the if block.)

aklaver commented 9 years ago

On 10/30/2015 05:36 PM, Christopher Groskopf wrote:

Adrian, I went ahead and did some very minor cleanup and merged this in. Thanks again for hacking on it! I've added you to the AUTHORS file. Feel free to pick up any other tickets that interest you!

Thanks. The issubclass test is a lot cleaner then my solution. I can take a swing at #2. I see the TimeDelta data type in agate and the commented out entry in SQL_TYPE_MAP. SQLAlchemy has the INTERVAL type that outputs datetime.timedelta. Is it just a matter of hooking the types together or is there something I am missing?

(I was wrong in my previous comment, since its a type and not an instance you have to use |issubclass|, but you can still avoid the |if| block.)

— Reply to this email directly or view it on GitHub https://github.com/onyxfish/agate-sql/pull/5#issuecomment-152680150.

Adrian Klaver adrian.klaver@aklaver.com

onyxfish commented 9 years ago

I think it's just a matter of wiring it together, but I'm honestly not sure. I didn't get as far as figuring out what databases actually generate that type.

aklaver commented 9 years ago

I will move the TimeDelta discussion over to #2.