pudo / dataset

Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.
https://dataset.readthedocs.org/
MIT License
4.78k stars 298 forks source link

Add support for BigInteger #296

Closed conorreid closed 5 years ago

conorreid commented 5 years ago

Currently when autocreating a table from sample data, the guess method assigns all integers as the Integer type, regardless of its size. This causes problems when autocreating a table with integers larger than 2147483647 or less than -2147483648, as these are beyond the signed range of the Integer type.

I've added a check to determine if the sample integer is beyond the signed range of the basic Integer type. If that is the case, instead of creating a column with the Integer type it will create a column with the BigInteger type. This will solve the bug where a table is autocreated and then the first insert immediately errors because the integer being inserted is out of range.

This is an error I've gotten before (see below). This PR will solve that error. sqlalchemy.exc.DataError: (psycopg2.errors.NumericValueOutOfRange) integer out of range

pudo commented 5 years ago

Thanks for submitting this. I'm spooked by the idea that the first (random) value of the column will lead to such a subtle change that may come back to haunt someone. Going a bit futher, I wonder: what is the downside of just making bigint the default? We're not angling for perfect efficiency, anyways.

conorreid commented 5 years ago

If efficiency isn't really a concern then I don't think there's any downsides to making bigint the default. I have a commit that I'll gladly push that does exactly that with all tests passing, so just let me know. Thanks for taking a look!

pudo commented 5 years ago

Please do submit that :) Thanks.

conorreid commented 5 years ago

Done!