Open apelade opened 9 years ago
So far I have tried a combination of drivers:
mssql+adodbapi: did not try. It requires SQLAlchemy version 0.6, and the latest is 0.9.
mssql+pyodbc: requires Python 3.3. The test and the mssql random function do insert data, but the test cases error out: https://github.com/18F/rdbms-subsetter/pull/12
mssql+pymssql: trying this now. It is compatible with the latest versions of the dependencies. A warning says pymssql is treating Decimal datatypes as float.
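To illustrate why the Decimal-as-float behavior matters, here is a minimal sketch in plain Python. It does not touch pymssql or rdbms-subsetter; it only shows, under the assumption that a value passes through a float on its way to a Decimal, how the result can differ from an exact Decimal.

```python
from decimal import Decimal

# Exact arithmetic when the values stay Decimal end to end.
exact = Decimal("0.1") + Decimal("0.2")

# The same sum computed in binary floating point and then converted,
# which is roughly what happens when a driver hands back floats.
via_float = Decimal(0.1 + 0.2)

# True when the float round trip changed the value.
differs = exact != via_float
```

For a subset copied between two databases, this kind of drift could mean a copied numeric column no longer matches its source value exactly.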
Both pyodbc and pymssql hit a memory error when run on our real database, at least with some run parameters supplied.
Running with the mssql random function name set to "rand()" (https://github.com/apelade/rdbms-subsetter/commit/566d70b32ba2a471d99e229cef014635913f16c2) on our real 250 GB database, I get a memory error with both pymssql and pyodbc. Getting an estimated number of rows might help?
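As a rough idea of what an estimated row count could look like on SQL Server, the sketch below reads approximate counts from the sys.partitions catalog view instead of running COUNT(*) on every table. The helper name estimated_row_counts and the constant ESTIMATE_SQL are hypothetical and are not part of rdbms-subsetter. It assumes a DB-API cursor (for example from pymssql or pyodbc) that returns rows as plain tuples.

```python
# Approximate per-table row counts from SQL Server catalog views.
# sys.partitions.rows is an approximation, which is fine for sizing a subset.
ESTIMATE_SQL = """
SELECT s.name AS schema_name, t.name AS table_name, SUM(p.rows) AS approx_rows
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
JOIN sys.partitions AS p ON p.object_id = t.object_id
WHERE p.index_id IN (0, 1)  -- heap or clustered index only, to avoid double counting
GROUP BY s.name, t.name
"""


def estimated_row_counts(cursor):
    """Return {(schema, table): approx_rows} using a DB-API cursor.

    Hypothetical helper for illustration; not wired into rdbms-subsetter.
    """
    cursor.execute(ESTIMATE_SQL)
    return {
        (schema, table): int(rows)
        for schema, table, rows in cursor.fetchall()
    }
```

Whether this would actually avoid the memory error is untested; it only removes the need to scan each table to learn its size.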
Command:
rdbms-subsetter -l mssql+pymssql://sa:password1@localhost:1433/source_dev mssql+pymssql://sa:password1@localhost:1433/dest_dev_sm 0.02
Error:
...
Create 1 rows from 920634 in .be_dstrb_parm_data
Create 1 rows from 921077 in .be_web_notices
Create 1 rows from 923883 in .be_dstrb
Create 1 rows from 9743 in .be_parm_data_h
Create 1 rows from 994071 in .be_barn_estmt_h
Proceed? (Y/n) y
INFO:root:lowest completeness score (in tp_415_apye_over_limit_2012) at 0.000000
C:\Python34\lib\site-packages\sqlalchemy\dialects\mssql\pymssql.py:31: SAWarning: Dialect mssql+pymssql does not support Decimal objects natively, and SQLAlchemy must convert from floating point - rounding errors and other
issues may occur. Please consider storing Decimal numbers as strings or integers on this platform for lossless storage.
return sqltypes.Numeric.resultprocessor(self, dialect, type)
INFO:root:lowest completeness score (in be_stat_rsn_ref_h) at 0.000000
INFO:root:lowest completeness score (in tp_415_apye_over_limit_2013) at 0.000000
INFO:root:lowest completeness score (in be_prcs_h) at 0.000000
Traceback (most recent call last):
File "C:\Python34\Scripts\rdbms-subsetter-script.py", line 9, in <module>
_find_n_rows() doesn't try to handle mssql drivers, and later, in update_sequence(), there is a comment about only PostgreSQL being supported, but I'm not sure whether SQL Server would need that.
Try smaller databases
MS SQL Server has been fairly popular in government, as well as the small businesses around government. It would be nice to support that.