Open thmsklngr opened 4 years ago
Hi @mosesontheweb, the schema is inferred by default from a sample of the first 1000 rows of the table, see create_table for implementation. The schema inference is handled by SQLAlchemy. You may be able to resolve this either by increasing the number of rows sampled or by providing some metadata to help the schema inference. Apologies I don't have time to dig further at the moment.
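To illustrate why sampling misses the long values: if the schema is inferred only from the first 1000 rows, a longer string that first appears at row 5000 never influences the inferred column width. A pure-Python illustration (toy data; the 1000-row default is taken from the comment above):

```python
# Toy rows where a long value first appears well past the sample window
rows = [('x' * 10,)] * 4999 + [('x' * 500,)]

SAMPLE = 1000  # default sample size mentioned above

# Width inferred from the sample vs. the width actually needed
inferred_width = max(len(r[0]) for r in rows[:SAMPLE])
actual_width = max(len(r[0]) for r in rows)

print(inferred_width, actual_width)  # 10 500
```

A VARCHAR(10) column created from the sampled width then fails once the load reaches the 500-character value.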
I'll try increasing the sample size; maybe that will solve the issue. It's odd that the longer values only start somewhere past entry #5000 in my example ...
Regards, Thomas
Same issue here when using fromdb() -> todb(). I'm basically trying to copy one table to another SQL database, but the table load fails on certain varchar() columns: the VARCHAR length chosen at creation does not fit all rows. Using the sample=0 arg causes todb() to run extremely slowly; the current table is ~500k rows.
```python
engine1 = create_engine("mssql+pyodbc:///?odbc_connect=%s" % cloud_params)
engine2 = create_engine("mssql+pyodbc:///?odbc_connect=%s" % prem_params)
table = etl.fromdb(engine2, 'SELECT * from bigtable')
etl.todb(table, engine1, tablename='newbigtable', schema='stg', create=True,
         dialect='mssql', commit=True, sample=10000)
```
Minimal, reproducible code sample, a copy-pastable example if possible
Problem description
It seems that the table is not created with a width matching the longest value of 'Name'.
I was looking for a way to inspect the inferred types, or to set the VARCHAR length used when creating the table, but my search wasn't successful.
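One way to determine the VARCHAR length needed is a full pass over the source rows before creating the table. A stdlib-only sketch (the column name 'Name' comes from the report; the rows are toy data standing in for the result of fromdb()):

```python
# Header row followed by data rows, as petl tables are shaped
rows = [
    ('Id', 'Name'),
    (1, 'Alice'),
    (2, 'A considerably longer name that only appears after row 5000'),
]

header, data = rows[0], rows[1:]
name_idx = header.index('Name')

# The longest value determines the VARCHAR(n) to use in CREATE TABLE
max_len = max(len(str(r[name_idx])) for r in data)
print(max_len)
```

With the width known, the table can be created up front and loaded with create=False, sidestepping the inference step.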
Version and installation information
petl.__version__: 1.4.0