mkleehammer / pyodbc

Python ODBC bridge
https://github.com/mkleehammer/pyodbc/wiki
MIT No Attribution

native INSERT BULK like .Net SqlBulkCopy #408

Closed gizmo93 closed 6 years ago

gizmo93 commented 6 years ago

Hello everyone,

Are there any plans (or would it at least be technically possible) to add real bulk inserts from in-memory data structures such as lists or tuples, like the ones SSIS or .NET (https://msdn.microsoft.com/en-us//library/system.data.sqlclient.sqlbulkcopy(v=vs.110).aspx) perform?

Importing flat files directly (e.g. with the OPENROWSET command) is only an option if they mostly contain numerical data. If they contain strings or other data types, the risk of losing data because of line breaks or bad escaping is just too high.

I think supporting this would greatly improve the experience of using MSSQL and Python together, as there is no other fast way to insert lots of data. (fast_executemany did a really good job of speeding up inserts, but it's still behind what SSIS or .NET bulk inserts can do.)
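
For reference, the fast_executemany path mentioned above can be sketched roughly as follows. This is a minimal illustration, not a benchmark: `insert_rows`, the batch size, and the table and column names in the usage note are hypothetical, and the function only assumes a pyodbc-style cursor with a `fast_executemany` attribute and an `executemany` method.

```python
def insert_rows(cursor, table, columns, rows, batch_size=10_000):
    """Insert rows in batches through a pyodbc-style cursor.

    `table` and `columns` are interpolated into the SQL text, so they must
    come from trusted code, never from user input. The row values themselves
    are passed as ODBC parameters.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"

    # Ask pyodbc to send each batch as a parameter array instead of
    # executing one statement per row.
    cursor.fast_executemany = True

    rows = list(rows)
    for start in range(0, len(rows), batch_size):
        cursor.executemany(sql, rows[start:start + batch_size])
    return len(rows)
```

With a real connection this would be called as something like `insert_rows(conn.cursor(), "dbo.my_table", ["id", "name"], data)` followed by `conn.commit()`, where `conn` comes from `pyodbc.connect(...)` and the table is a placeholder. Batching keeps the parameter buffers for a single call bounded; the right batch size depends on row width and available memory.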

v-chojas commented 6 years ago

INSERT BULK and the corresponding SqlClient feature are specific to SQL Server; while pyODBC does have some DB-specific code, that is all accomplished through what is otherwise the standard ODBC interface. I.e. pyODBC uses the driver manager to load an ODBC driver, and then talks to the driver through the DM. Using the BCP API (which is essentially what you're asking for?) is very different, in that it involves explicitly loading the msodbcsql ODBC driver itself and calling specific functions it exports. This does not seem to be something pyODBC can easily accommodate.
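
To make the difference concrete, here is a rough sketch of what "explicitly loading the driver" means, using ctypes only to check whether a given shared library exports some of the BCP entry points. `find_bcp_exports` is a hypothetical helper, and the library name or path you pass in is an assumption that depends on your platform and driver version. It does not perform a bulk copy; actually calling these functions also needs a connection handle with bulk copy enabled, which is what makes this awkward to fit into pyODBC.

```python
import ctypes

# A few of the BCP entry points documented for Microsoft's SQL Server
# ODBC driver. These are driver exports, not part of the ODBC standard.
BCP_FUNCTIONS = ("bcp_initW", "bcp_bind", "bcp_sendrow", "bcp_batch", "bcp_done")


def find_bcp_exports(driver_library):
    """Report which BCP functions a driver library exports.

    Returns None if the library cannot be loaded, otherwise a dict mapping
    each function name to True/False.
    """
    try:
        # Load the driver directly, bypassing the ODBC driver manager.
        lib = ctypes.CDLL(driver_library)
    except OSError:
        return None
    # ctypes raises AttributeError for a missing symbol, so hasattr works.
    return {name: hasattr(lib, name) for name in BCP_FUNCTIONS}
```

A generic ODBC application never does this: it only calls standard ODBC functions on the driver manager, which forwards them to whichever driver the connection string selects.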

gizmo93 commented 6 years ago

Hey @v-chojas

Thanks for the feedback, I did not know that it's such a specific feature of the MSSQL driver. But the hint regarding the BCP API led me to https://zillow.github.io/ctds/index.html, which seems to be new and implements the bulk copy feature. I need to play around with it a little bit.

gordthompson commented 6 years ago

(Duplicate of #350)

v-chojas commented 6 years ago

@gizmo93 yes, as you can see on the page you linked, it says "Supports Microsoft SQL Server 2008 and up.", so that is another SQL Server specific (non-ODBC) library. Unlike pyODBC, it is not a generic ODBC interface.