denisenkom / pytds

Python DBAPI driver for MSSQL using pure Python TDS (Tabular Data Stream) protocol implementation
MIT License
192 stars 53 forks source link

Let user have better control over treatment of NULL values with bulk/… #71

Closed sylvainr closed 7 years ago

sylvainr commented 7 years ago

Problem

A blocking issue right now with the copy_to method which performs BULK INSERT operations, is how NULL values are treated.

As mentioned in issue #61, the treatment of empty string, which is the only way to provide a missing entry using a CSV file will be interpreted as an empty string, when the field is of NVARCHAR sort, but will be converted to default values for other fields such as 0.0 for FLOAT columns.

This is a deal breaker for a lot of use cases, including mine, where I have to insert float values that can potentially be NULL.

Solution

The solution is the addition of the optional null_string parameters that, when specified, will ensure any matching string in the input CSV file will be converted to a NULL value.

From an API standpoint, it is the same way psycopg does it with the null field which is actually active by default (cf http://initd.org/psycopg/docs/cursor.html#cursor.copy_to)

Please note that this solution will not affect in anyway the usage without this field from both a functional and a performance stand point.

@denisenkom and @cristianocoelho what do you think?

I am super excited, this is a my first pull request!

denisenkom commented 7 years ago

Looks good to me