zillow / ctds

Python DB-API 2.0 library for MS SQL Server
MIT License
83 stars 12 forks source link

Bulk inserting empty strings #35

Closed kadrach closed 5 years ago

kadrach commented 5 years ago

The callproc docs state that:

Currently FreeTDS does not support passing empty string parameters. Empty strings are converted to NULL values internally before being transmitted to the database.

It looks like this is happening with connection.bulk_insert as well, i.e. inserting empty strings leaves me with NULL in the database.

Just after some clarification whether this is expected behaviour, only applicable to callproc, or I'm doing something wrong.

joshuahlang commented 5 years ago

The empty strings don't work with callproc due to the design of the dblib API. I'll have to go back and look through the code to see if the limitation also exists in the bcp APIs or the ctds code is just assuming empty string support is always missing

kadrach commented 5 years ago
% cat foo.csv
0,郝         
1,""         
2,﷽          
3,           

First four records - ctds Second four records - freebcp

+------+-----------+
| id   | string |
|------+-----------|
| 0    | 郝        |
| 1    | ""      |
| 2    | ﷽         |
| 3    | NULL      |
| 0    | 郝        |
| 1    | ""        |
| 2    | ﷽         |
| 3    | NULL      |
+------+-----------+

Edit: Force QUOTE_NONE in Python

Doing a bit more research, this seems to be a "feature" of the BCP utility. There's a whole page on technet addressing this to some extent.

joshuahlang commented 5 years ago

I have a branch that implements this, but it appears to trigger some FreeTDS debug asserts. I'll try to dig in more when I have some time, but I'd guess this is going to be limited by FreeTDS much like the callproc behavior is.

r313pp commented 5 years ago

@joshuahlang starting from 0.95 assert(converted_data_size > 0); has been removed. So only 0.91.112 and 0.92.405 in CHECKED_FREETDS_VERSIONS have assert problem. I haven't dug dipper so I don't know that this is malplaced assert or real problem in those versions. But I removed those two freetds versions (also added 1.1) and merged bulk-empty-strings, datetime2-support and all tests passed (at least on Linux). You can see it hire https://travis-ci.com/r313pp/ctds/builds/107753091. UPDATE: valgrind has failed.

joshuahlang commented 5 years ago

Interesting. I am seeing a failure locally with FreeTDS 1.1 in TestCursorRowCount.test_rowcount. Strange the travis CI build didn't produce that.

r313pp commented 5 years ago

Yes, i fixed that test and forgot to mention it. I saw that you added 1.1 to check list.

I don't think that those problems would be fixed in previous freetds versions. So what do you think about the way forward?

joshuahlang commented 5 years ago

I didn't realize there was a 1.1.4 version until after I pushed the change. I'm currently working through some issues with the datetime2 changes in a new datetime2 branch. The older branch was failing on appveyor which uncovered some deficiencies in the unit tests and code.

joshuahlang commented 5 years ago

I merged a fix for this to master. It requires FreeTDS 0.95+, otherwise the previous behavior is used (and a warning is raised)

r313pp commented 5 years ago

Yes, I saw the pull request. My internal tests has passed. Thank you, Sir, for your work!

joshuahlang commented 5 years ago

Fix in 1.10.0.