r-dbi / DBI

A database interface (DBI) definition for communication between R and RDBMSs
https://dbi.r-dbi.org
GNU Lesser General Public License v2.1
297 stars 75 forks source link

Problems with dbFetch() dealing with MS SQL Server on cloud when type of last column is of type Bit #315

Closed zyxdef closed 4 years ago

zyxdef commented 4 years ago

Dear gentlemen,

To start off, I'm a huge fan of your work, which divides R history in bTV and aTV (before and after tidyverse)!

I hope I present enough information to make the error in this issue reproducible.

To put it plainly, I get the cryptic error message

Error in result_fetch(res@ptr, n) : 
  nanodbc/nanodbc.cpp:2966: 00000: [Microsoft][ODBC Driver 17 for SQL Server]Invalid Descriptor Index

every time I call DBI::dbFetch() on the result of a query on tables that end with column of type Bit (i.e., as.numeric(tail(DBI::dbColumn(result)$type,1))==-7) ) on a MS SQL Server on AWS (I don't how how far that is relevante, but there it is; also, I had no influence in the choice of SQL server flavor).

My concern is that, since it only occurred in the specific combination of factors I'm running under, it could be a loose end of some part of the code so hard to manifest that it have not occurred until now.

For that reason, I felt I should fill an issue here in GitHub.

After two days headbutting this problem and after a lot of combinatorially testing of some possible factors, I guess I found out that it boiled down to the last column being of type "Bit".

To be clear, dbFetch() works properly if I remove the last variable (as long the type of the last but one variable is not Bit), if there are variables of type bit in the middle but are not in the end, etc.

To my surprise, after several Googles, I found no connection between dbFetch() and last column of bit type issuing such an error [I even found connections between dbFetch() and "Invalid Descriptor Index", but not with Bit type nor with MS SQL Server nor cloud specifically].

Since the server is running on the AWS cloud environment, I will not be able to present a real minimum reproducible example, for I would have to present endpoint, uid and pwd, but my code goes like this

conn <-
  DBI::dbConnect(
    drv = odbc::odbc(),
    Driver = "ODBC Driver 17 for SQL Server",
    Server = "validEndpointURL",
    Database = "validDatabaseName",
    UID = "validUserName,
    PWD = "myPassword",
    Port =     1433
  ) # which connects fine
# validTableName1 is a table which does *not* end with a bit column
DBI::dbGetQuery(conn, "select * from validSchemaName.validTableName1") # works ok
# validTableName2 *ends* *with* a bit column
DBI::dbGetQuery(conn, "select * from validSchemaName.validTableName2") # shoots the error message
result <- DBI::dbSendQuery(conn, "select * from validSchemaName.validTableName2") # no messages
DBI::dbColumnInfo(result) # prints cols and types just fine, with "-7" in the last line of column `type`
DBI::dbFetch(result) # shoots the error message

I have asked for collaboration on an R user community in Telegram to see if other have the same issue with DBI in other environments (whose chance I assume must be low), but just to confirm. I guess soon I will have more to say about it.

A great many thanks!

Marcelo

krlmlr commented 4 years ago

Thanks. This looks like https://github.com/r-dbi/odbc/issues/309. The FreeTDS driver, and RStudio's Professional drivers, don't have this issue.

zyxdef commented 4 years ago

Many thanks for your response and many others more for the dedication in the development of all the packages!

Curiously, in r-dbi/odbc#309, the issue occurred when dealing with blob columns in the middle of the table, but, in the case I was dealing with, it happened with bit columns in the end of the table. Also, there were tables with bit columns in the middle, but it raised no such issue. In common, both problems occurred interfacing with MS SQL Server.

Just for sake of information for the next person who stumbles in the issue, the FreeTDS library that @krlmlr cited is the accessible here and can be installed via sudo apt install.

In the other hand, if you don't have RStudio's professional products, then you cannot access the professional drivers, correct? So it is end box in this branch of the fluxogram, right?

Also, these workarounds in r-dbi/odbc#309 worked for me.

nickv9 commented 4 years ago

Curiously, in r-dbi/odbc#309, the issue occurred when dealing with blob columns in the middle of the table, but, in the case I was dealing with, it happened with bit columns in the end of the table. Also, there were tables with bit columns in the middle, but it raised no such issue.

It is possible there are "long" columns in the middle of the table giving you problems. One way to check whether that is the case is to:

  1. Run a query, explicitly listing the columns in your select statement in the order you get the columns back today - verify you still see the issue

  2. Modify your query to move the last column (the bit type) in your select statement to the beginning of your select statement - note whether the error is returned when you run the modified query. That would indicate one of more of the columns now at the end of the statement are "long"

zyxdef commented 4 years ago

The suggestion I was given from the dba was to use CAST() function on the bit column, in order to convert it to CHAR(1), but it didn't work. I guess I'm gonna try your suggestion as soon as I document that dba suggestions haven't worked. Thanks, by the way.

zyxdef commented 4 years ago

Curiously, in r-dbi/odbc#309, the issue occurred when dealing with blob columns in the middle of the table, but, in the case I was dealing with, it happened with bit columns in the end of the table. Also, there were tables with bit columns in the middle, but it raised no such issue.

It is possible there are "long" columns in the middle of the table giving you problems. One way to check whether that is the case is to:

1. Run a query, explicitly listing the columns in your select statement in the order you get the columns back today - verify you still see the issue

2. Modify your query to move the last column (the bit type) in your select statement to the beginning of your select statement - note whether the error is returned when you run the modified query. That would indicate one of more of the columns now at the end of the statement are "long"

It worked like a charm.

So that the secret? Placing short columns in the beginning and large columns ate the end?

krlmlr commented 4 years ago

Install a better ODBC driver if you can. Otherwise, yes, long columns must come at the end. This is a big pain especially when using tools like dbplyr.

nickv9 commented 4 years ago

So that the secret? Placing short columns in the beginning and large columns ate the end?

Yes, that is the secret :). Glad it helped!

As @krlmlr mentioned, some ODBC drivers, like the commercial (paid) SQL Server driver from RStudio does not have the problem.

nickv9 commented 4 years ago

Install a better ODBC driver if you can.

@krlmlr - other than the paid RStudio driver, do you know of a SQL Server odbc driver that does NOT have this problem?

krlmlr commented 4 years ago

FreeTDS works for me.

r2evans commented 4 years ago

@krlmlr, you're not on windows, are you? I've looked and haven't found any ODBC driver for windows that was other than MS. (RStudio says that "ODBC database drivers are usually pre-installed with the Windows operating systems", and I can't find any windows ODBC drivers on FreeTDS's website. The best I found from them is their appveyor build history.) Have you found any other ODBC drivers for windows?

r2evans commented 4 years ago

(I'll reply to myself :-)

Not so obviously documented is that one can go into the ./bin/ directory of one of the builds and regsvr32 tdsodbc.dll to register it as an ODBC driver. I was only able to install from the vs2017_64-Branch-1_2_withoutssl.zip file; not sure if I'm missing something, but I am still able to specify Encrypt=yes on the ODBC connection, suggesting either (a) the encrypting is not reliant on ssl, or (b) it is not telling me that the option is being ignored.

I find the hint here: https://lists.ibiblio.org/pipermail/freetds/2016q3/029980.html

krlmlr commented 4 years ago

Eventually there will be MSI installers from FreeTDS AppVeyor builds, I hope: https://github.com/FreeTDS/freetds/issues/355. Other than that there seems to be a thriving ecosystem of various commercial ODBC drivers for SQL Server on Windows.

r2evans commented 4 years ago

Thanks, @krlmlr. You mentioned (in freetds#355) that you're using the .zip file on windows. I just ran into significant problems with a memory leak (from a build in May/June) that tool down R and emacs rather hard, every time I used the freetds odbc driver. Since I use the mssqlodbc driver all day every day and have no problems the only different element was freetds, so I'm now significantly averse to using them until I can resolve that problem.

Which build from appveyor are you using? Have you been using it enough to feel comfortable about leaks and such?

krlmlr commented 4 years ago

I'm not using FreeTDS on Windows actively enough to be able to comment on that.

The builtin MSSQL driver is dead to me because it chokes if BLOB columns are not at the end of the column list. That's not a problem if you don't have blobs of course.

r2evans commented 4 years ago

Many of the tables I have to work with have varchar(max) (or some nvarchar(.) above 255) at the end, so yes ... I frequently-enough forget and have to re-dbConnect. Thanks, I was hoping you have a stable version you preferred.

And ... count yourself lucky :-)

github-actions[bot] commented 3 years ago

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.