r-dbi / odbc

Connect to ODBC databases (using the DBI interface)
https://odbc.r-dbi.org/

Unable to list columns in Hive #270

Open nwstephens opened 5 years ago

nwstephens commented 5 years ago

We’re having a hard time reading the catalog for our Hive instance. It’s possible that the Hive metadata are messed up, but the fact that I can query data suggests otherwise. The problem is that odbcListColumns does not pick up the table info (and the connections pane also shows no table details). This is easy to reproduce on desktop pro. Would you mind taking a look?

con <- DBI::dbConnect(odbc::odbc(),
                      Driver = "hive",
                      Host = "***",
                      Port = 32796,
                      UID = "***",
                      PWD = "***"
)
odbc::odbcListColumns(con, table = "test_table", schema = "default") # empty
DBI::dbReadTable(con, "test_table") # succeeds

odbcListObjects returns the correct schema, so that function is working, but odbcListColumns returns nothing. Again, it is possible that the Hive metadata are messed up (wouldn’t be the first time).
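
Since ordinary queries succeed, one way to narrow this down is to ask Hive for the column list through the normal query path instead of the ODBC catalog call. The following is only a sketch under assumptions: it reuses the `con` object from the snippet above, `hive_describe_sql` and `list_columns_via_sql` are hypothetical helper names (not part of odbc or DBI), and it assumes the Hive version in use accepts `DESCRIBE schema.table`.

```r
# Hypothetical helper: build a HiveQL DESCRIBE statement for schema.table.
# No quoting or escaping is done; this is for trusted, simple identifiers only.
hive_describe_sql <- function(table, schema = "default") {
  sprintf("DESCRIBE %s.%s", schema, table)
}

# Hypothetical helper: run the statement over an existing DBI connection.
# This goes through the regular query path, not the catalog lookup that
# odbcListColumns() performs, so it may succeed where the latter is empty.
list_columns_via_sql <- function(con, table, schema = "default") {
  DBI::dbGetQuery(con, hive_describe_sql(table, schema))
}

# Usage with a live connection (not run here):
# list_columns_via_sql(con, "test_table", schema = "default")
```

If this returns the columns while odbcListColumns stays empty, the metadata themselves are probably fine and the problem is more likely in how the catalog request is answered.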

rnorberg commented 3 years ago

I think this is a driver issue. odbc::odbcListColumns() works for me when using the Simba Hive ODBC driver version 2.1.14.1020 (repackaged and distributed as part of the RStudio Pro ODBC Drivers bundle), but it doesn't work for me when using version 2.6.5.1005 (repackaged and distributed as Hortonworks Hive ODBC Driver). I wonder if it has something to do with Hive's convention of referring to "schemas" (in normal RDBMS parlance) as "databases".
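
To compare the two setups, it can help to record which driver a given connection is actually using. A minimal sketch, assuming `DBI::dbGetInfo()` on an odbc connection returns a named list that includes driver-related fields; the field names below are an assumption and may differ between odbc versions, and `driver_summary` is a hypothetical helper name.

```r
# Hypothetical helper: keep only the driver-related entries from a
# dbGetInfo()-style named list. The field names are assumed, so any that
# are missing are silently skipped rather than raising an error.
driver_summary <- function(info) {
  fields <- c("dbms.name", "db.version", "drivername", "driver.version")
  info[intersect(fields, names(info))]
}

# Usage with a live connection (not run here):
# driver_summary(DBI::dbGetInfo(con))
```

Running this on both the working and the failing connection would show whether the driver version is the only difference between them.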