snowflakedb / dplyr-snowflakedb

SnowflakeDB backend for dplyr
Apache License 2.0

Issue with src_snowflakedb(): 'src_sql' is not an exported object #3

Open paesibassi opened 7 years ago

paesibassi commented 7 years ago

I am able to install and load the libraries, then set the classpath pointing to the latest JDBC driver (snowflake-jdbc-3.0.9.jar).

# need to load RJDBC, or error 'could not find function ".jinit"' is thrown
library(RJDBC)
library(dplyr)
library(dplyr.snowflakedb)
options(dplyr.jdbc.classpath = "drivers/snowflake-jdbc-3.0.9.jar")

When I try to set up the connection object with src_snowflakedb(), I get the following error message (I replaced the account details with placeholders, but they are correct in the actual code):

> nike_db <- src_snowflakedb(user = "user",
                     password = "user",
                     account = "acme",
                     opts = list(warehouse = "my_wh",
                                 db = "my_db",
                                 schema = "my_schema"))
URL: jdbc:snowflake://acme.snowflakecomputing.com:443/?account=acme&warehouse=my_wh&db=my_db&schema=my_schema
Error: 'src_sql' is not an exported object from 'namespace:dplyr'

Indeed, the current version of dplyr neither exports nor includes a src_sql() function:

> dplyr:::src_sql
Error in get(name, envir = asNamespace(pkg), inherits = FALSE) : 
  object 'src_sql' not found

Is there any way to fix this?

Thank you!

paesibassi commented 7 years ago

I also posted this as a question on Stack Overflow, in case it's not an actual issue (apologies in that case) and somebody has a working solution. Thank you!

paesibassi commented 7 years ago

Hello, @gregrahn are you still supporting this package? If you give me a hint on what the problem could be, I can try to fix it myself and send you a pull request. Thanks!

gregrahn commented 7 years ago

@paesibassi - Sorry, but I am no longer supporting this and unfortunately have not been following dplyr development. From a quick look, the database code was factored out of dplyr and into dbplyr, so the method moved packages. The easiest option is probably to use dplyr 0.5.0. It is probably not terribly difficult to add dbplyr as a dependency to pull in its namespace and just rename the classes (assuming the APIs are still the same). This looks to impact dplyr::src_sql and dplyr::tbl_sql, which look to become dbplyr::src_sql and dbplyr::tbl_sql.

See https://github.com/snowflakedb/dplyr-snowflakedb/blob/master/R/src-snowflakedb.R#L198-L206

If lucky it's just that easy, but hopefully that points you in the right direction.
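
The lookup described above can be sketched as a small compatibility helper. This is only an illustration, not the package's actual code: `resolve_export` is a hypothetical name, and the sketch assumes that dbplyr, when installed, exports functions with the same names and signatures that older dplyr versions did.

```r
# Hypothetical helper (not part of dplyr.snowflakedb): return the first
# exported object called `name` found in the candidate packages, in order.
resolve_export <- function(name, pkgs = c("dbplyr", "dplyr")) {
  for (pkg in pkgs) {
    # Skip packages that are not installed instead of failing.
    if (requireNamespace(pkg, quietly = TRUE) &&
        name %in% getNamespaceExports(pkg)) {
      return(getExportedValue(pkg, name))
    }
  }
  stop("'", name, "' is not exported by any of: ",
       paste(pkgs, collapse = ", "), call. = FALSE)
}

# Intended use inside the backend (assumption: the signatures still match):
# src_sql <- resolve_export("src_sql")
# tbl_sql <- resolve_export("tbl_sql")
```

Trying dbplyr first and falling back to dplyr would let the same source work on both sides of the split, provided the two functions really are drop-in replacements.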

ckeune commented 7 years ago

You are better off just using the JDBC connection in R:

as.data.frame(dbGetQuery(jdbcConnection,'SELECT * FROM TABLE '))
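
For context, a minimal sketch of how such a `jdbcConnection` might be created with RJDBC. The helper names `snowflake_jdbc_url` and `connect_snowflake` are hypothetical, the URL layout is copied from the URL printed in the original report, and the driver class name is an assumption that may differ between JDBC driver releases.

```r
# Hypothetical helper: build a Snowflake JDBC URL in the same shape as the
# one printed by src_snowflakedb() in the report above.
# Assumption: all values are already URL-safe (no encoding is applied).
snowflake_jdbc_url <- function(account, opts = list()) {
  params <- c(list(account = account), opts)
  values <- vapply(params, as.character, character(1))
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0("jdbc:snowflake://", account, ".snowflakecomputing.com:443/?", query)
}

# Hypothetical helper: open the connection through RJDBC.
# Assumption: the driver class below matches the jar you downloaded.
connect_snowflake <- function(jar, url, user, password) {
  drv <- RJDBC::JDBC(
    driverClass = "net.snowflake.client.jdbc.SnowflakeDriver",
    classPath = jar
  )
  DBI::dbConnect(drv, url, user, password)
}

url <- snowflake_jdbc_url(
  "acme",
  list(warehouse = "my_wh", db = "my_db", schema = "my_schema")
)

# Not run here: requires Java, RJDBC, the driver jar and real credentials.
# jdbcConnection <- connect_snowflake("drivers/snowflake-jdbc-3.0.9.jar",
#                                     url, "user", "password")
# DBI::dbGetQuery(jdbcConnection, "SELECT 1")
```

Note that this route bypasses dplyr entirely, so queries are written as SQL strings rather than generated from dplyr verbs.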

sfc-gh-hkapre commented 7 years ago

@paesibassi The above response from @gregrahn is on the right track. There were breaking changes introduced in dplyr 0.5.0 and then the db code was factored out of dplyr and into dbplyr in 0.7.0. But 0.5.0 is not currently working. The currently working combination is dplyr 0.4.3 + dplyr.snowflakedb 0.1.1. The team here at Snowflake is currently looking into what is needed to support the latest versions.
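
A small guard can make the version requirement explicit before a script tries to connect. This is a sketch only: `is_tested_combo` is a hypothetical helper, and the version numbers are simply the pair reported as working in the comment above.

```r
# Hypothetical helper: TRUE only for the version pair reported as working
# in this thread (dplyr 0.4.3 with dplyr.snowflakedb 0.1.1).
is_tested_combo <- function(dplyr_version, backend_version) {
  package_version(as.character(dplyr_version)) == package_version("0.4.3") &&
    package_version(as.character(backend_version)) == package_version("0.1.1")
}

# Intended use (not run here: both packages must be installed):
# if (!is_tested_combo(utils::packageVersion("dplyr"),
#                      utils::packageVersion("dplyr.snowflakedb"))) {
#   warning("Untested dplyr / dplyr.snowflakedb combination")
# }
```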

cvbriggler commented 7 years ago

Can you provide instructions on how to install earlier versions of dplyr and dplyr.snowflakedb?

sfc-gh-hkapre commented 7 years ago

This is what is currently available in the official documentation: https://www.snowflake.net/integrating-the-snowflake-data-warehouse-with-r-via-dplyr/

We are currently working to get this updated with more detailed instructions based on additional testing.

Someone also put together a complete step-by-step guide based on their experience. This is not an official guide and has not been fully tested beyond what is indicated; I am including it in case it helps in the meantime:


  1. Install prerequisites: https://github.com/snowflakedb/dplyr-snowflakedb/wiki/Configuring-R-rJava-RJDBC-on-Mac-OS-X

     These versions worked in my case:

     OSX | 10.2.5
     JAVA6 | javaForOSX2015-001.dmg
     JAVA8 | 1.8.0_131, x86_6
     rJava | 0.9.8
     Snowflake JDBC | 3.0.9
     R SOURCE | R-3.3.3
     R Studio | 1.0.143

     NOTE: If errors are encountered while building rJava, check for the environment variable JAVA_HOME. This variable should not be needed on OSX and appears to cause errors. Try unsetting it and building rJava again.

  2. Launch RStudio.
  3. Install dplyr (NOTE: Must use version 0.4.3. Version 0.5 raises errors per support and testing). install_version() comes from devtools, so install devtools first:

     install.packages("devtools")
     devtools::install_version("dplyr", version = "0.4.3", repos = "http://cran.us.r-project.org")

  4. Install dplyr-snowflakedb (NOTE: Must install v0.1.1. Version 0.2.0 requires dplyr v0.5.0, which raises errors):

     devtools::install_github("snowflakedb/dplyr-snowflakedb@v0.1.1")

  5. I then followed the steps at these links for demonstrating SQL pushdown:

     https://rdrr.io/github/snowflakedb/dplyr-snowflakedb/man/src_snowflakedb.html
     https://github.com/snowflakedb/dplyr-snowflakedb/blob/master/man/src_snowflakedb.Rd

Note: It may appear to hang while connecting, probably waiting for DUO authentication.

Note: This pair of versions worked for most of the tests but failed for at least one that included a window function. It was not tested beyond the examples in the links above, so someone really exercising the package may encounter other errors.

ZacharyRSmith commented 5 years ago

for what it's worth, my fork is working with newer versions of dplyr, dbplyr, etc: https://github.com/ZacharyRSmith/RSnowflake