KentonWhite / ProjectTemplate

A template utility for R projects that provides a skeletal project.
http://projecttemplate.net
GNU General Public License v3.0
623 stars 159 forks source link

Enable dcf files to contain R code that can be evaluated upon read #169

Closed connectedblue closed 7 years ago

connectedblue commented 7 years ago

This PR proposes a change referenced in #30 and recently commented upon by @kezkankrayon.

A generic change is made to the translate.dcf function which is used in a number of places to read DCF files. In particular, this is used by the SQL reader where it was requested to support user names and passwords in environment variables.

The change is that any item in a DCF file contained in back ticks is evaluated as R code and the result used in place for that variable. This is a similar pattern to that used in R Markdown documents for evaluating code.

As an example, if a DCF file looks like this:

    user:  `Sys.getenv("user")`
   passwd:  `Sys.getenv("sql_password")`

then the user and passwd variables are set with the results of the environment variables.

A test case was also written to test the functionality, and the man page documentation is updated.

kezkankrayon commented 7 years ago

Thank you @connectedblue for publishing your solution and setting up this PR.

I'm including the code below, which is the solution I would have developed, in case it may be relevant in some way. It is an example of an alternative approach that doesn't require backticks.

str <- 'Sys.getenv("uid")'
str2 <- 'asd'

parse <- function (x) {
  out <- tryCatch(eval(parse(text=x)), error = function(e) x)
  return(out)
}

parse(str)
parse(str2)

As an aside, I have no idea how safe it is to run the eval method unchecked.

connectedblue commented 7 years ago

@kezkankrayon it was a nice, simple solution you proposed to hide the usernames and passwords in R code, so I think it's a great addition to ProjectTemplate.

I added some generic test cases to the regression tests, but not .sql scripts specifically. Would you mind installing the latest and reporting back if it solves your problem? (install_github('johnmyleswhite/ProjectTemplate')

The reason I chose the back ticks was to be consistent with what most R users would recognise from RMarkdown - that stuff inside gets evaluated. It's also somewhat of an advanced feature, so users who need it need to think carefully about it. I agree that running eval uncheckedcan be unsafe (but ProjectTemplate runs unchecked code all the time). The user needs to be responsible for using the feature.

Your code would work, but I'm not sure what the side effect would be if the config parameter also happened to be valid R - e.g. the string TRUE .

Look forward to hearing your report back on the sql use case.

kezkankrayon commented 7 years ago

@connectedblue, I installed the template via devtools, but encountered the following error when executing load.project().

Autoloading data
 Loading data set: xxxx
Error in RODBC::sqlQuery(connection, database.info[["query"]]) : 
  first argument is not an open RODBC channel
In addition: Warning messages:
1: In RODBC::odbcDriverConnect(connection.string) :
  [RODBC] ERROR: state 08001, code 0, message [unixODBC][FreeTDS][SQL Server]Unable to connect to data source
2: In RODBC::odbcDriverConnect(connection.string) :
  [RODBC] ERROR: state 01000, code 20002, message [unixODBC][FreeTDS][SQL Server]Adaptive Server connection failed
3: In RODBC::odbcDriverConnect(connection.string) : ODBC connection failed

I have not yet been able to determine if this error is a product of the update or if there is an issue else where.

So far I've done the following tests:

  1. use plain text strings
  2. uninstall, then reinstall CRAN version via install.packages 2.1. use plain text strings

However, the error remains the same. In short, I'm not sure what has changed and why plain text variables no longer work.

connectedblue commented 7 years ago

@kezkankrayon thanks for testing. It's always worth looking at specific use cases.

Can you paste the xxxx.sqlfile from the data directory that you used before and after (although mask your user name and password!). I'm not too familiar with the SQL reader but I'll have a look.

kezkankrayon commented 7 years ago

@connectedblue, please find the xxxx.sql below.

type: odbc
dsn: xxxx
user: `Sys.getenv("uid")`
password: `Sys.getenv("pwd")`
dbname: yyyy
query: SELECT * FROM [2014-s2].[collated]

This .sql file worked on the 26th when user and password were plain strings. But, hasn't worked since installing the template via devtools and reinstalling via install.packages. Do you know if anything else had changed since?

I can confirm the ODBC works and setup is correct with the following.

library(RODBC)

channel <- odbcConnect(dsn="xxxx", uid=Sys.getenv("uid"), pwd=Sys.getenv("pwd"))

odbcQuery(channel, "select name from master..sysdatabases")
odbcFetchRows(channel)

df <- sqlQuery(channel, "SELECT * FROM [2014-s2].[collated]")
connectedblue commented 7 years ago

Hmmm, there's nothing obvious that I can see, although I don't use odbc myself, so I could be missing something. I'm not sure what the difference is between dsn and dbname and why your manual step worked without specifying dbname.

There has been a recent change to this file, but that was in the SQLite section, not odbc. I wonder if @crubb has some expertise to offer to help diagnose the problem?

If it's happening on the CRAN version also, then it can't be due to this recent change, but I'd like to get to the bottom of it.

crubb commented 7 years ago

Hey guys,

the changes were only in the sqlite if block (just doublechecked), so I wouldn't expect the problem to be there, sorry ;)

Best regards, crubb

kezkankrayon commented 7 years ago

Hey, I was able to retry this on another computer today with the same xxxx.sql and it worked fine.

connectedblue commented 7 years ago

Thanks @kezkankrayon for testing this out - good to know it solves the problem of stored passwords in the odbc case.

I'm sure it will be helpful in other cases too.