dfe-analytical-services / dfeR

Common R tasks in the Department for Education (DfE)
https://dfe-analytical-services.github.io/dfeR/
GNU General Public License v3.0
8 stars 2 forks source link

Add in read_sql() type function #55

Closed cjrace closed 3 months ago

cjrace commented 5 months ago

There were sql_conn_string() and read_sql_script() in the old package, my team has updated code for these that should be added, including read_sql(), where it automatically cleans a .sql script to read in via R, dodging common pitfalls.

Should also make the function add the two lines that we usually recommend adding to the top of SQL scripts that are read by R:

 SET ANSI_PADDING OFF
 SET NOCOUNT ON;
cjrace commented 5 months ago

Don't know if these are the latest versions but I found them in one of our repos

# Get SQL code and tidy it up so it can run-------------------
getSQL <- function(filepath) {
  con <- file(filepath, "r")
  sql.string <- ""

  while (TRUE) {
    line <- readLines(con, n = 1)

    if (length(line) == 0) {
      break
    }

    line <- gsub("\\t", " ", line)
    line <- gsub("\\n", " ", line)

    if (grepl("--", line) == TRUE) {
      line <- paste(sub("--", "/*", line), "*/")
    }

    sql.string <- paste(sql.string, line)
  }

  close(con)
  return(sql.string)
}

# Clean SQL code---------------------------------------------
cleanSQL <- function(code) {
  code <- gsub("", " ", code) # clean up SQL - remove weird sign that appears sometimes when reading data in
}
cjrace commented 5 months ago

Example of combining with other functions to run query:

# Step 2: Read in SQL files -----------------------------------------------
MI_TIDY_code <- getSQL(MI_TIDY)
WBP_RESIDENTIAL_TIDY_code <- getSQL(WBP_RESIDENTIAL_TIDY)

# Step 3: Run the SQL code ------------------------------------------------
dbGetQuery(con, statement = MI_TIDY_code)
dbGetQuery(con, statement = WBP_RESIDENTIAL_TIDY_code)
cjrace commented 5 months ago
fill_sql_template <- function(code,params){
  # This function takes the input parameters and updates them in a given SQL script.
  for(i in 1:nrow(params)){
    code <- gsub(paste0("<",params[i,]$search_string,">"),params[i,]$value,code)
  }
  return(code)
}