Add a function for retrieving data by resource id

Moohan commented 3 years ago

Also associated helper functions

csillasch commented 3 years ago

Sorry for the wait on this again. I had a go at combining this with the data dump, which is used if rows is > 99999 or not set. I'm putting it here in case you want to make changes, but I can open a separate branch for this if that's easier. The code would look something like this:

opendata_get_full_resource <- function(res_id, rows = NULL) {
  if (!opendata_check_res_id(res_id)) {
    stop(glue::glue("The resource ID supplied ('{res_id}') is invalid"))
  }

  # Base URL of the CKAN instance
  ckan_url <- "https://www.opendata.nhs.scot"

  if (isTRUE(is.null(rows) || rows > 99999)) {
    # No row limit given, or more than 99999 rows requested:
    # read the whole resource from the datastore dump as a CSV
    data <- readr::read_csv(
      glue::glue("{ckan_url}/datastore/dump/{res_id}?bom=true")
    ) %>%
      dplyr::select(-"_id")

    return(data)
  } else {
    # Otherwise ask the datastore search endpoint for the first `rows` rows
    query <- list(
      id = res_id,
      limit = rows
    )

    url <- httr::modify_url(opendata_ds_search_url(), query = query)

    ua <- opendata_ua()

    # Pass the user agent unnamed so httr applies it as request config
    # (assumes opendata_ua() returns an httr::user_agent() object)
    response <- httr::GET(url = url, ua)

    httr::stop_for_status(response)

    stopifnot(httr::http_type(response) == "application/json")

    parsed <- httr::content(response, "text") %>%
      jsonlite::fromJSON()

    data <- parsed$result$records %>%
      tibble::as_tibble()

    return(data)
  }
}

Moohan commented 3 years ago

I think this is done now? It passes all checks and tests, unless we want to add more, e.g. some more tests around large resources and different options for row_num?
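As a rough idea of what extra tests around the row options could look like, here is a minimal sketch that needs no network access. It assumes the dump-versus-query decision is pulled out into a small helper; `use_full_dump()` is a hypothetical name and is not part of the package. It uses the `rows` argument name and the 99999 threshold from the code posted above, and testthat as the test framework.

```r
library(testthat)

# Hypothetical helper (not in the package): mirrors the condition from the
# code above that chooses between the full data dump and a limited query
use_full_dump <- function(rows = NULL) {
  isTRUE(is.null(rows) || rows > 99999)
}

test_that("the full dump is used when rows is unset or above the threshold", {
  expect_true(use_full_dump())
  expect_true(use_full_dump(NULL))
  expect_true(use_full_dump(100000))
})

test_that("a limited query is used for row counts up to the threshold", {
  expect_false(use_full_dump(1))
  expect_false(use_full_dump(99999))
})
```

Tests that actually download a large resource would still be needed to cover the dump branch end to end; this only checks the boundary logic.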

Public-Health-Scotland / phsmethods

Add a function for retrieving data by resource id #58