paulrougieux / FAOSTATpackage

The full FAOSTAT package including build

Updated FAOSearch function only #6

KrishnaTO closed 4 years ago

KrishnaTO commented 4 years ago

Apologies for dropping the ball on the previous pull request. This one contains only the single change to the FAOsearch function.

It works by retrieving the latest dataset catalogue from FAOSTAT, including the FileLocation link used to download each zipped table.

Demo:

Get the full FAOSTAT catalogue, including topic attributes, full description and download link, sorted by the most recently updated dataset:

fao <- FAOsearch(full = TRUE, latest = TRUE)

Get datasets with keyword in name: FAOsearch(dataset = "Crop")

Get the download link for a dataset by its code:

QI <- FAOsearch(code = "QI", full = TRUE)
download.file(QI$FileLocation, "QI.zip")
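
A minimal sketch of the two lookups above, run against a toy catalogue rather than the live FAOSTAT one. DatasetCode and FileLocation are column names used in this thread; the DatasetName column, the dataset names and the example.org URLs are assumptions made purely for illustration, and this is not necessarily how FAOsearch is implemented.

```r
# Toy catalogue standing in for the result of FAOsearch(full = TRUE).
# DatasetName and every value below are placeholders (assumptions).
catalogue <- data.frame(
  DatasetCode  = c("QI", "OA", "QP"),
  DatasetName  = c("Indices example", "Population example", "Crops example"),
  FileLocation = c("https://example.org/QI.zip",
                   "https://example.org/OA.zip",
                   "https://example.org/QP.zip"),
  stringsAsFactors = FALSE
)

# Keyword lookup: keep rows whose name contains "crop", ignoring case
crop_rows <- catalogue[grepl("crop", catalogue$DatasetName, ignore.case = TRUE), ]

# Code lookup: exact match on the dataset code, returning its download link
qi_link <- catalogue$FileLocation[catalogue$DatasetCode == "QI"]
```

Matching on the code is exact, while the keyword match is a case-insensitive substring search, so "Crop" and "crop" select the same rows in this sketch.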

Can also automate to get the full table via convenience function:

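fao <- FAOsearch(full = TRUE)
# Retrieved from: https://github.com/muuankarski/faobulk/blob/master/R/get_data.R
get_data <- function(DatasetCode = "QP"){
  # Look up the download link for the requested dataset code in the catalogue
  datas <- fao
  urli <- datas[datas$DatasetCode == DatasetCode,]$FileLocation
  # Download the zipped table to a temporary file
  fly <- tempfile(fileext = ".zip")
  download.file(url = urli, destfile = fly)
  # readr decompresses .zip files automatically
  dat <- readr::read_csv(fly)
  # Lower-case the column names and replace every space with an underscore
  names(dat) <- tolower(gsub(" ", "_", names(dat)))
  return(dat)
}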

population <- get_data("OA")
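
A short sketch of the column-name clean-up step in get_data, using hypothetical column names (they are not taken from a real FAOSTAT table). It illustrates that sub() replaces only the first space in each name, whereas gsub() replaces all of them, which matters for names that contain more than one space.

```r
# Hypothetical column names, for illustration only
raw_names <- c("Area Code", "Item", "Area Code (M49)")

# sub() replaces only the first space in each name
first_only <- tolower(sub(" ", "_", raw_names))

# gsub() replaces every space in each name
all_spaces <- tolower(gsub(" ", "_", raw_names))

print(first_only)
print(all_spaces)
```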

paulrougieux commented 4 years ago

Added some documentation in ce65603. Please try to use the GitLab repository if you don't mind; otherwise GitHub is also fine.

paulrougieux commented 4 years ago

Thanks

Can also automate to get the full table via convenience function:

I only saw this now. I have implemented a very similar function:


#' @rdname download_faostat_bulk
#' @param code character dataset code
#' @param data_folder character path of the folder where the data is downloaded
#' @return data frame of FAOSTAT data
#' @export
get_faostat_bulk <- function(code, data_folder){
    # Load information about the given dataset code
    metadata <- FAOsearch(code = code)
    # Use the result of the search to download the data and assign it to a data frame
    download_faostat_bulk(url_bulk = metadata$filelocation, data_folder = data_folder)
    output <- read_faostat_bulk(file.path(data_folder, basename(metadata$filelocation)))
    return(output)
}

By the way, I prefer to convert all column names to lower case in the new functions in R/faostat_bulk_download.R, because lower case names are easier to remember and less messy. See, for example, the mix of full lower case, full upper case, and mixed lowercase/uppercase variables used in this old code from the vignette:

con.df = data.frame(STS_ID = c("arableLandPC", "arableLandShareOfTotal",
                               "totalPopulationGeoGR", "totalPopulationLsGR",
                               "totalPopulationInd", "totalPopulationCh"),
                    STS_ID_CONSTR1 = c(rep("arableLand", 2),
                                       rep("totalPopulation", 4)),
                    STS_ID_CONSTR2 = c("totalPopulation", NA, NA, NA, NA, NA),
                    STS_ID_WEIGHT = rep("totalPopulation", 6),
                    CONSTRUCTION_TYPE = c("share", "share", "growth", "growth",
                                          "index", "change"),
                    GROWTH_RATE_FREQ = c(NA, NA, 10, 10, NA, 1),
                    GROWTH_TYPE = c(NA, NA, "geo", "ls", NA, NA),
                    BASE_YEAR = c(NA, NA, NA, NA, 2000, NA),
                    AGGREGATION = rep("weighted.mean", 6),
                    THRESHOLD_PROP = rep(60, 6),
                    stringsAsFactors = FALSE)

All lower case names follow the style of the tidyverse packages. Some people might not like it, though, if they have existing code based on the FAOSTAT package that they would like to revive. I'm not certain whether I should impose this on the package or not.
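
To make the trade-off concrete, here is a minimal sketch of what the conversion does to a data frame with upper-case column names. The helper lower_names is hypothetical (it is not part of the FAOSTAT package), and con_df is a cut-down stand-in for the vignette's con.df.

```r
# Hypothetical helper (not part of the FAOSTAT package): lower-case all
# column names of a data frame, leaving the values untouched.
lower_names <- function(df) {
  names(df) <- tolower(names(df))
  df
}

# Cut-down stand-in for the vignette's con.df, with upper-case column names
con_df <- data.frame(STS_ID = c("arableLandPC", "totalPopulationInd"),
                     BASE_YEAR = c(NA, 2000),
                     stringsAsFactors = FALSE)

con_df_lower <- lower_names(con_df)
print(names(con_df_lower))
```

Only the column names change; values such as "arableLandPC" keep their mixed case. Existing code that refers to con_df$STS_ID would need updating to con_df_lower$sts_id, which is the compatibility concern raised above.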

KrishnaTO commented 4 years ago

Definitely believe lowercase should be encouraged.