ropensci / EDIutils

An API Client for the Environmental Data Initiative Repository
https://docs.ropensci.org/EDIutils/

wrapper function to download all tables in a package #14

Closed: scelmendorf closed this issue 3 years ago

scelmendorf commented 4 years ago

@clnsmth this might already exist, but I just didn't find it. I think the most common use case is something like this, where someone just wants to download ALL the tabular data in a package into a list (stacking them should possibly be an option) and have the tables come out with sensible names.

library(EDIutils)
library(magrittr)
library(data.table)

#assume you used the awesome workflow already to get the package you want
pkgid='knb-lter-nwt.210.1'

#get the ids
entities_id=EDIutils::api_list_data_entities(pkgid) 
#build the download url for each entity
entities_id$url=paste0('https://pasta.lternet.edu/package/data/eml/',
                       gsub('\\.', '/', pkgid), '/', entities_id$identifier)
#also want the names
entities_name=sapply(entities_id$identifier, function(x)
  api_read_data_entity_name(pkgid, x, 
                            environment = 'production'))%>%
  #spaces = hassle to work with, get rid of them
  gsub(' ', '_', .)

#download all
alldata=list()
#should add some EML scraping here to make it read_delim based on the delimiter
#headers TRUE/FALSE, etc
#loop over the entities; length(entities_id) would count the columns of the data frame
for (k in seq_along(entities_name)){
  alldata[[entities_name[k]]]=read.csv(entities_id$url[k])
}
#probably optional whether you want to bindrows them into a single df or not
#this is an example where it makes sense to but in other instances you might not
finally=alldata%>%rbindlist(., idcol=TRUE, use.names=TRUE)

Does this exist and I am just not finding it? Or would it be worth wrapping the above into a function in the package?
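
For discussion, here is a rough sketch of how the script above might be wrapped into a function. The names `read_all_tables`, `build_entity_urls`, and `clean_entity_names` are hypothetical and not part of EDIutils. It assumes, as in the script, that `api_list_data_entities()` returns a data frame with an `identifier` column and that `api_read_data_entity_name()` returns a single string per entity. The lookup functions and the reader are passed as arguments so they can be swapped out (for example with `read_delim` once the delimiter is taken from the EML).

```r
# Sketch only: these function names are hypothetical, not part of EDIutils.

# Build one PASTA data URL per entity identifier.
# The base URL is the production host used in the script above.
build_entity_urls <- function(pkgid, identifiers,
                              base = "https://pasta.lternet.edu/package/data/eml/") {
  # "knb-lter-nwt.210.1" -> "knb-lter-nwt/210/1"
  pkg_path <- gsub(".", "/", pkgid, fixed = TRUE)
  paste0(base, pkg_path, "/", identifiers)
}

# Replace runs of whitespace with "_" so the names are easy to work with.
clean_entity_names <- function(x) {
  gsub("\\s+", "_", trimws(x))
}

# Read every data entity of a package into a named list.
# Assumes list_entities(pkgid) returns a data frame with an `identifier`
# column and entity_name(...) returns one string, as in the script above.
read_all_tables <- function(pkgid,
                            environment = "production",
                            reader = utils::read.csv,
                            list_entities = EDIutils::api_list_data_entities,
                            entity_name = EDIutils::api_read_data_entity_name) {
  ids <- as.character(list_entities(pkgid)$identifier)
  urls <- build_entity_urls(pkgid, ids)
  nms <- vapply(
    ids,
    function(id) as.character(entity_name(pkgid, id, environment = environment)),
    character(1),
    USE.NAMES = FALSE
  )
  tables <- lapply(urls, reader)
  # make.unique() guards against two entities sharing a name
  names(tables) <- make.unique(clean_entity_names(nms))
  tables
}
```

Usage would then be `alldata <- read_all_tables('knb-lter-nwt.210.1')`, and stacking stays optional, e.g. `data.table::rbindlist(alldata, idcol = "table", fill = TRUE)` when the tables share columns.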

clnsmth commented 4 years ago

Thanks for these questions @scelmendorf!

Yes, this functionality exists and is provided by the metajam package (metajam::download_d1_data() and metajam::read_d1_files()). If I recall correctly, it is used frequently in NCEAS data synthesis workflows. Additionally, a group of us are using it to read data packages into a visualization app over at the datapie project.

clnsmth commented 4 years ago

However @scelmendorf, you're more than welcome to contribute this functionality to EDIutils if metajam doesn't meet your use case.

clnsmth commented 3 years ago

@scelmendorf this feature is now implemented in the read_tables() function (see version 1.3.0). Thanks for suggesting this enhancement.