This README contains instructions on how to:
- add a new data source (from a database or from a file),
- modify an existing data source,
- test the data extraction locally,
- deploy the code, and
- use the API.
There are two ways to add a new data source, depending on where the data is retrieved from: `createNewDBSource()` (data from a database) and `createNewFileSource()` (data from a file). Executing one of these functions will automatically:
- create the file `R/02-<datasource>.R` that contains the function to extract the data: `extract.<datasource>()`,
- update `R/00-databases.R`,
- update the `.Renviron` file that contains the database credentials.

The files `R/02-<datasource>.R` for different data sources may contain individual and extensive data preparations that can be adjusted manually. For details, compare e.g. `R/02-LiVES.R` and read the section Modify An Existing Data Source.
For both ways to add data sources (from database or from file), four mandatory parameters must be specified:
- `dataSourceName`: (character) name of the new data source, e.g. "xyDBname", "14CSea", "CIMA", "IntChron", "LiVES". The name of the source must be contained exactly as a column name in the mapping file.
- `datingType`: (character) dating type for the database, e.g. "radiocarbon" or "expert".
- `coordType`: (character) coordinate type of the latitude and longitude columns, one of the following formats:
  - decimal degrees, e.g. `40.446` or `79.982`,
  - degrees and decimal minutes, e.g. `40° 26.767' N` or `79° 58.933' W`,
  - degrees, minutes and seconds, e.g. `40° 26' 46'' N` or `79° 58' 56'' W`.
- `mappingName`: (character) name of the mapping without file extension, e.g. "IsoMemo". The mapping (a .csv file) must be available under "inst/mapping/"; see the illustrative sketch below.
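For illustration only, a mapping file pairs each data source column (the column header must equal the data source name) with the harmonized target fields. The layout and all column names below are made up; compare the real files under `inst/mapping/` for the actual structure:

```
field,LiVES,14CSea
site,site_name,Site
latitude,lat,Latitude
longitude,long,Longitude
```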
Here, the database credentials `<dbName>`, `<dbUser>`, `<dbPassword>`, `<dbHost>`, `<dbPort>` and the `<tableName>` must be specified. The credentials must not be stored on Github; they will not be stored in any file that is uploaded to Github. They are only needed for local development and for testing the database connection.
```r
createNewDBSource(dataSourceName = <datasource>,
                  datingType = <datingType>,
                  coordType = <coordType>,
                  mappingName = <mappingName>,
                  dbName = <dbName>,
                  dbUser = <dbUser>,
                  dbPassword = <dbPassword>,
                  dbHost = <dbHost>,
                  dbPort = <dbPort>,
                  tableName = <tableName>)
```
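For illustration, a filled-in call could look like the following; all values, including the `coordType` string, are made-up placeholders:

```r
createNewDBSource(dataSourceName = "xyDBname",    # placeholder source name
                  datingType = "radiocarbon",
                  coordType = "decimal degrees",  # assumed format label, check the package docs
                  mappingName = "IsoMemo",
                  dbName = "isodata",             # local test credentials, never committed
                  dbUser = "dbuser",
                  dbPassword = "secret",
                  dbHost = "localhost",
                  dbPort = 3306,
                  tableName = "measurements")
```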
Data can be loaded either from the local `inst/extdata` folder, or from a remote path, in which case a `<remotePath>` must be given, e.g. "http://www.14sea.org/img/". Please set `<location> = "local"` in the first case and `<location> = "remote"` in the second case. Please provide the `<filename>` with its extension (only `*.csv` or `*.xlsx` files are supported), e.g. "data.csv" or "14SEA_Full_Dataset_2017-01-29.xlsx".
Optionally, the following can be specified:
- for `.xlsx` files, a `<sheetNumber>` as an integer value,
- for `.csv` files, `<sep>` for the field separator character and `<dec>` for the character used as the decimal point.

```r
createNewFileSource(dataSourceName = <datasource>,
                    datingType = <datingType>,
                    coordType = <coordType>,
                    mappingName = <mappingName>,
                    fileName = <filename>,
                    locationType = <location>,
                    remotePath = <remotePath>,
                    sheetNumber = 1,
                    sep = ";",
                    dec = ",")
```
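For illustration, a filled-in call for the remote file mentioned above could look like this (the `coordType` string and the sheet number are assumptions):

```r
createNewFileSource(dataSourceName = "14CSea",
                    datingType = "radiocarbon",
                    coordType = "decimal degrees",  # assumed format label
                    mappingName = "IsoMemo",
                    fileName = "14SEA_Full_Dataset_2017-01-29.xlsx",
                    locationType = "remote",
                    remotePath = "http://www.14sea.org/img/",
                    sheetNumber = 1)                # assumed sheet
```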
Data extraction for all data sources is defined in the files `R/02-<datasource>.R`. Within the function `extract.<datasource>()` you can retrieve data and modify values as you like. You only need to ensure these points:
- the function is named `extract.<datasource>`; `<datasource>` needs to match the entry name in the file `R/00-databases.R`,
- the function has a single argument `x`; `x` holds all configuration from the entry in `R/00-databases.R`,
- assign the extracted data to `x$dat` and return `x`,
- only the columns given in the mapping `<mappingId>`, which must be available under `inst/mapping/<mappingId>.csv` and is specified in `R/00-databases.R`, will be processed.

A minimal example of the extract function looks like this:
```r
extract.testdb <- function(x) {
  dat <- mtcars # dummy dataset
  x$dat <- dat  # assign data to the list element x$dat
  x             # return x
}
```
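A slightly larger, still hypothetical extract function might add the kind of manual data preparation mentioned above, e.g. renaming columns and dropping incomplete rows. The source name "exampledb" and all column names here are made up for illustration:

```r
extract.exampledb <- function(x) {
  dat <- data.frame(                    # stand-in for the raw source data
    site = c("A", "B", "C"),
    lat  = c(40.446, NA, 41.2),
    lon  = c(-79.982, -80.1, -79.5)
  )
  names(dat)[names(dat) == "lat"] <- "latitude"  # rename to match the mapping
  names(dat)[names(dat) == "lon"] <- "longitude"
  dat <- dat[!is.na(dat$latitude), ]             # drop rows without coordinates
  x$dat <- dat                                   # assign data to x$dat
  x                                              # return x
}
```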
Run the following commands in R to install the package locally and run the extract function.
```r
devtools::install()  # install package locally
devtools::load_all() # load all functions from package
res <- etlTest()
```
Inspect the results in `res`. Data from the n-th data source will be in the element `res[[n]]$dat`.

IMPORTANT: Only 5 rows will be processed during the test! If you want to process all data, specify `full = TRUE`:

```r
res <- etlTest(full = TRUE)
```

To test only the n-th data source, execute the function like this:

```r
res <- etlTest(databases()[n])
```

Results will then be in the object `res[[1]]$dat`.
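For example, to test just the first data source and inspect its output:

```r
res <- etlTest(databases()[1]) # run the test for the first data source only
head(res[[1]]$dat)             # first rows of the extracted data
str(res[[1]]$dat)              # column names and types
```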
Test your code by running

```r
devtools::check()
```

Code from the main branch will be automatically deployed to the production system on the MPI server (given a successful `devtools::check()`) and will be on the main version of the API and App. Correspondingly, code from the beta branch will be automatically deployed to the beta version of the API and App.
Data is returned in JSON format. You can use, among others, the following query parameters, all of which appear in the example call below: `dbsource`, `category`, and `field`.

Example call:

```
https://isomemodb.com/api/v1/iso-data?dbsource=LiVES&category=Location&field=site,longitude
```
Helper endpoints
For the production API, use `/api` instead of `/testapi`.
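As a minimal sketch of how the endpoint could be queried from R, assuming the `jsonlite` package (the structure of the response is not documented here):

```r
library(jsonlite)

# Example call from above: fetch the site and longitude fields of the
# LiVES source and parse the JSON response.
url <- "https://isomemodb.com/api/v1/iso-data?dbsource=LiVES&category=Location&field=site,longitude"
res <- fromJSON(url)

str(res) # inspect the parsed result
```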