juldebar / R_Metadata

Set of examples to generate different types of metadata with R packages
19110 19115 19139 geonetwork geoserver interoperability iso metadata netcdf ogc opendap postgis postgresql thredds wfs wms

Main releases of the code are available on Zenodo with this DOI:

DOI

Implementation of FAIR data management plans with R programming language and usual data sources in scientific context

This repository provides 3 example workflows that use R scripts to generate metadata (compliant with OGC standards) from different data sources. Metadata can be pushed directly from R to a CSW server (e.g. GeoNetwork), and data managed in a Postgres database can also be published in GeoServer (WMS/WFS) from R.

Each sub-folder contains an example workflow dedicated to a specific kind of data source:

These R scripts can be executed online

All scripts can be executed online in the RStudio server provided by the D4Science infrastructure. If you want to try it, please ask for a login (and briefly explain why).

Pre-requisites

Make sure that the following prerequisites are met:

Installing the R packages on Linux may require the following underlying OS packages (tested on Debian / Ubuntu):

(sudo) apt-get install netcdf-bin libcurl4-openssl-dev libssl-dev r-cran-ncdf4 libxml2-dev libgdal-dev gdal-bin libgeos-dev udunits-bin libudunits2-dev

Step 1: Execute the default workflow (spreadsheet use case)

Once you have set up the execution environment (see the list of OS and R packages in the section above), it is recommended to start by executing the workflow that uses a Google spreadsheet as a (meta)data source, since it is the easiest one to begin with. This will help you understand how to work with the JSON configuration file as well as the logic shared by all workflows.

Just change a few lines in 2 files

Once done with pre-requisites (see previous section):

If it works properly, all datasets described in the spreadsheet (which contains Dublin Core metadata elements) should appear as metadata sheets published in the GeoNetwork / CSW server.
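Before publishing, it can help to check that every dataset row carries the Dublin Core elements you rely on. The snippet below is only a minimal sketch in base R: the column names, the example rows and the list of required elements are assumptions made for illustration, not the actual layout of the spreadsheet template.

```r
# Hypothetical spreadsheet content: one row per dataset, one column per
# Dublin Core element. These column names are assumptions, not the
# names used by the repository's template.
datasets <- data.frame(
  identifier = c("dataset_1", "dataset_2"),
  title = c("First example dataset", "Second example dataset"),
  creator = c("Jane Doe", NA),
  stringsAsFactors = FALSE
)

# Elements we choose to treat as mandatory for this illustration
required <- c("identifier", "title", "creator")

# Identifiers of the rows where at least one required element is missing
incomplete <- datasets$identifier[!complete.cases(datasets[, required])]
print(incomplete)
```

With the example rows above, only the second dataset is reported because its creator is missing.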

Usual Errors

Once there, you can start tuning the workflow to plug other data sources and using other contacts.

Step 2: Tune the workflow to fit your needs

Once you have been able to execute the workflow with the provided templates and your SDI, you can customize the workflow to fit your specific needs.

Whichever data source you plug in, the most important steps remain (see details in the previous section):

Plug your data sources (spreadsheets, Postgres database, Thredds server) and your applications

When it works, you can try to execute the same workflow with your own spreadsheets, and the other workflows with additional data sources (Postgres and Thredds / NetCDF files).

Postgres data source use case

In this case, the following is required:

(De)activation of the different steps

The different steps of the workflow can be (de)activated independently through the values listed under "actions" in the JSON configuration file:

  "actions": {
    "create_metadata_table": false,
    "create_sql_view_for_each_dataset": true,
    "data_wms_wfs": true,
    "data_csv": false,
    "metadata_iso_19115": false,
    "metadata_iso_19110": false,
    "write_metadata_EML": false,
    "main": "write_Dublin_Core_metadata"
  }
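As an illustration, the "actions" block can be inspected from R to see which steps are switched on. This is a minimal sketch that assumes the jsonlite package is installed; `enabled_actions` is a hypothetical helper (not a function from this repository), and the JSON is inlined here rather than read from the actual configuration file.

```r
library(jsonlite)

# Inline copy of the "actions" block shown above; in practice it would be
# read from the workflow's JSON configuration file.
config_json <- '{
  "actions": {
    "create_metadata_table": false,
    "create_sql_view_for_each_dataset": true,
    "data_wms_wfs": true,
    "data_csv": false,
    "metadata_iso_19115": false,
    "metadata_iso_19110": false,
    "write_metadata_EML": false,
    "main": "write_Dublin_Core_metadata"
  }
}'

config <- fromJSON(config_json)

# Hypothetical helper: names of the steps whose flag is TRUE.
# Non-logical entries such as "main" are ignored.
enabled_actions <- function(actions) {
  is_on <- vapply(actions, isTRUE, logical(1))
  names(actions)[is_on]
}

active <- enabled_actions(config$actions)
print(active)
```

With the values above, only `create_sql_view_for_each_dataset` and `data_wms_wfs` are reported as active; the `main` entry is a string rather than a flag, so the helper skips it.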

NetCDF / NCML (OPeNDAP / Thredds server) use case

Main scripts for metadata creation and publication

The most important scripts for metadata creation are the following
