This repository contains the functionality to create and standardize the Global Register of Introduced and Invasive Species - Belgium to a Darwin Core checklist that can be harvested by GBIF.
This unified checklist is the result of two open and reproducible data pipelines developed for the TrIAS project (http://trias-project.be). In the data publication pipeline, we use the Checklist recipe to standardize and publish a selection of authoritative species checklists as Darwin Core Archives to GBIF. Predominantly, these checklists record the presence of alien species in Belgium for a specific taxon group or habitat and are maintained by their respective authors. In the data processing pipeline, we extract all Belgian non-native taxa from these checklists and unify their taxonomy using the GBIF Backbone Taxonomy. This automated process is implemented and documented at https://trias-project.github.io/unified-checklist/. The sources used for the unified checklist are:
See https://trias-project.github.io/unified-checklist/
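The unification step described above can be illustrated with a minimal Python sketch. It is not the project's implementation (that lives in the R Markdown scripts under src/). The lookup table, taxon names, keys, and checklist titles are hypothetical placeholders standing in for a match against the GBIF Backbone Taxonomy.

```python
from collections import defaultdict

# Hypothetical backbone lookup: verbatim name -> (accepted key, accepted name).
# Names and keys are placeholders, not real GBIF records.
BACKBONE_MATCH = {
    "Aus bus L.": (101, "Aus bus"),
    "Aus bus var. cus": (101, "Aus bus"),  # treated as a synonym of key 101
    "Dus eus Smith": (202, "Dus eus"),
}


def unify(checklists):
    """Group checklist names by the backbone taxon they match.

    `checklists` maps a checklist title to the list of names it contains.
    Returns (unified, unmatched): `unified` maps an accepted key to its
    accepted name and the sorted source checklists; `unmatched` lists the
    (checklist, name) pairs that have no entry in the lookup.
    """
    accepted_names = {}
    sources = defaultdict(set)
    unmatched = []
    for title, names in checklists.items():
        for name in names:
            match = BACKBONE_MATCH.get(name)
            if match is None:
                unmatched.append((title, name))
                continue
            key, accepted = match
            accepted_names[key] = accepted
            sources[key].add(title)
    unified = {
        key: {"accepted_name": accepted_names[key], "sources": sorted(sources[key])}
        for key in accepted_names
    }
    return unified, unmatched


# Two hypothetical source checklists that partly overlap.
checklists = {
    "Checklist A": ["Aus bus L.", "Dus eus Smith"],
    "Checklist B": ["Aus bus var. cus", "Fus gus"],
}
unified, unmatched = unify(checklists)
```

Here both spellings of the placeholder taxon collapse into a single unified entry that records both source checklists, while the name without a match is set aside for manual review rather than silently dropped.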
The repository structure is based on Cookiecutter Data Science and the Checklist recipe. Files and directories indicated with GENERATED should not be edited manually.
├── README.md               : Description of this repository
├── LICENSE                 : Repository license
├── unified-checklist.Rproj : RStudio project file
├── .gitignore              : Files and directories to be ignored by git
│
├── data
│   ├── raw                 : Source data as downloaded from GBIF GENERATED
│   ├── interim             : Unified data GENERATED
│   └── processed           : Darwin Core output of mapping script GENERATED
│
├── references
│   └── verification.tsv    : Verification file (for synonyms). Generated by
│                             3_verify_taxa.Rmd and then manually annotated
│
├── docs                    : Repository website GENERATED
│
├── index.Rmd               : Website homepage
├── _bookdown.yml           : Settings to build website in docs/
│
└── src
    ├── 1_get_taxa.Rmd          : Script to get taxa from checklists
    ├── 2_get_information.Rmd   : Script to get related information
    ├── 3_verify_taxa.Rmd       : Script to verify taxa
    ├── 4_unify_taxa.Rmd        : Script to unify taxa
    ├── 5_unify_information.Rmd : Script to unify related information
    ├── 6_dwc_mapping.Rmd       : Script to map to Darwin Core
    └── 7_griis_mapping.Rmd     : Script to create Excel file for GRIIS
Open the index.Rmd R Markdown file in RStudio, then click Build > Build Book to generate the processed data and build the website in docs/.
To publish an update of the dataset:
Source data: upload the newly generated data files from data/processed
Darwin Core mappings: does not require updates, unless terms were added/removed in the pipeline
Metadata: does not require updates, except for:
    Basic metadata: in description, check if number of taxa (2.500+) still applies
    Taxonomic coverage: in description, update numbers per kingdom based on new data
    Temporal coverage: update End date if need be
Publish: add a short description and publish

MIT License for the code and documentation in this repository. The included data is released under another license.