gbif / doc-sensitive-species-best-practices

This document aims to describe current best practices for dealing with primary occurrence data for sensitive species and to provide guidance on how to make data as freely available as possible and as protected as necessary.
https://doi.org/10.15468/doc-5jp4-5g10

Random comments from FinBIF #17

Open esko-piirainen opened 3 years ago

esko-piirainen commented 3 years ago

Dear all contributors,

Thanks for a wonderfully written document! I'm providing a few insights from the point of view of FinBIF (http://species.fi) that you might find interesting.

We have a data warehouse that holds about 40M occurrences from ~20 IT systems and ~400 datasets. We have implemented a "securing system" for sensitive species that has some features that may be quite new compared to other IT systems that do data generalization. Unfortunately, we have only written about this in Finnish, and even that does not go into detail.

First things that are similar to concepts found in your document and many implementations:

Then some things that are not so common:

The source code can be found here: https://bitbucket.org/luomus/laji-etl/src/master/WEB-INF/src/main/fi/laji/datawarehouse/etl/models/Securer.java

There is at least one feature in your document that I'll add to our TODO list: we currently remove fields without leaving placeholders. As you note in your document, it would be better to leave placeholders (localization is an issue since we run our services in three languages, but occurrence data is often in only one language anyway). I'll read the document again with more thought.
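To illustrate the difference between removing fields and leaving placeholders, here is a minimal sketch in Python. It is not taken from Securer.java; the function names, the placeholder text, and the example record are all hypothetical, and the field names are borrowed from Darwin Core purely for illustration.

```python
from typing import Dict, Iterable

# Hypothetical placeholder text; a multilingual service would need to localize it.
PLACEHOLDER = "[withheld: sensitive species]"


def secure_by_removal(record: Dict[str, str],
                      sensitive_fields: Iterable[str]) -> Dict[str, str]:
    """Drop the sensitive fields entirely, leaving no trace that they existed."""
    hidden = set(sensitive_fields)
    return {key: value for key, value in record.items() if key not in hidden}


def secure_with_placeholders(record: Dict[str, str],
                             sensitive_fields: Iterable[str],
                             placeholder: str = PLACEHOLDER) -> Dict[str, str]:
    """Keep every field, but replace the values of sensitive fields with a placeholder."""
    hidden = set(sensitive_fields)
    return {key: (placeholder if key in hidden else value)
            for key, value in record.items()}


# Invented example record, for illustration only.
record = {
    "scientificName": "Example species",
    "locality": "Exact site description",
    "recordedBy": "Observer name",
}
SENSITIVE = ["locality", "recordedBy"]

removed = secure_by_removal(record, SENSITIVE)
masked = secure_with_placeholders(record, SENSITIVE)

print(removed)  # only the non-sensitive fields remain
print(masked)   # all fields remain; sensitive values are replaced
```

With placeholders, a data user can tell that a value exists but was withheld, which is not possible when the field is silently dropped.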

Cheers,
Esko Piirainen
Luomus / FinBIF