Closed MattBlissett closed 6 years ago
First draft ready for comments: https://www.gbif.org/faq?question=what-is-an-orphan-dataset
I can add a short text under the registration section with a link to an FAQ item describing it in detail.
Something along these lines: This dataset has been adopted by GBIF. [What is an adopted dataset?]
What is the field I should look for, @MattBlissett? machineTags with namespace=orphans.gbif.org AND name=orphanStatus AND value=AWAITING_ADOPTION? Or how do I recognise datasets that should have this description?
Just a note on terminology: let's agree on when to use the terms orphan(ed), rescue(d), adopt(ion). I thought adoption was only when a publisher agrees to take over the dataset, but looking at the GitHub wiki, it seems a bit fuzzy. My suggestion would be:
@ahahn-gbif, @kcopas - comments?
+1 @dnoesgaard
Going too far with the analogy, the datasets we export but host ourselves could be considered fostered.
I agree that adoption is only once a publisher takes over. Morten, you can recognize the RESCUED value. We may later use extra values like ORPHANED and ADOPTED, but only RESCUED needs anything for the moment.
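For the website side, the check could look roughly like the sketch below. It assumes the dataset JSON carries a `machineTags` list whose entries have `namespace`, `name` and `value` fields, and that the tag uses the namespace `orphans.gbif.org` and the name `orphanStatus` mentioned earlier in this thread. The function names and the sample dictionary are made up for illustration.

```python
from typing import Optional

# Namespace and tag name as discussed in this thread (treated as assumptions here).
ORPHAN_NAMESPACE = "orphans.gbif.org"
ORPHAN_STATUS_NAME = "orphanStatus"


def orphan_status(dataset: dict) -> Optional[str]:
    """Return the value of the orphan status machine tag, or None if absent."""
    for tag in dataset.get("machineTags", []):
        if (tag.get("namespace") == ORPHAN_NAMESPACE
                and tag.get("name") == ORPHAN_STATUS_NAME):
            return tag.get("value")
    return None


def is_rescued(dataset: dict) -> bool:
    """True only for datasets tagged RESCUED, the one value that needs handling for now."""
    return orphan_status(dataset) == "RESCUED"


# Hypothetical dataset fragment, shaped like the assumed API response.
sample_dataset = {
    "key": "00000000-0000-0000-0000-000000000000",
    "machineTags": [
        {"namespace": "other.example.org", "name": "something", "value": "x"},
        {"namespace": ORPHAN_NAMESPACE, "name": ORPHAN_STATUS_NAME, "value": "RESCUED"},
    ],
}

show_orphan_notice = is_rescued(sample_dataset)
```

If ORPHANED or ADOPTED are added later, `orphan_status` already returns them, so only the caller would need to change.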
I can change the hostname of https://orphans.gbif.org/ if that doesn't fit with the comms around this.
I'm starting the export of orphan datasets, beginning with those that have never been crawled since the ingestion process was rewritten before 6 November 2013.
Verbatim data will be exported as a GBIF download (kept forever), then adjusted to be a Darwin Core Archive suitable for import. These archives will be kept on https://orphans.gbif.org/ until they are "adopted" by a node.
The existing Endpoint will be removed, and a record kept in a machine tag in case the publisher enquires in the future. A new HTTP endpoint will be added. A record of the GBIF download used is in another machine tag.
If a node wants to adopt a dataset, the export script can be rerun (it will use the original GBIF download rather than the adjusted archive for re-import) and a suitable structure for import into an IPT can be produced.
This needs to be explained, so I think we need an FAQ entry which can be linked from the dataset page.
We should make minimal changes to the dataset page; just an additional entry under "Endpoints" explaining that the dataset was orphaned and is now hosted by GBIF.
The first dataset is this one; Morten can see the machine tags in the API call: https://www.gbif.org/dataset/857bce66-f762-11e1-a439-00145eb45e9a — https://api.gbif.org/v1/dataset/857bce66-f762-11e1-a439-00145eb45e9a
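To look at those machine tags from a script rather than the browser, something like the following should work. It is only a sketch: it assumes the response has a `machineTags` list with `namespace`, `name` and `value` fields and that the relevant tags sit under the `orphans.gbif.org` namespace. The helper names are hypothetical.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.gbif.org/v1/dataset/"


def dataset_url(key: str) -> str:
    """Build the API URL for a dataset key."""
    return API_BASE + key


def fetch_dataset(key: str, timeout: float = 30.0) -> dict:
    """Fetch the dataset record as a dictionary (performs a network request)."""
    with urlopen(dataset_url(key), timeout=timeout) as response:
        return json.load(response)


def tags_in_namespace(dataset: dict, namespace: str = "orphans.gbif.org") -> list:
    """Return (name, value) pairs for machine tags in the given namespace."""
    return [
        (tag.get("name"), tag.get("value"))
        for tag in dataset.get("machineTags", [])
        if tag.get("namespace") == namespace
    ]


if __name__ == "__main__":
    # Key of the first exported dataset, taken from the link above.
    record = fetch_dataset("857bce66-f762-11e1-a439-00145eb45e9a")
    for name, value in tags_in_namespace(record):
        print(name, value)
```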
Assigning Andrea, Morten and Daniel to work out what comms and website edits need to be done.