inbo / bird-tracking

🛰🐦 Bird tracking - GPS tracking network for large birds

Register datasets with OBIS (in addition to GBIF)? #187

Closed peterdesmet closed 2 years ago

peterdesmet commented 2 years ago

Here's an overview of datasets that might be applicable for registration with OBIS (in addition to GBIF):

peterdesmet commented 2 years ago

@pieterprovoost how do you want me to proceed with these?

Note, however, that the datasets do not have an Aphia ID as scientificNameID: they are all transformed with an R function that does not map scientific names to Aphia IDs.
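
For illustration, a minimal sketch of what such a mapping could look like, written in Python rather than in the R function mentioned above. The `APHIA_IDS` lookup, its species name, and its ID are placeholders (not real WoRMS records), and `scientific_name_id` is a hypothetical helper. Only the LSID pattern `urn:lsid:marinespecies.org:taxname:<AphiaID>` is taken from WoRMS practice.

```python
from typing import Optional

# Hypothetical lookup: the name and ID below are placeholders, not real WoRMS records.
APHIA_IDS = {"Example species": 123456}


def scientific_name_id(scientific_name: str, lookup: dict = APHIA_IDS) -> Optional[str]:
    """Return a WoRMS-style LSID for the name, or None if the name is not in the lookup."""
    aphia_id = lookup.get(scientific_name.strip())
    if aphia_id is None:
        # Leave scientificNameID empty rather than guess an identifier.
        return None
    return f"urn:lsid:marinespecies.org:taxname:{aphia_id}"
```

Names missing from the lookup simply yield no identifier, which leaves room for automated taxon matching downstream.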

pieterprovoost commented 2 years ago

@peterdesmet I need to get endorsement from one of our nodes, once that is done I can proceed to harvesting. I agree with the selection above, and I don't expect any issues with taxon matching given the limited taxonomic scope.

peterdesmet commented 2 years ago

  1. Ok, can you keep me posted on the endorsement?
  2. I guess I'll then have to associate the relevant datasets to the OBIS network in the IPT? Is that something I can already do now?
  3. So no need for me to add scientificNameID, it is something you can do on your end (and yes, the scope is limited to about 10 species)

pieterprovoost commented 2 years ago

I received confirmation that these can be published under EurOBIS. It would be good if you could link them to the OBIS network in IPT already, that will ensure that the statistics on our network page are more or less accurate.

No need to add scientificNameID, our automated taxon matching should take care of that. In case of any issues in the future, would you be willing to accept a PR for (optionally) adding WoRMS (or other) identifiers in movepub?

One issue we identified is that we currently require every parentEventID to have a matching eventID within the dataset. For your datasets that would mean that a warning is displayed on the dataset pages in OBIS. However, I don't think this requirement aligns very well with the current definition of parentEventID, so I'll try to implement a fix before I start harvesting.
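
The check described above can be sketched roughly as follows. This is not the OBIS implementation, only an illustration of the rule "every parentEventID must match an eventID in the same dataset". The function name and the example rows are hypothetical.

```python
def unmatched_parent_event_ids(records):
    """Return parentEventID values that have no matching eventID in the same dataset."""
    event_ids = {r["eventID"] for r in records if r.get("eventID")}
    parent_ids = {r["parentEventID"] for r in records if r.get("parentEventID")}
    return sorted(parent_ids - event_ids)


# Hypothetical rows: "tag-1" is used as a parent but never appears as an eventID.
records = [
    {"eventID": "tag-1-deployment", "parentEventID": "tag-1"},
    {"eventID": "tag-1-track", "parentEventID": "tag-1-deployment"},
]
print(unmatched_parent_event_ids(records))  # ['tag-1']
```

A non-empty result is the situation that would trigger the warning on the dataset page.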

Thanks again!

peterdesmet commented 2 years ago

Ok, I have added the OBIS network to 7 datasets in the IPT now (4 additional ones are not yet published). It is unclear to me if that information is available without republishing? See e.g. https://ipt.inbo.be/resource?r=lbbg_juvenile

> In case of any issues in the future, would you be willing to accept a PR for (optionally) adding WoRMS (or other) identifiers in movepub?

Sure!

> every parentEventID to have a matching eventID

Indeed, that won't be the case for these datasets, after careful consideration 😄 (see https://github.com/inbo/movepub/issues/10)

pieterprovoost commented 2 years ago

CURLEW_VLAANDEREN was added successfully, if you agree I'll add the others as well. See https://obis.org/dataset/7ee5747e-f7c5-44ad-9012-925dd60967aa

The network link does not require republishing; the change is made instantly through the GBIF registry API.

peterdesmet commented 2 years ago

Cool! Looks good, you can add the others.

I assume these datasets won't be pushed (and duplicated) to GBIF?

pieterprovoost commented 2 years ago

Correct, they are not pushed to GBIF from our side.

peterdesmet commented 2 years ago

Can you drop the URLs of the datasets on OBIS here once finished? I'm adding them as related identifiers to the source dataset in Zenodo, cf. https://doi.org/10.5281/zenodo.6580939 (sidebar on right)
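
As a rough sketch of what those related identifiers could look like as data: the snippet below builds one entry per OBIS dataset URL. The `identifier` and `relation` field names follow Zenodo's deposit metadata as I understand it. The choice of `isSourceOf` as the relation is an assumption, and `related_identifiers` is a hypothetical helper, not part of any library.

```python
def related_identifiers(obis_urls, relation="isSourceOf"):
    """Build one related-identifier entry per OBIS dataset URL."""
    # The default relation is an assumption; pick whichever relation fits the record.
    return [{"identifier": url, "relation": relation} for url in obis_urls]


entries = related_identifiers(
    ["https://obis.org/dataset/7ee5747e-f7c5-44ad-9012-925dd60967aa"]
)
print(entries)
```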

pieterprovoost commented 2 years ago

It seems I don't have access to:

I can preemptively add them to the system but I would need to be sure that those will be the final shortnames.

peterdesmet commented 2 years ago

pieterprovoost commented 2 years ago

All harvested but I'll need to sort out the issue of Ichthyaetus melanocephalus not being known to WoRMS.

| OBIS dataset | IPT URL |
| --- | --- |
| https://obis.org/dataset/7ee5747e-f7c5-44ad-9012-925dd60967aa | https://ipt.inbo.be/resource?r=curlew_vlaanderen |
| https://obis.org/dataset/a8c7c2d3-533a-4b8f-aff8-a43b8f280a7b | https://ipt.inbo.be/resource?r=lbbg_juvenile |
| https://obis.org/dataset/aac5ca81-638a-4335-9aa7-5c2bda67a362 | https://ipt.inbo.be/resource?r=lbbg_zeebrugge |
| https://obis.org/dataset/cd6933a8-797e-41f4-94f0-fcd969b6794e | https://ipt.inbo.be/resource?r=medgull_antwerpen |
| https://obis.org/dataset/550b4cc1-c40d-4070-a0cb-26e010eca9d4 | https://ipt.inbo.be/resource?r=o_assen |
| https://obis.org/dataset/c633b0f8-90bb-43f2-8680-65ac26dd8400 | https://ipt.inbo.be/resource?r=o_vlieland |
| https://obis.org/dataset/132cfd6e-097d-4ee4-b737-58a596dcbe27 | https://ipt.inbo.be/resource?r=o_westerschelde |

peterdesmet commented 2 years ago

O_BALGZAND and O_SCHIERMONNIKOOG are now also published. HG_OOSTENDE and O_AMELAND are pending more info from the researchers.

pieterprovoost commented 2 years ago

| OBIS dataset | IPT URL |
| --- | --- |
| https://obis.org/dataset/2c6aa97e-e886-4564-a55a-48e2e506f014 | https://ipt.inbo.be/resource?r=o_balgzand |
| https://obis.org/dataset/01dbc62a-e166-4752-8547-6db4542ec039 | https://ipt.inbo.be/resource?r=o_schiermonnikoog |

peterdesmet commented 2 years ago

@pieterprovoost what data provider is associated with these datasets? E.g. would INBO appear in the "data providers" section at https://obis.org/node/4bf79a01-65a9-4db6-b37b-18434f26ddfc?

pieterprovoost commented 2 years ago

We are currently only using affiliation info from the EML, which is not present for these datasets. A quick fix would be to enter an organisation for at least one of the creators, but I'll look into ORCID integration as well.

peterdesmet commented 2 years ago

I noticed in the EML specs that organisation is actually reserved for when the agent is not a person and that it therefore shouldn't be used for affiliation. That is why I didn't map that as such in my movepub function, but many people of course do use it for affiliation (myself included for other datasets).

pieterprovoost commented 2 years ago

@peterdesmet I have added (basic) ORCID integration to our metadata parser, see the data providers panel at https://obis.org/dataset/550b4cc1-c40d-4070-a0cb-26e010eca9d4. If no organization has been provided in the EML, we'll try to get one from ORCID.
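
The fallback described here could look roughly like the sketch below. This is not the OBIS parser: `provider_for`, `fake_lookup`, and the sample creator are hypothetical, and the ORCID lookup is stubbed out instead of calling a real service. The element names (`creator`, `organizationName`, `userId`) are the ones EML uses.

```python
import xml.etree.ElementTree as ET


def provider_for(creator, orcid_lookup):
    """Prefer organizationName from the EML creator; otherwise ask the ORCID lookup."""
    org = (creator.findtext("organizationName") or "").strip()
    if org:
        return org
    orcid = (creator.findtext("userId") or "").strip()
    return orcid_lookup(orcid) if orcid else None


# Hypothetical creator without an organizationName, identified only by an ORCID.
EML_CREATOR = """
<creator>
  <individualName><givenName>Jane</givenName><surName>Doe</surName></individualName>
  <userId directory="https://orcid.org/">0000-0000-0000-0000</userId>
</creator>
""".strip()


def fake_lookup(orcid):
    """Stand-in for a real ORCID query; returns a made-up affiliation."""
    return {"0000-0000-0000-0000": "Example Institute"}.get(orcid)


creator = ET.fromstring(EML_CREATOR)
print(provider_for(creator, fake_lookup))  # Example Institute
```

Passing the lookup in as a function keeps the EML parsing separate from however the ORCID data is actually retrieved.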

I'll reprocess the other datasets as well.

peterdesmet commented 2 years ago

👏 such cool features!

peterdesmet commented 2 years ago

@pieterprovoost HG_OOSTENDE is now published

pieterprovoost commented 2 years ago

Thanks! https://obis.org/dataset/00cad65a-aa33-4d98-93a2-15155fa963e3

peterdesmet commented 2 years ago

And finally, the last one: O_AMELAND at https://ipt.inbo.be/resource?r=o_ameland

pieterprovoost commented 2 years ago

Done https://obis.org/dataset/3b1da04e-7b8d-4080-ba17-d29909d6d95b