iodepo / odis-arch

Development of the Ocean Data and Information System (ODIS) architecture
https://book.oceaninfohub.org/

validate OBIS relay of GBIF records to ODIS #426

Open jmckenna opened 1 month ago

jmckenna commented 1 month ago

Initial review looks good (valid JSON-LD of type Dataset is embedded); the records are at least ready to index into ODIS so that we can provide feedback.

related to: https://github.com/iodepo/odis-in/issues/4 (create Taxon pattern for ODIS)

cc @timrobertson100 @pbuttigieg

pbuttigieg commented 1 month ago

Thanks @jmckenna

Tagging @pieterprovoost too.

We'll explore some potential issues with the Type modelling and content in this issue, but overall the entries look good off the bat.

One major objective is to use additionalProperty properties and PropertyValue types to embed any important (i.e. high potential to boost discoverability) Darwin Core properties that aren't already covered by more generic schema.org types and properties.
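As a rough illustration of the additionalProperty / PropertyValue approach described above, here is a minimal Python sketch that builds a schema.org Dataset record carrying Darwin Core terms. The helper names (`dwc_property`, `build_dataset`), the example dataset name and URL, and the choice of terms are all hypothetical; the only assumptions taken from the wider ecosystem are the schema.org `PropertyValue` properties (`name`, `propertyID`, `value`) and the Darwin Core term namespace `http://rs.tdwg.org/dwc/terms/`. This is not the pattern OBIS or GBIF actually publish, just one way it could look.

```python
import json

# Darwin Core term namespace (TDWG); used to build a propertyID IRI per term.
DWC_NS = "http://rs.tdwg.org/dwc/terms/"


def dwc_property(term, value):
    """Wrap one Darwin Core term/value pair as a schema.org PropertyValue."""
    return {
        "@type": "PropertyValue",
        "name": term,
        "propertyID": DWC_NS + term,
        "value": value,
    }


def build_dataset(name, url, dwc_terms):
    """Build a schema.org Dataset dict with DwC terms under additionalProperty."""
    return {
        "@context": {"@vocab": "https://schema.org/"},
        "@type": "Dataset",
        "name": name,
        "url": url,
        "additionalProperty": [
            dwc_property(term, value) for term, value in dwc_terms.items()
        ],
    }


# Hypothetical example record; the name, URL and values are placeholders.
record = build_dataset(
    name="Example marine occurrence dataset",
    url="https://example.org/dataset/1",
    dwc_terms={"scientificName": "Delphinus delphis", "basisOfRecord": "HumanObservation"},
)

print(json.dumps(record, indent=2))
```

Keeping each Darwin Core term in its own PropertyValue, with the term IRI in `propertyID`, lets consumers that do not understand DwC still index the name/value pairs, while DwC-aware consumers can match on the IRI.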

If OBIS and GBIF can do this in a similar way, we're laying a strong foundation for biodiversity data flows across ODIS and related federations.

This will also be relevant for INSDC flows, EOV/EBV data hubs, and TreatmentBank (https://github.com/iodepo/odis-arch/issues/407), among several others.

pbuttigieg commented 1 month ago

@fils @jmckenna this may require a system-system interoperability bridge:

timrobertson100 commented 1 month ago

If GBIF can prep a marine/coastal subset of their records in a sitemap, we can use the Node workflow.

If it helps in your planning: realistically, this is unlikely to happen anytime soon, simply due to workload.

I wonder what the goal is here. I would expect any subsetting of GBIF to marine-relevant datasets to arrive at (give or take*) the same datasets that you'd find in OBIS. It may not be well known, but GBIF and OBIS operate with a common network of data repositories (IPT installations), so datasets are discoverable through both search "portals", with OBIS indexing the marine-specific ones. If the goal is to include the marine-related datasets registered in GBIF, I think OBIS is a very sensible route to that aim.

I hope this helps.

*we're working on catching the outliers to ensure all GBIF-registered marine data is discoverable in OBIS

pieterprovoost commented 1 month ago

As the OBIS and GBIF holdings are currently only partially intersecting, relying solely on OBIS will result in missing marine GBIF datasets in ODIS. However, we have procedures for identifying marine datasets in GBIF, so we could produce a marine GBIF sitemap for the time being.
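For illustration only, a sitemap of this kind could be generated along the following lines. This is a minimal Python sketch using the standard sitemaps.org XML format; the function name `build_sitemap`, the landing-page URLs and the dates are placeholders, and it says nothing about how OBIS actually identifies marine datasets in GBIF.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a sitemap XML string from (loc, lastmod) pairs.

    lastmod may be None, in which case the element is omitted.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder landing pages for datasets flagged as marine (hypothetical).
entries = [
    ("https://example.org/dataset/1", "2024-01-15"),
    ("https://example.org/dataset/2", None),
]

sitemap = build_sitemap(entries)
print(sitemap)
```

Each `loc` would point at a dataset landing page that embeds the JSON-LD, so a harvester can walk the sitemap and pick up one record per page.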

pbuttigieg commented 1 month ago

@pieterprovoost that would be great, and another value added by OBIS. It is also good to have control of this resource in the hands of the marine biodiversity data domain, whose members are best placed to decide on its form and content.

In the ODIS and WorldFAIR/CDIF model, this would be a good example of elements in a digital ecosystem augmenting one another.

@timrobertson100 if OBIS can handle this for you, then the systems in ODIS Federation will interoperate with GBIF holdings. So the WorldFAIR mission is accomplished through a brokerage model (OBIS being the broker). I'll update WorldFAIR deliverable 11.3 to reflect this.

That being said, I strongly encourage GBIF to share improved JSON-LD/schema.org to be more globally interoperable. Biodiversity data is still siloed from many other domains that don't speak DwC. Fortunately, OBIS has already figured out most of the implementation (we just need to embed DwC fields into additionalProperty fields).

pbuttigieg commented 3 weeks ago

@pieterprovoost - I'm retitling this issue to reflect the new status, and assigning you.