DataONEorg / slinky

Slinky, the DataONE Graph Store
Apache License 2.0

Next steps: Slinky as a layer of enhancement on top of static holdings #44

Open amoeba opened 2 years ago

amoeba commented 2 years ago

[I'm filing this not because we're moving on to next steps already but just to file it and let people chime in with ideas]

We can always find ways to improve the metadata we have, but most metadata are written once, possibly checked and tweaked by a moderation team, and then left fixed in stone. What if we could extend the ways Slinky already improves metadata (i.e., co-reference resolution, minting/finding party identifiers) beyond what we're doing now?

I got to thinking about this after one of our recent mobilization calls, and a recent example prompted me to write this ticket. Take the metadata record at https://search.dataone.org/view/urn%3Auuid%3A84f4e415-53c3-55e9-bb6d-3ee34419595d. It's a JSON-LD record from NPDC. The abstract starts:

Data from Polarstern cruise PS94 in the Arctic in 2015 with chief scientist Ursula Schauer.

There are a few really key elements in this free-text description that we could extract into linked data to make for a much richer landing page: (1) Polarstern, (2) PS94, (3) Arctic, (4) 2015, (5) Ursula Schauer, (6) Ursula Schauer as a Chief Scientist, and (7) the role of Chief Scientist.
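To make the idea concrete, here is a rough sketch of what an extracted enhancement layer for that one sentence might look like. Everything in it is an assumption for illustration: the regular expression is tuned to this single abstract (a real pipeline would need proper entity recognition and linking), the function name `extract_enhancements` is hypothetical, and the choice of schema.org terms (`Vehicle`, `Event`, `Role`, and so on) is just one plausible mapping, not something Slinky does today.

```python
import json
import re

ABSTRACT = (
    "Data from Polarstern cruise PS94 in the Arctic in 2015 "
    "with chief scientist Ursula Schauer."
)

# Hypothetical pattern that only fits this one sentence; free-text abstracts
# in general would need NER / entity linking rather than a regex.
PATTERN = re.compile(
    r"Data from (?P<vessel>\w+) cruise (?P<cruise>\w+) in the (?P<region>\w+) "
    r"in (?P<year>\d{4}) with chief scientist (?P<scientist>[^.]+)\."
)


def extract_enhancements(abstract):
    """Return a JSON-LD-style dict of extracted entities, or None if no match."""
    match = PATTERN.search(abstract)
    if match is None:
        return None
    found = match.groupdict()
    return {
        # Assumed vocabulary: schema.org terms, chosen for illustration only.
        "@context": {"@vocab": "https://schema.org/"},
        "@type": "Dataset",
        "temporalCoverage": found["year"],
        "spatialCoverage": {"@type": "Place", "name": found["region"]},
        "about": [
            {"@type": "Vehicle", "name": found["vessel"]},
            {"@type": "Event", "name": found["cruise"]},
        ],
        # schema.org's Role pattern: the role node repeats the property name.
        "contributor": {
            "@type": "Role",
            "roleName": "Chief Scientist",
            "contributor": {"@type": "Person", "name": found["scientist"]},
        },
    }


enhanced = extract_enhancements(ABSTRACT)
print(json.dumps(enhanced, indent=2))
```

Keeping the output as a separate document, rather than editing the record, fits the "layer on top of static holdings" idea: the original metadata stays untouched and the enhancements can be regenerated or discarded.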

Extracting and linking information like this would be a really nice enhancement for a lot of metadata records, but especially our science-on-schema ones which will tend to be more minimal. We might also think about how we preserve any enhancements in our Data Package exports.

Specific things we could build on top:

mpsaloha commented 2 years ago

I hope people are aware that our MOSAIC ontology already captures much of this information in RDF/OWL semantic format, and that this was done deliberately--

e.g. Polarstern "is a large Research Vessel" that "is the Basis for" Campaign PS122/1 "with research location" Arctic Ocean "having starting location" Tromsø and "Chief Scientist" Markus Rex and "hosts" Event PS122/1_1-100 "made by system" Camera VIS_INFRALAN_01 "that has AWI Sensor information" https://hdl.handle.net/10013/sensor.d19ffa2b-b86e-4f40-b4c6-d50aead7cdba "with Sensor" ID 4189
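For anyone who finds the chain easier to read as data, the sentence above can be sketched as plain subject/predicate/object triples. This is only one reading of the prose: the predicate strings are copied from the quoted phrases, but which subject each phrase attaches to is an assumption, and none of these strings are the actual MOSAIC IRIs. The `reachable` helper is likewise hypothetical.

```python
from collections import defaultdict

SENSOR_URL = (
    "https://hdl.handle.net/10013/"
    "sensor.d19ffa2b-b86e-4f40-b4c6-d50aead7cdba"
)

# One possible reading of the description; subject attachment is assumed.
TRIPLES = [
    ("Polarstern", "is a", "large Research Vessel"),
    ("Polarstern", "is the Basis for", "Campaign PS122/1"),
    ("Campaign PS122/1", "with research location", "Arctic Ocean"),
    ("Campaign PS122/1", "having starting location", "Tromsø"),
    ("Campaign PS122/1", "Chief Scientist", "Markus Rex"),
    ("Campaign PS122/1", "hosts", "Event PS122/1_1-100"),
    ("Event PS122/1_1-100", "made by system", "Camera VIS_INFRALAN_01"),
    ("Camera VIS_INFRALAN_01", "has AWI Sensor information", SENSOR_URL),
    (SENSOR_URL, "with Sensor", "ID 4189"),
]


def reachable(start, triples=TRIPLES):
    """Breadth-first walk: every triple reachable by following objects."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))

    seen, queue, found = {start}, [start], []
    while queue:
        node = queue.pop(0)
        for predicate, obj in index.get(node, []):
            found.append((node, predicate, obj))
            if obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return found


for subject, predicate, obj in reachable("Polarstern"):
    print(f"{subject} --{predicate}--> {obj}")
```

Starting from the vessel, the walk reaches the campaign, its chief scientist, the event, and the sensor, which is the kind of linked context a richer landing page could surface.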

Let me know if this wasn't clear from my various descriptions, or from reviewing the Ontology itself.

Mark

On Fri, Sep 17, 2021 at 1:40 PM Bryce Mecum @.***> wrote:
