ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

MSB:Host - ARK allows direct link to Smithsonian #6356

Closed Jegelewicz closed 1 year ago

Jegelewicz commented 1 year ago

@campmlc working on the identifiers I am wondering about this.

https://arctos.database.museum/guid/MSB:Host:11676

includes an ARK which links directly to the Smithsonian Mammal record. This means the part on the MSB:Host record is unnecessary as the part is clearly in the Smithsonian collection.

http://n2t.net/ark:/65665/350fc7c92-14e5-4133-b300-4f07ccd08d3e

This host record isn't linked to any parasites and as far as I can see it never will be (not examined, not detected) but if it were, I would suggest adding the ARK as the host link on the parasite record - no need for the Arctos record which is just a duplicate of the Smithsonian data.

Does that make sense? You guys could reduce the need to manage all of the host observations for anything with an ARK at the Smithsonian which allows you to link directly to the Smithsonian data.

THIS could be a great paper/presentation on how stable identifiers can actually work. Because Smithsonian has added ARKs, we can link directly to their records (perhaps even including media tags?). The problem is that they (apparently) cannot reciprocate. We could also clean up a lot of unnecessary observation records, reducing your need to manage data for stuff you don't actually have.

I'm happy to help formulate a plan and get this fixed up.
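For readers unfamiliar with ARKs, the link above can be read as a resolver address, a name assigning authority number (NAAN), and a record name. The snippet below is only a minimal sketch that assumes the common `ark:/NAAN/Name` layout; `split_ark` is a hypothetical helper for illustration and is not part of Arctos or of any Smithsonian tooling.

```python
def split_ark(url: str) -> tuple[str, str, str]:
    """Split an ARK URL into (resolver, naan, name).

    Assumes the 'ark:/NAAN/Name' layout; this helper is hypothetical.
    """
    marker = "ark:"
    idx = url.find(marker)
    if idx == -1:
        raise ValueError(f"no ARK found in {url!r}")
    resolver = url[:idx]                            # everything before 'ark:'
    rest = url[idx + len(marker):].lstrip("/")      # drop the optional leading slash
    naan, _, name = rest.partition("/")             # NAAN is the first path segment
    return resolver, naan, name


resolver, naan, name = split_ark(
    "http://n2t.net/ark:/65665/350fc7c92-14e5-4133-b300-4f07ccd08d3e"
)
print(resolver, naan, name)
```

The point of the split is that only the `ark:` portion identifies the record; the resolver in front of it could in principle change without changing the identifier.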

campmlc commented 1 year ago

Adding the ARK was a proof of concept, and it is certainly worth developing. It would be great if we could set up a time/committee to figure out ways to deal with and link these identifiers. Let's think about this during or after next week's meeting. Using these features and making them discoverable also depends on fixing the UI issues that are priority for our institution and others.

(Sent by email in reply to Teresa Mayfield-Meyer's comment of Fri, May 26, 2023, 11:01 AM.)

dustymc commented 1 year ago

> Does that make sense?

Yes but no??

> The problem is that they (apparently) cannot reciprocate.

Ain't our problem though, is it? A user getting data from Arctos gets the full picture, a user getting data from USNM does not - we've done what we can and don't have the resources to do it for them too.

USNM shares with GBIF, and I'm pretty sure they include those fabulous identifiers in that (but I can't find them: https://www.gbif.org/occurrence/search?q=http:~2F~2Fn2t.net~2Fark:~2F65665~2F350fc7c92-14e5-4133-b300-4f07ccd08d3e ??). Arctos also shares actionable identifiers with GBIF, and GBIF has an API and some folks who seem keen to make awesome things happen. Maybe this is an opportunity for an elegant solution to a longstanding problem, and one which extends well beyond Arctos.

Jegelewicz commented 1 year ago

One of us needs to be at the GBIF workshop on Monday to pass this along.

> USNM shares with GBIF, I'm pretty sure they include those fabulous identifiers in that (but I can't find them, https://www.gbif.org/occurrence/search?q=http:~2F~2Fn2t.net~2Fark:~2F65665~2F350fc7c92-14e5-4133-b300-4f07ccd08d3e ??

They share it as an occurrence id, so not sure why you got nothing?


Here is the record at GBIF

https://www.gbif.org/occurrence/1319033595

Which I found by searching the USNM extant collection plus the catalog number (and had to pick it out among several possibilities). The fact that you can't search all occurrences for that url and get to the record is concerning....
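As a rough illustration of the lookup that failed in the web UI, the GBIF occurrence search API documents an `occurrenceId` filter, so a direct query by identifier could be built along these lines. This is a sketch only: it assumes USNM publishes the full `http://n2t.net/ark:/...` URL as the occurrence ID, which has not been checked here, and `gbif_occurrence_id_query` is a hypothetical helper. It only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# Public GBIF occurrence search endpoint.
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def gbif_occurrence_id_query(occurrence_id: str, limit: int = 5) -> str:
    """Build a GBIF search URL filtered on occurrenceId (hypothetical helper)."""
    # urlencode percent-encodes the ':' and '/' characters inside the ARK URL.
    params = {"occurrenceId": occurrence_id, "limit": limit}
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


# Assumption: the occurrence ID shared with GBIF is the resolver URL form of the ARK.
ark = "http://n2t.net/ark:/65665/350fc7c92-14e5-4133-b300-4f07ccd08d3e"
print(gbif_occurrence_id_query(ark))
```

If a query like this returns the record while the free-text `q=` search does not, that would point to how the identifier is indexed for search and not to missing data.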

dustymc commented 1 year ago

Cool, thanks, they do share good IDs so this is a "just UI" (or "just API" - or possibly "I just didn't RTFM"...) problem that GBIF could solve. IDK how far that'd go now (some ways!), but what we can share via GUM is capable of answering some very deep multi-record questions. That seems fundable, maybe Jorrit would want to be involved (I think it's just a deeper dig in the direction he's already pointed)?

campmlc commented 1 year ago

Yes, another reason for Teresa to attend the Tucson conference in two weeks...


dustymc commented 1 year ago

Done?

campmlc commented 1 year ago

Reopening to address this issue:

> This host record isn't linked to any parasites and as far as I can see it never will be (not examined, not detected) but if it were, I would suggest adding the ARK as the host link on the parasite record - no need for the Arctos record which is just a duplicate of the Smithsonian data.
>
> Does that make sense? You guys could reduce the need to manage all of the host observations for anything with an ARK at the Smithsonian which allows you to link directly to the Smithsonian data.

No, we need the host record, especially for cases like these where the host is in an external, non-Arctos repository. Having the host record allows for parasite/host linkages within Arctos: capturing host attributes such as examined/detected for prevalence, attaching media such as the Rausch ledger, and, most importantly, enabling searches within Arctos on host/parasite relationships based on higher classification and geography. None of this is possible with just a link to an external resource. This is explained in detail in #6249, and I would very much like to have a discussion so that everyone truly understands the model we developed when we created the parasite and host collections in Arctos with NSF funding.

Jegelewicz commented 1 year ago

I don't understand why this needs to be open. MSB is doing what they are doing and the comment above was merely a suggestion.

dustymc commented 1 year ago

> don't understand why this needs to be open

This should not be an open Issue unless there's some definable action item (or something that might become one). If that exists then it needs to be spelled out. If it does not, this is an impediment to organization and should be closed.

campmlc commented 1 year ago

I reopened the issue because we need to document the broader issue of how to deal with linkages of parasites and hosts across multiple external collections. Related to #6249 and #5135.