ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

associatedTaxa for host associations #2597

Open seltmann opened 4 years ago

seltmann commented 4 years ago

I am working with Evin Dunn and Neil Cobb on the Parasite Tracker NSF project. We are making some changes to Symbiota (SCAN) to better support species interactions and dwc:associatedOccurrences. As part of this project, we are integrating data from many platforms (Arctos, Specify, KEmu, etc.). One thing I noticed is that Arctos does not fill out the host association information in dwc:associatedTaxa, but does have this information (verbatimHostId) in dwc:dynamicProperties.

My request/question is whether you would consider also adding the information to dwc:associatedTaxa. It is the Darwin Core field where people will most likely look for associated information, and the data can be included as a key:value pair that gives the relationship and the verbatim host name.

Thank you for your consideration, Katja
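
Until the host is also published in dwc:associatedTaxa, an aggregator has to pull it out of dwc:dynamicProperties. A minimal Python sketch of that lookup follows. It assumes the field is serialized as a JSON object and that the key is spelled `verbatimHostId` as written above; the actual Arctos export format may differ, and the sample value is made up for illustration.

```python
import json
from typing import Optional


def host_from_dynamic_properties(dynamic_properties: str) -> Optional[str]:
    """Return the verbatim host name, or None if absent or unparseable.

    Assumption: dynamicProperties is a JSON object with a `verbatimHostId`
    key. This is not a confirmed description of the Arctos export.
    """
    if not dynamic_properties:
        return None
    try:
        props = json.loads(dynamic_properties)
    except json.JSONDecodeError:
        return None
    if not isinstance(props, dict):
        return None
    value = props.get("verbatimHostId")
    if value is None:
        return None
    value = str(value).strip()
    return value or None


# Made-up example value, for illustration only.
sample = '{"verbatimHostId": "Peromyscus maniculatus", "weight": "21 g"}'
print(host_from_dynamic_properties(sample))
```

Every consumer would need its own version of this lookup, which is the discoverability problem described above.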

campmlc commented 4 years ago

Hi Katja, We can certainly look into this. Also, please be aware that Arctos primarily makes associated taxa available through relationships, which go to GBIF as associated occurrences, as in https://www.gbif.org/occurrence/1145354227.

seltmann commented 4 years ago

Hi @campmlc! I am aware of that, and I am not advocating replacing any present functionality or pipeline. I do see an opportunity here to make the data easier to discover for a larger community of folks, especially since the information already exists in other Darwin Core fields. Thanks for looking into this.

campmlc commented 4 years ago

That certainly seems reasonable. We would ideally want to make both discoverable, with the relationship info taking priority: the verbatim host ID is not standardized or taxonomically updated, may include misspellings or outdated or common names, and is much less reliable as a source of association data than the info coming from relationships/associated occurrences. Also, not everything that has an associated occurrence has a recorded verbatim host ID. Is there any way to flag one source as preferred?

campmlc commented 4 years ago

@dustymc @dbloom

seltmann commented 4 years ago

Well, it is of course up to what y'all think is best, but I was thinking:

relationship:verbatimHostName in the dwc:associatedTaxa field. Everything else in the Arctos records would remain exactly the same.
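
As a rough illustration of the `relationship:verbatimHostName` shape proposed here, a small Python sketch follows. The function name and the `host` relationship label are hypothetical and not taken from Arctos or Symbiota; only the key:value layout comes from this thread.

```python
def associated_taxa_entry(relationship: str, verbatim_host_name: str) -> str:
    """Format one associatedTaxa entry as `relationship:verbatimHostName`.

    The verbatim name is passed through unchanged apart from trimming
    surrounding whitespace; no spelling or taxonomy cleanup is attempted.
    """
    relationship = relationship.strip()
    name = verbatim_host_name.strip()
    if not relationship or not name:
        raise ValueError("both relationship and verbatim host name are required")
    return f"{relationship}:{name}"


# Hypothetical relationship label and made-up host name.
print(associated_taxa_entry("host", "Peromyscus maniculatus"))
# prints: host:Peromyscus maniculatus
```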

campmlc commented 4 years ago

Can you describe how the TPT would deal with a situation where there is conflict between the verbatim host ID and the relationship to a cataloged associated occurrence? Would it be possible to prioritize the latter?

seltmann commented 4 years ago

We do not prioritize things; rather, the host value would come into SCAN in the associatedTaxa field, and the associated occurrence would exist in the associatedOccurrences field.

I attached screenshots of an MPM arthropod record showing how it is being included in SCAN/TPT. The record is linked below.

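[image: Screen Shot 2020-04-17 at 11 56 24 AM] https://user-images.githubusercontent.com/1044474/79604835-af761f00-80a3-11ea-8e5d-54201a27308f.png

[image: Screen Shot 2020-04-17 at 11 54 52 AM] https://user-images.githubusercontent.com/1044474/79604845-b3a23c80-80a3-11ea-97ed-95c3d81418e5.png

Example record: https://scan-bugs.org/portal/collections/editor/occurrenceeditor.php?csmode=0&occindex=10&occid=41632648&collid=235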

campmlc commented 4 years ago

That's helpful, thanks!

campmlc commented 4 years ago

Following up on this, @dustymc @dbloom what do we need to do to publish values for the verbatim host ID attribute to dwc:associatedTaxa? These data are currently published to dwc:dynamicProperties.

Jegelewicz commented 2 years ago

Returning to this since recordedByID is currently a fail.

@dbloom says adding new dwc fields shouldn't break anything, and this seems like it might be kinda easy?

campmlc commented 2 years ago

This would be good to prioritize for NSF-funded collaborations with SCAN and TPT.

Jegelewicz commented 2 years ago

Request:

publish values in the verbatim host ID attribute to dwc:associatedTaxa. These data are currently published to dwc:dynamicProperties.

Jegelewicz commented 2 years ago

BUT, other things belong here too?

ASSOCIATED_SPECIES associatedTaxa,

is currently what appears there. How do we make it clear what is host and what is just nearby?

Suggest leaving associated_species as is, but adding the verbatim host attribute along with all of its metadata.

dustymc commented 2 years ago

This does not seem correct to me. We map ASSOCIATED_SPECIES to dwc:associatedTaxa, and I believe the intent of both is "growing near oaks" or "hanging out with ducks" - something a lot less specific than even the verbatim host data.

Jegelewicz commented 2 years ago

It isn't correct, but is the best most users of our aggregated data can do and we want to be team players, so....

Jegelewicz commented 7 months ago

This needs to be a concatenation of associated species and [verbatim host ID](https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type#verbatim_host_id)?

Added to working document.
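
A sketch of what that concatenation might look like, in Python for illustration only. The function name, the `host` label, and the ` | ` separator are assumptions here (Darwin Core recommends space, vertical bar, space between list items); this is not the actual Arctos export code. Labeling the host entry is one possible way to distinguish "host" from "just nearby".

```python
from typing import Optional

SEPARATOR = " | "  # assumed list separator, per the Darwin Core recommendation


def build_associated_taxa(
    associated_species: Optional[str],
    verbatim_host_id: Optional[str],
    host_label: str = "host",  # hypothetical label for the host entry
) -> str:
    """Concatenate associated species and a labeled verbatim host ID.

    The associated_species text is kept as is; the host value gets a
    `host:` prefix so consumers can tell it apart from nearby taxa.
    """
    parts = []
    if associated_species and associated_species.strip():
        parts.append(associated_species.strip())
    if verbatim_host_id and verbatim_host_id.strip():
        parts.append(f"{host_label}:{verbatim_host_id.strip()}")
    return SEPARATOR.join(parts)


# Made-up values for illustration.
print(build_associated_taxa("Quercus sp.", "Peromyscus maniculatus"))
# prints: Quercus sp. | host:Peromyscus maniculatus
```

Records with only one of the two values yield just that value, and records with neither yield an empty string.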