ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Code Table Request - New taxonomy source: common names #7832

Closed by Jegelewicz 2 months ago

Jegelewicz commented 3 months ago

Initial Request

Goal

Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

Better manage and use vernacular or common names that are related to taxon names in Arctos (#5025). Also, develop a path for managing offensive common names (#6040).

Context

Describe why this new value is necessary and existing values are not.

Currently, common names are associated directly with taxon names. This makes them less useful when a collection wants a particular name associated with its records, and it makes them harder to manage in bulk. @ArctosDB/diversity-and-inclusion thinks setting up a common name source might be a good idea and asked @Jegelewicz to set one up in test.

Table

Code Tables are documented at http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

https://arctos.database.museum/info/ctDocumentation.cfm?table=cttaxonomy_source

Proposed Value

Proposed new value. This should be clear and compatible with similar values in the relevant table and across Arctos.

Arctos common names

Proposed Definition

Clear, complete, non-collection-type-specific functional definition of the value. Avoid discipline-specific terminology if possible, include parenthetically if unavoidable.

A source for common names of taxa or organisms. Common names (also known as vernacular names, English names, colloquial names, country names, popular names, or farmer's names) are based on the normal language of everyday life and are often contrasted with the scientific name for the same organism.

https://en.wikipedia.org/wiki/Common_name

Collection type

_Some code tables contain collection-type-specific values. collection_cde may be found from https://arctos.database.museum/home.cfm_

N/A

Attribute Extras

Attribute data type

If the request is for an attribute, what values will be allowed? free-text, categorical, or number+units depending upon the attribute (TBA)

N/A

Attribute controlled values

If the values are categorical (to be controlled by a code table), add a link to the appropriate code table. If a new table or set of values is needed, please elaborate.

N/A

Attribute units

If numerical values should be accompanied by units, provide a link to the appropriate units table.

N/A

Part preservation attribute effect on "tissueness"

If a new part preservation is requested, please add the effect it would have on "tissueness": No Influence, Allows, or Denies

N/A

Priority

Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

Example Data

Requests with clarifying sample data are generally much easier to understand and prioritize. Please attach or link to any representative data, in any form or format, which might help clarify the request.

Available for Public View

Most data are by default publicly available. Describe any necessary access restrictions.

yes

Helpful Actions

@ArctosDB/arctos-code-table-administrators @ArctosDB/diversity-and-inclusion @ArctosDB/taxonomy

Approval

All of the following must be checked before this may proceed.

_The How-To Document should be followed. Pay particular attention to terminology (with emphasis on consistency) and documentation (with emphasis on functionality). No person should act in multiple roles; the submitter cannot also serve as a Code Table Administrator, for example._

Rejection

If you believe this request should not proceed, explain why here. Suggest any changes that would make the change acceptable, alternate (usually existing) paths to the same goals, etc.

  1. Can a suitable solution be found here? If not, proceed to (2)
  2. Can a suitable solution be found by Code Table Committee discussion? If not, proceed to (3)
  3. Take the discussion to a monthly Arctos Working Group meeting for final resolution.

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.
Jegelewicz commented 3 months ago

I was going to set this up in test but it appears test is down? @dustymc

The connection has timed out

An error occurred during a connection to arctos-test.tacc.utexas.edu.

The site could be temporarily unavailable or too busy. Try again in a few moments.
If you are unable to load any pages, check your computer’s network connection.
If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the web.
dustymc commented 3 months ago
  1. I don't like the name, there have been other requests for more specific similar data, much of the current data could be split out to better align with homonyms, etc., etc., suggest 'Arctos common names' to align with existing values.
  2. The definition should make the source clearer (e.g., random stuff we've scooped up from wherever we've found it).
  3. Propping this up beside the current (clearly limited) mechanism is more denormalization than I or anyone else can or should be able to stand. This is a much more powerful pathway; on approval I'll just move the existing data into this source and shut down the legacy name-linked system (which will also save significant CPU).

Test is no longer publicly available, you'll need a tunnel or VPN, @mkoo can point you in the right direction.

Jegelewicz commented 3 months ago

on approval I'll just move the existing data into this source and shut down the legacy name-linked system (which will also save significant CPU).

That is an expected part of this if it moves forward.

Nicole-Ridgwell-NMMNHS commented 3 months ago

I don't work with common names much, but I do have a couple of questions.

  1. Would you have to start individually assigning common name identifications to each specimen, or would these somehow still be linked to taxonomy?
  2. Could you end up running into issues in data entry where a student would enter a common name instead of the taxonomic name?

Jegelewicz commented 3 months ago
  1. They would be linked to taxonomy, just like all the other classifications.
  2. No, these would not be names you could pick via identifications; they would be "classifications" associated with taxon names.

Once I can access test, I will set up an example.

dustymc commented 3 months ago

Would you have to start individually assigning common name identifications to each specimen

No.

or would these somehow still be linked to taxonomy?

Yes (more below).

student would enter a common name instead of the taxonomic name

It'll just error (unless the common name is also a taxon name) - exactly the same thing that'll happen now if a student enters some random taxon term for some reason.

The More Below

There are two ways this can work, and they can overlap to any degree.

  1. What's proposed here - common names in a dedicated classification source that probably nobody will ever prefer - will be a completely lateral move. The common names will be associated with a taxon (same as now), they can be used to find stuff in the 'any...' options (same as now), or they can be specifically searched (same as now, but without the extra clutter/field). Same data, same functionality, a slightly simpler structure, a slightly cheaper query. I don't see any way this won't proceed, nobody's going to have the resources (or interest) to do anything else, it'll just make everything easier for everyone.
  2. You can add common names to your preferred classifications, which would bring them in a bit closer to your records. This would (sorta, probably) be less confusing when (hemi)homonyms are involved, it would display them on your record pages, make them available as record data in the API, and basically still do everything (1) does. You can do this now by managing whatever classification(s) you prefer, no CT request of any kind is necessary (but please - as always - play nice if you share those Sources).
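
To make the two options concrete, here is a minimal sketch under loud assumptions: it models classifications as a plain in-memory mapping, which is not how Arctos stores them, and the function names and the "My Collection Preferred" source are hypothetical. It only illustrates that the same common name can live in a dedicated source (option 1) or in a collection's preferred source (option 2), and that it can be found either by searching every source or by searching one source specifically.

```python
from collections import defaultdict

# Hypothetical store: source name -> taxon name -> list of common names.
# This is an illustration only, not the Arctos schema.
classifications = defaultdict(lambda: defaultdict(list))


def add_common_name(source, taxon, name):
    """Attach a common name to a taxon within one classification source."""
    classifications[source][taxon].append(name)


def find_taxa(term, source=None):
    """Return sorted taxon names with a common name equal to term.

    Matching is case-insensitive. source=None searches every source
    (an 'any' style search); otherwise only the named source is searched.
    """
    wanted = term.casefold()
    sources = [source] if source is not None else list(classifications)
    hits = set()
    for src in sources:
        for taxon, names in classifications.get(src, {}).items():
            if any(n.casefold() == wanted for n in names):
                hits.add(taxon)
    return sorted(hits)


# Option 1: a dedicated common-name source.
add_common_name("Arctos common names", "Turdus migratorius", "American Robin")
# Option 2: a common name kept in a (hypothetical) preferred source.
add_common_name("My Collection Preferred", "Turdus migratorius", "Robin")

print(find_taxa("american robin"))                       # any source
print(find_taxa("Robin", source="Arctos common names"))  # one source only
```

In this sketch the second call returns nothing, because "Robin" exists only in the hypothetical preferred source; that is the practical difference between searching everything and searching one source.
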
Jegelewicz commented 3 months ago

@dustymc would you care if I just made this source in production? I don't think it's going to hurt anything and we can always remove it if it doesn't do what we want.

dustymc commented 3 months ago

@Jegelewicz that seems completely reasonable to me (and one can - sorta - create sources by sending data to globalnames, a lower bar for these might make some sense anyway).

Jegelewicz commented 3 months ago

Here are a few examples:

https://arctos.database.museum/name/Kalmia#ArctosCommonNames

https://arctos.database.museum/name/Turdus%20migratorius#ArctosCommonNames

Right now, this will do absolutely nothing except facilitate search unless someone decides to select it in their list of taxonomy sources. I am wondering, though, if we can use this source to allow CHAS to select the common name they want to display (see the entry that has CHAS:Herb or CHAS:Bird). @dustymc thoughts?

dustymc commented 3 months ago

select the common name they want to display

See (2) in https://github.com/ArctosDB/arctos/issues/7832#issuecomment-2150787931

Jegelewicz commented 3 months ago

You can add common names to your preferred classifications, which would bring them in a bit closer to your records. This would (sorta, probably) be less confusing when (hemi)homonyms are involved, it would display them on your record pages, make them available as record data in the API, and basically still do everything (1) does. You can do this now by managing whatever classification(s) you prefer, no CT request of any kind is necessary (but please - as always - play nice if you share those Sources).

This is really not the workable solution you think it is. It means that we end up with everyone managing their own taxonomy just to get a preferred common name and I don't think we need to go that far. I was thinking that we could use the common name classification in ways beyond discovery.

https://arctos.database.museum/name/Turdus%20migratorius#ArctosCommonNames

Could we set up an identification attribute that pulls from the common name classification? The one above has a class term type = CHAS:Bird with the value American Robin, which could show up in this classification as an identification attribute:

ID attribute type | ID attribute value
collection preferred common name | American Robin

If this is too resource intensive, are there other ways we could do this? I'm looking to keep common names out of the taxonomic sources that include scientific classifications and provide collections with an ability to use preferred common names when needed.

Also - should the various common names go in the no_class or class section of the classification?

dustymc commented 3 months ago

everyone managing their own taxonomy just to get a preferred common name

Just like anyone who wants a "preferred subfamily" would do!

identification attribute that pulls from the common name classification?

This sounds like a very complicated way to produce inconsistent data; skip the complexity and add some attributes if that's the goal.

are there other ways we could do this?

I'm pretty baffled by this whole thing. I'm hearing "this is an important part of the metadata" and "we don't want to treat this like we treat every other important part of the metadata." If common names are something like families, treat them like it and manage them with all other metadata. If they're attributes, treat them like that and don't over-complicate things. If they're something else, spell that out and we'll find an appropriate model for it.

no_class or class

The model doesn't care. If common names should be in the hierarchy they can be; if they shouldn't, that works too. (But I'd read your data to state that Mirlo primavera is superior to/above/a parent of/broader than Merle d'Amérique and I doubt that's the intent.)

Jegelewicz commented 2 months ago

Using the example, I found the American Robins with this search:

https://arctos.database.museum/search.cfm?customoidoper=LIST&tax_trm_1=American%20Robin&tax_src_1=Arctos%20Common%20Names

So for search, if we could make the "Common Name" field only search the Common Name source, that could still be useful and wouldn't require knowledge of where common names are stored.
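
For anyone who wants to script this, here is a rough sketch of how the search link above could be assembled. The parameter names (customoidoper, tax_trm_1, tax_src_1) and their values are copied from that link; the helper name is hypothetical, and whether search.cfm accepts other parameter combinations is not something this thread establishes.

```python
from urllib.parse import quote, urlencode

SEARCH_BASE = "https://arctos.database.museum/search.cfm"


def common_name_search_url(term, source="Arctos Common Names"):
    """Build a search URL restricted to one taxonomy source.

    Hypothetical helper: parameter names are taken from the example link,
    not from any documented API.
    """
    params = {
        "customoidoper": "LIST",
        "tax_trm_1": term,    # the common name to search for
        "tax_src_1": source,  # the taxonomy source to search within
    }
    # quote_via=quote encodes spaces as %20, matching the example link.
    return SEARCH_BASE + "?" + urlencode(params, quote_via=quote)


print(common_name_search_url("American Robin"))
```

A "Common Name" search field that always fills in the source value, as suggested above, would amount to calling something like this with the source fixed.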