ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Code Table Request - USNPC: United States National Parasite Collection #5995

Open dustymc opened 1 year ago

dustymc commented 1 year ago

Instructions

This is a template to facilitate communication with the Arctos Code Table Committee. Submit a separate request for each relevant value. This form is appropriate for exploring how data may best be stored, for adding vocabulary, or for updating existing definitions.

Reviewing documentation before proceeding will result in a more enjoyable experience.


Initial Request

Goal: Describe what you're trying to accomplish. This is the only necessary step to start this process. The Committee is available to assist with all other steps. Please clearly indicate any uncertainty or desired guidance if you proceed beyond this step.

All USNPC: United States National Parasite Collection should be replaced with other ID type = other identifier and issued by agent U. S. National Parasite Collection
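As a rough illustration of the change requested above, the sketch below rewrites one identifier record. The field names `id_type`, `id_value`, and `issued_by` are hypothetical and are not taken from the Arctos schema; only the three string values come from this request.

```python
# Hypothetical record layout; only the string constants come from this Issue.
OLD_TYPE = "USNPC: United States National Parasite Collection"
NEW_TYPE = "other identifier"
ISSUER = "U. S. National Parasite Collection"


def migrate_identifier(identifier: dict) -> dict:
    """Return a copy of the record with the old ID type replaced by
    the generic type plus an explicit issuing agent."""
    migrated = dict(identifier)
    if migrated.get("id_type") == OLD_TYPE:
        migrated["id_type"] = NEW_TYPE
        migrated["issued_by"] = ISSUER
    return migrated


example = {"id_type": OLD_TYPE, "id_value": "49356"}
print(migrate_identifier(example))
```

The identifier value itself is untouched; only the type and the issuing agent change, which is the point of the request.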

Proposed Value: Proposed new value. This should be clear and compatible with similar values in the relevant table and across Arctos.

Proposed Definition: Clear, complete, non-collection-type-specific functional definition of the value. Avoid discipline-specific terminology if possible, include parenthetically if unavoidable.

Context: Describe why this new value is necessary and existing values are not.

Table: Code Tables are http://arctos.database.museum/info/ctDocumentation.cfm. Link to the specific table or value. This may involve multiple tables and will control datatype for Attributes. OtherID requests require BaseURL (and example) or explanation. Please ask for assistance if unsure.

Collection type: Some code tables contain collection-type-specific values. collection_cde may be found from https://arctos.database.museum/home.cfm

Priority: Please describe the urgency and/or choose a priority-label to the right. You should expect a response within two working days, and may utilize Arctos Contacts if you feel response is lacking.

Available for Public View: Most data are by default publicly available. Describe any necessary access restrictions.

Project: Add the issue to the Code Table Management Project.

Discussion: Please reach out to anyone who might be affected by this change. Leave a comment or add this to the Committee agenda if you believe more focused conversation is necessary.

Approval

All of the following must be checked before this may proceed.

The How-To Document should be followed. Pay particular attention to terminology (with emphasis on consistency) and documentation (with emphasis on functionality).

Rejection

If you believe this request should not proceed, explain why here. Suggest any changes that would make the change acceptable, alternate (usually existing) paths to the same goals, etc.

  1. Can a suitable solution be found here? If not, proceed to (2)
  2. Can a suitable solution be found by Code Table Committee discussion? If not, proceed to (3)
  3. Take the discussion to a monthly Arctos Working Group meeting for final resolution.

Implementation

Once all of the Approval Checklist is appropriately checked and there are no Rejection comments, or in special circumstances by decree of the Arctos Working Group, the change may be made.

Review everything one last time. Ensure the How-To has been followed. Ensure all checks have been made by appropriate personnel.

Make changes as described above. Ensure the URL of this Issue is included in the definition.

Close this Issue.

DO NOT modify Arctos Authorities in any way before all points in this Issue have been fully addressed; data loss may result.

Special Exemptions

In very specific cases and by prior approval of The Committee, the approval process may be skipped, and implementation requirements may be slightly altered. Please note here if you are proceeding under one of these use cases.

  1. Adding an existing term to additional collection types may proceed immediately and without discussion, but doing so may also subject users to future cleanup efforts. If time allows, please review the term and definition as part of this step.
  2. The Committee may grant special access on particular tables to particular users. This should be exercised with great caution only after several smooth test cases, and generally limited to "taxonomy-like" data such as International Commission on Stratigraphy terminology.

dustymc commented 1 year ago

I will plan on proceeding with this about 2023-03-28 if there are no objections.

I will proceed immediately upon approval of each of the involved collections.

Data: temp_usnpc_united_states_national_parasite_collection.csv.zip

Summary:

guid_prefix   numrecs
MSB:Host      3
MSB:Mamm      55
MSB:Para      42
UAM:Mamm      8
USNPC:Para    814

Users: @msbparasites @campmlc @amgunderson @AdrienneRaniszewski @jldunnum

See also https://github.com/ArctosDB/arctos/issues/5771

campmlc commented 1 year ago

This needs further discussion, as the USNPC catalog numbers belong to type specimens that are linked in multiple formats to hosts and parasites in multiple collections in Arctos and in external institutions. Furthermore, the USNPC specimens were transferred to the Smithsonian and recataloged with new numbers across multiple divisions. I support adding the issued by metadata, but I do not support merging this ID with other identifier at this time. If a merger does occur, I propose that the ID type be catalog number. But affected collections should arrange time to discuss this so that relationships and searches on type specimens and symbiotypes are not affected.

campmlc commented 1 year ago

@gracz-UNL

msbparasites commented 1 year ago

I am happy to participate in that. We could also discuss the USNM (Smithsonian) numbers.

dustymc commented 1 year ago

transferred to the Smithsonian

Sortamaybe related: https://github.com/ArctosDB/arctos/issues/5523

msbparasites commented 1 year ago

I am trying to understand the yays and nays of making these other identifiers. One of the concerns by Mariel is that there are links, for example in parasites, we have MSB worms that came from the same individual host as worms that the USNPC (same for USNM) has. And those records are linked. And in this case it is referenced as "same lot as". Does removing the USNPC label to other identifier, compromise that relationship? Either by links or visualizing it or searching for relationships of "same lot as"? I think I don't understand why it is ok for some and not others, and if it is operator or losing ease of searching. This can also be answered later during a more detailed discussion.

campmlc commented 1 year ago

I agree this is what we need to confirm. We have also discussed using "entities" for parasite/host records scattered across institutions, and we need to leave that option open. It may be a simple interface or terminology change that is needed, but that must be resolved first.


dustymc commented 1 year ago

understand the yays and nays

So say we all!

removing the USNPC label to other identifier, compromise that relationship?

No.

(But there may be details, such as those mentioned, to work out - this might be a good candidate for testing.)

msbparasites commented 1 year ago

Fantastic. Maybe Mariel, Teresa and I can all be in same room when the time comes. Might save messaging.....

campmlc commented 1 year ago

https://github.com/ArctosDB/arctos/issues/6004

campmlc commented 1 year ago

@dustymc @msbparasites I would like to use this opportunity to fix a long-standing problem with the USNPC numbers. The actual catalog numbers that are associated with the types in the literature are in the USNPC:Para portal as other identifiers, but they really should be the USNPC:Para catalog numbers. We didn't realize at the time all that would happen with these type specimens getting sent off to the Smithsonian and split up into different collections. A search of USNPC at GBIF yields only a couple records of trematodes that are cited in a paper. On the USNM Invertebrate Zoology search page, they have now put up an option to search for "USNPC" "legacy numbers" - but nothing I searched on yielded anything. And these "legacy numbers" are actually the catalog numbers for the type specimens that are cited in all the literature. For this reason, I would like to have the USNPC:Para numbers correspond to the original "legacy" catalog numbers prior to the move of the collection to USNM. @dustymc - possible, and if so, can I send you a lookup file? If this can be accomplished, then we can directly link the USNPC:Para urls to many other records in Arctos that currently reference the legacy catalog numbers as "United States National Parasite Collection" identifiers.

campmlc commented 1 year ago

Potential complication with the above request: there are some original catalog numbers that use decimal suffixes, e.g. 36944.01 and .02. And the two different suffixes are different species. I supposed I could give them one catalog number, 36944, and then two different equally valid IDs, all sharing collecting event and host info - and use part attributes to separate out the two slides? Other ideas? @Jegelewicz

campmlc commented 1 year ago

Note that these numbers will not need to be numeric for future incrementation - this will be a static collection.

dustymc commented 1 year ago

supposed I could give them one catalog number,

I've been lost in this mess for quite some time, but it seems like the absolute LAST thing these need is yet another identifier.

Even ignoring the weird nonnumerics (which are possible but will limit access to tools) these have been cataloged for quite some time and putting another bend in the already-convoluted path between data and publications seems right up there as far as Worst Practices (tm) goes.

I'd recommend redirects. I did this:

[Screenshot 2023-03-17 at 1:25:58 PM]

and now https://arctos-test.tacc.utexas.edu/guid/USNPC:Para:10.1.2.3.00000000000000 (simulation of a previous identifier, however weird) does stuff

Or let me know where I got lost with some details if that's not it....
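How the redirect was configured in Arctos is not shown in this thread. Purely as a generic sketch of the idea, a redirect can be thought of as a table from a previous identifier to the current one, so the existing catalog number stays stable while older or alternate forms still resolve. The target `USNPC:Para:1` below is a hypothetical placeholder, not the record actually used in the test.

```python
# Generic redirect table: previous identifier -> current GUID.
# The target value is a hypothetical placeholder for illustration only.
REDIRECTS = {
    "USNPC:Para:10.1.2.3.00000000000000": "USNPC:Para:1",
}


def resolve_guid(guid: str, redirects: dict = REDIRECTS) -> str:
    """Follow redirects until a GUID with no further redirect is reached."""
    seen = set()
    while guid in redirects:
        if guid in seen:
            raise ValueError(f"redirect loop at {guid}")
        seen.add(guid)
        guid = redirects[guid]
    return guid


print(resolve_guid("USNPC:Para:10.1.2.3.00000000000000"))
```

With this approach nothing already cited is broken: the old form keeps working and the current GUID is never rewritten.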

msbparasites commented 1 year ago

Just FYI, on the Smithsonian invertebrate zoology database, when you go to the search tab "search by field" there are six other tabs, and the last one is to search a USNPC catalog number. That is what I have used to find Rausch worms. Also, I am not sure what you mean by shipped off to different collections? They are all in the Invertebrate Zoology collection, but they are physically housed in different places. I don't know if that matters, but at least all the USNPC numbers that I have searched for, I have found, IF of course, I knew the USNPC#. And I think at least in the MSB Para records, I have both listed, USNPC and USNM

msbparasites commented 1 year ago

If something is a different species, then it should not get the same Catalog #? I think the way the USNPC did things was strange because the number was the accession and the decimal reflected the number of slides in that accession, regardless of spp.?

campmlc commented 1 year ago

@dustymc what I am trying to do is to "undo" the mistake of having added yet another identifier that means nothing. When we cataloged the Rausch specimens in Arctos, they were autoloaded into USNPC:Para starting with "1" and going to "811" because that's how many records there were, not because those numbers meant anything. The actual USNPC: United States National Parasite Collection numbers are what I would like to associate with the records. But unfortunately, that includes things that USNPC used like 36944.01 and 36944.02 which are different species. My guess is we did this because of this problem - and it may be why the SI did the same thing, giving them yet another series of USNM numbers. I would like to have our USNPC:Para collection publish to GBIF, which it is currently not because of this problem. Once there, we can try to see about linking our GBIF records with whatever the SI eventually publishes there.

campmlc commented 1 year ago

@msbparasites I'm glad you have the USNM numbers in MSB:Para. If we can get the USNPC:Para numbers set up correctly, we can link directly from the MSB:Para records to the correct USNPC:Para records by the actual USNPC catalog number, and not by randomly assigned things like "USNPC:Para:1" that, if anything, are incorrect as they imply this is the same as the legacy USNPC 1 catalog number, which it is not.

campmlc commented 1 year ago

@msbparasites when I search by field for any of the USNPC numbers in my list, nothing shows? Try 49356

campmlc commented 1 year ago

@dustymc @Jegelewicz to my knowledge this USNPC catalog has not been published to VertNet or GBIF - or they are not searchable from those portals. So fixing this now should not break anything, as no one has any idea to search for these in the Arctos portal, so the current numbers should not have ever been used or cited. Once the numbers are fixed, we can make the correct linkages between Rausch hosts, parasites in the other Arctos collections and the USNPC type records, which include the primary data directly from USNPC (we were given a download from their database).

campmlc commented 1 year ago

So if non-numeric catnums are possible, we could proceed with this.

campmlc commented 1 year ago

Here is the proposed conversion file. Note that I had already done a test conversion on one record, USNPC:Para:36944, to an integer value to see if it was possible. It needs conversion to the non-numeric USNPC:Para:36944.02. The related entry, USNPC:Para:660, needs conversion to USNPC:Para:36944.01.

usnpc lookup for conversion.zip

If there is a better way of doing this, please let me know.
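The contents of the lookup file are not shown here. As a minimal sketch of how such a conversion could be checked before it is applied, assume a two-column CSV with the hypothetical headers `current_catnum` and `legacy_catnum`. The two sample rows are the examples given above. Catalog numbers are kept as strings so that decimal suffixes such as `.01` survive unchanged.

```python
import csv
import io

# Hypothetical layout for the lookup file; the two rows are the examples above.
SAMPLE_LOOKUP = "current_catnum,legacy_catnum\n36944,36944.02\n660,36944.01\n"


def load_lookup(csv_text: str) -> dict:
    """Read current -> legacy catalog numbers, keeping both as strings."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["current_catnum"].strip(): row["legacy_catnum"].strip()
            for row in reader}


def duplicate_targets(lookup: dict) -> list:
    """Return legacy numbers that more than one record would receive."""
    seen, dupes = set(), set()
    for legacy in lookup.values():
        if legacy in seen:
            dupes.add(legacy)
        seen.add(legacy)
    return sorted(dupes)


def convert(catnum: str, lookup: dict) -> str:
    """Return the legacy number, or the input unchanged if it is not listed."""
    return lookup.get(catnum, catnum)


lookup = load_lookup(SAMPLE_LOOKUP)
print(duplicate_targets(lookup))  # any output here needs resolving first
print(convert("660", lookup))
```

Running the duplicate check first matters because two records receiving the same catalog number would collide within one collection.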

msbparasites commented 1 year ago

49356

Is this a USNPC # that is in a paper? And it is not a RLR#? I don't find it, where is it from? I can't find it in Arctos, unless I am doing something wrong....

campmlc commented 1 year ago

https://arctos.database.museum/guid/USNPC:Para:49356

campmlc commented 1 year ago

I think there is something wrong with the SI invert collection web search page, because I haven't been able to get it to return any values even when I search on a USNM catalog number.

campmlc commented 1 year ago

@msbparasites I just gave you USNPC access! Not sure why you weren't on there in the first place.

msbparasites commented 1 year ago

yay!!! Yea, feels like something is wrong with SI, however, if you just search that number, you get other invert taxa, not parasites. I read the paper looking for those numbers, and it lists both 49355 (adult) and 49356 (larva) but what is odd is that he did not actually call it a holotype, which I think according to the code...means it cannot be listed as that.

If I search on the USNM # 1348186 I get the record for 49355 and here is how they have the USNPC # 049355.00 and NOW it is coming back to me, I always forget that I have to actually put the '0' in front of the number (049355) because when you do, you get the record you are looking for
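The leading-zero mismatch described above can be sketched as a normalization step applied before two USNPC numbers are compared. This is only an illustration: the function name is hypothetical, and treating an all-zero suffix such as `.00` as "no suffix" is an assumption, not something either database documents in this thread.

```python
def normalize_usnpc(number: str) -> str:
    """Normalize a USNPC number for comparison.

    Assumptions (not confirmed by either database): leading zeros carry
    no meaning, and an all-zero decimal suffix such as ".00" means there
    is no suffix. Non-zero suffixes such as ".01" are kept as written.
    """
    whole, _, suffix = number.strip().partition(".")
    whole = whole.lstrip("0") or "0"
    if suffix.strip("0"):
        return f"{whole}.{suffix}"
    return whole


# The Smithsonian form and the published form compare equal once normalized.
print(normalize_usnpc("049355.00") == normalize_usnpc("49355"))
# Distinct slides in the same accession stay distinct.
print(normalize_usnpc("36944.01") == normalize_usnpc("36944.02"))
```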

campmlc commented 1 year ago

The "holotype" info on this one came from the USNPC database, so I would hope they listed it appropriately. But happy to convert it to whatever you think. Leading zeros . . . yet another reason to publish the correct number at USNPC:Para. I would be very surprised if these numbers were all consistently cited in publications with leading zeros? You tell me.

msbparasites commented 1 year ago

I have no idea where the zeros came in. I don't think I have ever seen one in a publication leading with a zero. The one paper I have with their numbers, I did not put the zero and I looked back at correspondence and I didn't see that they put the zero there either. And of course none of the USNPC# lead with zeros. I feel like this is also a discussion that would have to happen with SI? And there is no active link to the USNM # collections?

msbparasites commented 1 year ago

oh i just saw Mariel your response to Issue #6024, with the ark linking directly to the USNM, I don't really understand what it is, but the mammal record came right up and that was awesome. It just isn't clear that it is USNM? Maybe doesn't matter?

dustymc commented 1 year ago

I added an ID to https://arctos.database.museum/guid/MSB:Para:28915 - hopefully it'll clarify things, feel free to delete it if not.

msbparasites commented 1 year ago

Well, from my perspective.... I like it and it clarifies things. I was also thinking in general, I might not click things if I don't know where it goes.... or think it is not anything I need.

campmlc commented 1 year ago

That works! Although I changed the issued by to Smithsonian Division of Mammals. I'm preparing an ID bulkload to add these records to the appropriate Division, rather than just the institution as a whole, which will allow linkage to the appropriate catalog number. So happy for the arks to be entered in that format.

campmlc commented 1 year ago

Is there any problem with proceeding with the conversion below? I would like to finish the project - it's been delayed over a decade now. Once we have the correct numbers associated with the USNPC:Para guid, we can then build the connections between related records in Arctos collections and USNM. Happy to answer any questions. @dustymc

Here is the proposed conversion file. Note that I had already done a test conversion on one record, USNPC:Para:36944, to an integer value to see if it was possible. It needs conversion to the non-numeric USNPC:Para:36944.02. The related entry, USNPC:Para:660, needs conversion to USNPC:Para:36944.01.

usnpc lookup for conversion.zip

If there is a better way of doing this, please let me know.

dustymc commented 1 year ago

changed the issued by

Yep, that's one of the niceties of this - it's easy to be as precise as known without overstating anything, and to change that when someone who knows more comes along.

proceeding with the conversion

I do not think I adequately understand the perceived benefits of this, and won't ever think it's a good idea to break identifiers. There are ~80 records in Arctos using these, and as a whole bunch of Issues demonstrate there are probably many more which should be but got lost and used the wrong type. It's at least possible to coordinate with or compensate for the former, the latter will probably just be lost if things get more cryptic. They are of course your data and you can do with them whatever you want, but I can't see how this will add clarity.