ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

New Identifier Type: Arctos record GUID #7808

Closed campmlc closed 4 months ago

campmlc commented 4 months ago

Current Status

The core of this is running in test, feedback is welcome.


Definition

Arctos record identifiers or GUIDs when used as identifiers, primarily for the purposes of forming relationships. Only Arctos record identifiers may be used here; Arctos record identifiers may not be used in other identifier types, except Arctos:Entity when used as Organism ID. Automation will correct the "issued by" agent, and will attempt to guess (and leave remarks) if "Triplet" is provided. Value should be added to prefix when available.


In Limbo

Can we eliminate a huge trap, https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2135899234 https://github.com/ArctosDB/arctos/issues/7836?


Original Issue

Problem: Need to distinguish and standardize Arctos GUIDs/URLs as a distinct "identifier" type

Describe what you're trying to accomplish: Make it easier to identify and link to Arctos URLs in a standardized and internally controlled way

Describe the solution you'd like: New ID type: "Arctos record identifier"

Arctos record GUID - The full url of the related Arctos catalog record. Must begin with https://arctos.database.museum/guid/ followed by an Arctos record identifier (the triplet).

The special type would facilitate the correctness of internal links by

  1. enforcing values that begin with https://arctos.database.museum/guid/ in bulkloaded data (both the main bulkloader and the identifier bulkload tool)
  2. enforcing the use of the correct issuer based upon the triplet prefix part of the url in the value in bulkloaded data (both the main bulkloader and the identifier bulkload tool)
  3. pre-entering https://arctos.database.museum/guid/ in data entry forms when the type is selected (both data entry and in-record additions)
  4. adding the appropriate issuer based upon the triplet prefix part of the url in the value (both data entry and in-record additions)
  5. Disallowing values that approximate https://arctos.database.museum/guid/ in other types.
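Rules 1, 2, and 4 above could be sketched roughly as follows. This is a minimal illustration, not how Arctos implements anything: the base URL comes from the issue text, but the triplet grammar (letters, colon, letters, colon, a catalog number without spaces or colons) and the function names are assumptions made for the example.

```python
import re

# From the issue text: every value must begin with this base URL.
GUID_BASE = "https://arctos.database.museum/guid/"

# Assumption: a triplet is institution:collection:catalog_number, where the
# first two parts are letters only and the last has no whitespace or colons.
TRIPLET_RE = re.compile(r"^([A-Za-z]+):([A-Za-z]+):([^\s:]+)$")


def parse_record_guid(value):
    """Return (guid_prefix, catalog_number), or raise ValueError."""
    if not value.startswith(GUID_BASE):
        raise ValueError(f"value must begin with {GUID_BASE}")
    match = TRIPLET_RE.match(value[len(GUID_BASE):])
    if match is None:
        raise ValueError("text after the base URL is not a triplet")
    institution, collection, catalog_number = match.groups()
    return f"{institution}:{collection}", catalog_number


def expected_issuer_prefix(value):
    """Hypothetical helper: the GUID prefix that should select the issuer."""
    guid_prefix, _ = parse_record_guid(value)
    return guid_prefix
```

A format check like this says nothing about whether the record exists; a value such as https://arctos.database.museum/guid/UAM:Bird:unknown would still pass it.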

Describe alternatives you've considered: increasing chaos

Priority: Wildfire

dustymc commented 4 months ago

This is acceptable as long as

  1. I can control the value (Arctos GUID) and issuer, and
  2. I can disallow those things in identifiers of other types

I am of course happy to help clean up any existing problems which would prevent implementation.

See also https://github.com/ArctosDB/arctos/discussions/5310

campmlc commented 4 months ago

This is acceptable as long as

  1. I can control the value (Arctos GUID) and issuer: YES, agree
  2. I can disallow those things in identifiers of other types: YES, agree

dustymc commented 4 months ago

From @mkoo in https://github.com/ArctosDB/arctos/issues/6738:


From the AWG discussion: A new identifier would be created called Arctos record identifier which would expressly be the full URL of the Arctos record.

The data entry form needs to reflect that users would be able to add the catalog record or DwC triple and the domain etc (https://arctos.database.museum/guid/) be appended. Although the builder could do that already.

Other suggested UI tweaks-- the Edit form on the record page:

[Screenshot: Firefox_Screenshot_2024-05-23T19-49-20 310Z]

There is also agreement that we would remove the type= "institutional catalog number" and replace with simply "identifier" and the appropriate Issued by for consistent and discoverable other ids.

dustymc commented 4 months ago

appended

Yes, I can potentially "I think you mean...." and manipulate the identifier, BUT there's also just about a 100% chance I'll occasionally mess that up. (So perhaps I should throw the 'input' into remarks or something if we get there.) Very strongly suggest we NOT do that, instead embrace https://github.com/ArctosDB/arctos/discussions/5310 (which leaves no room for confusion, doesn't require me to guess what a user might have intended, and doesn't become a liability at the borders of Arctos).
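For illustration only, the "guess, but keep the input in remarks" idea might look something like this. Everything here is an assumption made for the sketch: the triplet pattern, the function name, and the "entered as" wording of the remark are all hypothetical, and the failure mode described above (guessing wrong) is exactly what the final branch tries to avoid by refusing to guess.

```python
import re

GUID_BASE = "https://arctos.database.museum/guid/"

# Assumption: a bare triplet looks like letters:letters:catalog_number.
TRIPLET_RE = re.compile(r"^[A-Za-z]+:[A-Za-z]+:[^\s:]+$")


def normalize_identifier(raw, remarks=""):
    """Return (value, remarks); only a bare triplet is ever rewritten."""
    value = raw.strip()
    if value.startswith(GUID_BASE):
        # Already a full URL: leave it alone (format is not checked here).
        return value, remarks
    if TRIPLET_RE.match(value):
        # Guess that a triplet was meant, and record what was typed.
        note = f"entered as: {raw}"
        remarks = f"{remarks}; {note}" if remarks else note
        return GUID_BASE + value, remarks
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"cannot interpret {raw!r} as an Arctos record GUID")
```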

Prefix

Not a good discussion until https://github.com/ArctosDB/arctos/issues/6687 is resolved (prefix may not survive).

remove the type= "institutional catalog number"

For the record: I'm very hesitant about adding more types at all, and my anxiety over introducing yet another type is greatly amplified by the lack of movement on the many existing identifier issues (much of https://github.com/ArctosDB/arctos/issues?q=is%3Aissue+is%3Aopen+identifier+prefix+label%3A%22Priority+-+Wildfire+Potential%22 , but there are still no issues for a bunch of other nonsensical types - eg there are still types for the media/object/device which carries identifiers!!). Clearly much of the confusion leading up to this proposal involves becoming lost in those arbitrary and unnecessary types. Removing what is perhaps the most confusing (and least consistently used) type is a great start, but is there any possible way we can commit to fully normalizing the ecosystem and getting ourselves out of this mudbog as we're adding this?

Jegelewicz commented 4 months ago

remove the type= "institutional catalog number"

Can we just stick to this one (very nice) thing and address that elsewhere? I'd hate to see this mired in arguments about other things. Also, I like the idea of type being functional, this could help us as we work through the remaining types.

dustymc commented 4 months ago

An addition is the opposite of the simplification this is looking for. I definitely don't want any arguments, but I also think that nearly all of them involve getting lost in the complexity, much of which is brought about by the multitude of unnecessary types. Removing the thing that's clearly confusing users seems in line with the stated goals.

functional

If you mean having rules attached to types and agents, that has always been on offer. (But I think nobody's quite sure what to ask for because of the clutter of so many types, probably complicated by the surprising "what's a GUID?" conversation.) I'd be happy to work up a proposal if anyone's interested, open an issue.

campmlc commented 4 months ago

Just a note: most of the usage (but not all) of institutional catalog number is happening because we lack the clear alternative requested here. Once we have a clear and functional alternative, we can then move towards replacing and fixing the institutional catalog number IDs. I absolutely agree with @Jegelewicz that we should not conflate these two issues.

dustymc commented 4 months ago

most of the usage (but not all) of institutional catalog number is happening because we lack the clear alternative requested here

See https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2127875164, this cannot exist as long as those things exist, I can't create this except while also moving them.

I will not support adding more muck in which to get lost. This can and should be a simple matter of sorting identifiers in two ways (here for the resolvable, not-here for the rest). There should be no ambiguity in the data, I don't think I need anything but an OK. (But if this again starts looking realistic I can provide data here for review.)

campmlc commented 4 months ago

This affects active data entry protocols across multiple collections in my institution. The only way to accomplish this in a short amount of time is to add the new identifier first so that the correct identifiers can be added and shown to be functional, and then communicate the need to change workflows. This can happen quickly if we do it right now - we have a couple of weeks before the summer cataloging push starts up. Collections need to know that existing data will not be lost from older records. This is the "social" part here - which must be included for this to work. We don't want a repeat of last year. As soon as the new "Arctos record ID" format is up and running, @dusty can convert all existing Arctos guid "identifiers" without problem. The remaining "institutional catalog numbers" can then be prioritized for conversion once we are certain that all existing Arctos relationships have been appropriately captured and converted.

campmlc commented 4 months ago

So if I understand @dustymc correctly, we can proceed right away with the resolvable identifiers in Arctos - I agree completely.

dustymc commented 4 months ago

https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2128094156 is technically incompatible with what was discussed. The concerns that a new dedicated type might somehow cause data loss are - well, guess I don't have a word, but it's whatever you'd use to describe something that just can't happen. The training and adaptation should be straightforward: use the thing that doesn't produce an error (which hopefully will be self-explanatory once the thing that's obviously been causing arbitrary data is gone).

Now https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2128099917 is making me think I've misunderstood something again.

I need the OK to

campmlc commented 4 months ago

We are in agreement on all above, except the last step, which requires a temporal delay of a week or two as collections need to be notified to change workflows, otherwise we have a lot of extremely upset people trying to do things that suddenly cease to work with no notification. This includes dealing with records currently in the bulkloader and in bulkload prep.

campmlc commented 4 months ago

Regarding what to call this - see #5310

I support calling the Arctos GUID the full URL. This is also what we are defining the GUID as in the Arctos paper per the AWG discussion 5-24-2024, as the url created based on the Arctos "record identifier". @ccicero

Revised wording: "Each cataloged record has an Arctos Globally Unique Identifier (GUID) that is constructed from the record identifier (e.g., https://arctos.database.museum/guid/APSU:Fish:1079)."

dustymc commented 4 months ago

last step ... suddenly cease to work with no notification.

That is precisely my point, but the implementation will not/can not work as I believe you're expecting it to.

Implementing this in the only way it can be done will be a change in workflow, whether we drag some ancillary bits out or not. That is what was agreed to in the meeting and in https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2127896938. Surely the folks entering data aren't THAT difficult to talk to, and we do have a communications team who I'm sure would be willing to help.

campmlc commented 4 months ago

Can I request a csv of the existing data in Arctos that use "institutional catalog number"? I don't want to hold this up, but I don't want to be responsible for data loss, and I don't want to presume the rest of the community agrees to conversion of existing data and new workflows without notice.

campmlc commented 4 months ago

See https://github.com/ArctosDB/arctos/discussions/5310#discussioncomment-9540549 re Arctos GUID vs record identifier.

Jegelewicz commented 4 months ago

The special type would facilitate the correctness of internal links by

  1. enforcing values that begin with https://arctos.database.museum/guid/ in bulkloaded data (both the main bulkloader and the identifier bulkload tool)
  2. enforcing the use of the correct issuer based upon the GUID prefix part of the url in the value in bulkloaded data (both the main bulkloader and the identifier bulkload tool)
  3. pre-entering https://arctos.database.museum/guid/ in data entry forms when the type is selected (both data entry and in-record additions)
  4. adding the appropriate issuer based upon the GUID prefix part of the url in the value (both data entry and in-record additions)

All possible?

dustymc commented 4 months ago

See first of https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2127950010 re: (3); I'm hesitantly willing to try, but I do suck at reading minds through malformed identifiers and will occasionally (at best!) mangle that. Defensible procedures would involve not making me guess, even if that is implemented. Everything else: Yup, no problem, that's what I said in https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2127875164.

Missing is (5), which is critical to this: Disallowing values that approximate https://arctos.database.museum/guid/ in other types.
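One way to read "values that approximate" the base URL is a loose pattern match applied to every other identifier type. The sketch below assumes that reading; the pattern, the type name constant, and the function name are illustrative, and the Arctos:Entity / Organism ID exception from the definition is deliberately not modeled.

```python
import re

# Loose match: the Arctos host followed by /guid/, regardless of scheme,
# an optional "www.", or letter case.
LOOKS_LIKE_GUID_URL = re.compile(
    r"(?:https?://)?(?:www\.)?arctos\.database\.museum/guid/", re.IGNORECASE
)

# Name agreed later in the thread; used here only as a label.
RECORD_GUID_TYPE = "Arctos record GUID"


def allowed_in_type(id_type, value):
    """Return False when a GUID-like value is used outside the dedicated type."""
    if id_type == RECORD_GUID_TYPE:
        return True  # strict format checks for this type belong elsewhere
    return LOOKS_LIKE_GUID_URL.search(value) is None
```

This catches the http-instead-of-https values as well as exact matches, but it cannot tell whether a bare triplet in another type was meant as an Arctos record.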

campmlc commented 4 months ago

Yes, I agree with 5 as well

mkoo commented 4 months ago

Those 5 conditions are essential!

dustymc commented 4 months ago

EDIT

new data: https://docs.google.com/spreadsheets/d/1bCG8gFuTO5QC7JunnOBhay4ZCDkcaw81Qd-OwgihGGQ/edit#gid=1169145992


Original

If this is to proceed, the first decision will be what we do with the ~15K current identifiers that look like, but are not, valid Arctos GUIDs.

temp_rec_id_not_valid.csv.zip

Excluding 'self' relationships from this would exclude most of these, but that seems like a potential trap of some sort.

There might be reasons to allow non-current GUIDs, but then I would lose any ability to exclude random things that people type, and that seems critical to this (especially having now seen the data!).

Much of this is ALMNH changing GUID Prefix (ACK!!), perhaps those could be stripped to triplets without any real loss of persistence.

I'm not sure what to do from here, but I am sure that this type cannot be just another trashcan.

This feels like it's probably going to need some sort of ad-hoc committee, @campmlc perhaps you'd organize something?

campmlc commented 4 months ago

Looking over the file, about 10K are ALMNH, another 4K+ are CHAS, and the remaining 1K are miscellaneous collections. I would like to request that we create the new ID type with all the needed constraints so that we can use this for incoming accessions that are already coming in for the summer, and then work to deal with these oddities. Non-ALMNH, non-CHAS:

Row Labels | Count of GUID_PREFIX
-- | --
BYU:Herp | 1
DGR:Mamm | 1
DMNS:Inv | 3
KNWR:Env | 4
MMNH:Edu | 2
MSB:Bird | 10
MSB:Fish | 70
MSB:Herp | 2
MSB:Host | 26
MSB:Mamm | 185
MSB:Para | 217
MVZ:Bird | 3
MVZ:Egg | 11
MVZ:Herp | 4
MVZ:Mamm | 83
MVZObs:Herp | 1
NHSM:Arc | 2
NMMNH:Paleo | 2
NMU:Mamm | 14
OWU:Fish | 4
OWU:Inv | 1
UAM:Art | 4
UAM:Bird | 38
UAM:EH | 15
UAM:Ento | 141
UAM:Herb | 2
UAM:Inv | 2
UAM:Mamm | 133
UCM:Bird | 2
UCM:Herp | 1
UCM:Mamm | 20
UMZM:Bird | 2
UTEP:ES | 1
UTEP:Herb | 1
UTEP:Herp | 2
UWBM:Herp | 2
UWYMV:Egg | 4
UWYMV:Mamm | 2
Grand Total | 1018

campmlc commented 4 months ago

A lot of these look like errors introduced by Arctos tools? I can't imagine someone entering some of these in this format. But I also see that all these are http vs https - another issue?

http://arctos.database.museum/guid/CHAS:Inv:14539
http://arctos.database.museum/guid/CHAS:Inv:CHAS ENT-13968a
http://arctos.database.museum/guid/UWBM:Herp:UWBM:Herp:7624
http://arctos.database.museum/guid/UAM:Bird:unknown
http://arctos.database.museum/guid/UAMObs:Ento:229692
http://arctos.database.museum/guid/UAM:EH:Ruckstuhl File Number 8
http://arctos.database.museum/guid/UWYMV:Mamm:M-895

Jegelewicz commented 4 months ago

The first decision will be what we do with the ~15K current identifiers that look like, but are not, valid Arctos GUIDs.

I suggest that these all have http://arctos.database.museum/guid/ removed from the value and a remark added "previously recorded as x" where x is the current value.

Those ALMNH ones should already have redirects, so nothing is lost there?
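The cleanup suggested in this comment (drop the URL prefix, keep the old value in a remark) is simple enough to sketch. The function name and the way remarks are joined are assumptions for the example; only the "previously recorded as x" wording comes from the suggestion itself.

```python
# Both schemes are listed because the existing bad values use http.
OLD_PREFIXES = (
    "https://arctos.database.museum/guid/",
    "http://arctos.database.museum/guid/",
)


def strip_guid_prefix(value, remarks=""):
    """Return (new_value, new_remarks) with the URL prefix removed."""
    for prefix in OLD_PREFIXES:
        if value.startswith(prefix):
            note = f"previously recorded as {value}"
            new_remarks = f"{remarks}; {note}" if remarks else note
            return value[len(prefix):], new_remarks
    # Values without the prefix are returned untouched.
    return value, remarks
```

Because the full original value is copied into the remark, the change can be reversed from the data alone.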

campmlc commented 4 months ago

Others are linking to valid records that are not yet cataloged in Arctos, e.g. http://arctos.database.museum/guid/HWML:Para:74826, which is a parasite of the linked MSB Mamm record, cited in a publication: Elisa Pucu, Marcela Lareschi, Scott L. Gardner. 2014. Bolivian Ectoparasites: A Survey of the Fleas of Ctenomys (Rodentia: Ctenomyidae). Comparative Parasitology 81(1):114-118. It is assigned a catalog number at HWML, but not yet cataloged there in Arctos. This is more problematic, because one collection should not have to hold back on capturing relationships just because the related collection is slower on cataloging.

campmlc commented 4 months ago

For identifiers that don't resolve for the above reason but otherwise meet all the criteria for an Arctos guid, can we just leave them as is and get periodic reports for "unresolved Arctos relationships" to sort out what is wrong? That would find http://arctos.database.museum/guid/HWML:Para:74826 and also similar situations where someone entered the related catalog number incorrectly, e.g. 74426. It would also automatically resolve once the record was entered, with no action needed by the originating collection.
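A periodic "unresolved Arctos relationships" report of the kind described here could, in outline, be a set difference between the identifiers in use and the records that exist. The sketch below assumes both are available as plain collections of strings; the function name and inputs are hypothetical and say nothing about how Arctos stores either.

```python
GUID_BASE = "https://arctos.database.museum/guid/"


def unresolved_report(identifier_values, cataloged_guids):
    """Return identifier values whose triplet is not a cataloged record."""
    cataloged = set(cataloged_guids)
    unresolved = []
    for value in identifier_values:
        if not value.startswith(GUID_BASE):
            continue  # not in scope for this report
        if value[len(GUID_BASE):] not in cataloged:
            unresolved.append(value)
    return unresolved
```

Run on a schedule, an entry would drop off the report by itself once the related record is cataloged, while a mistyped catalog number would stay on it until someone fixes it.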

dustymc commented 4 months ago

create the new ID type with all the needed constraints ... and then

That is not technically viable. I don't know how to communicate that more clearly than in https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2127875164.

these all have http://arctos.database.museum/guid/ removed from the value and a remark added "previously recorded as x"

That seems reasonable to me, agree nothing would be lost (and of course I'll leave CSV at every step if this becomes actionable).

not yet cataloged

There are about a million ways for that to go wrong, and about a million easy ways to avoid the situation. This type cannot become yet another garbage can. I propose we don't preemptively kill it on the easily-avoidable fringe use cases.

campmlc commented 4 months ago

If we can implement even one easy way of the supposed million to deal with timing issues related to different collections cataloging related objects in Arctos at different times, that would be great. I do want to hold someone to that promise. Otherwise, if this is the only way we can move forward, let's do it.

campmlc commented 4 months ago

I don't suppose there is any way this could be implemented first in test?

Jegelewicz commented 4 months ago

I just fixed all but one of the UTEP problems - the last one appears to be "the related thing isn't cataloged". I can see how this means that relationships end up never being made (I know about it on my end, but the other collection isn't finished cataloging, so I can't record the relationship, and they don't because they don't know, then it never gets recorded). So I see the value in checking if the related item exists, but I also see the value in recording one side of a relationship, with the other side coming later. If we can't do that, why have the bot?

Checking whether the format of the identifier is correct is great! Ensuring that the link works at the time of entry maybe not so much, because it means MSB mammals cannot record their parasite links until MSB Para has cataloged their records. Requiring both records to exist before a relationship can be recorded feels like a trap that means the relationships NEVER get recorded.

dustymc commented 4 months ago

implemented first in test?

Yes.

why have the bot?

To make valid reciprocal relationships! If we allow that thing that'll totally happen tomorrow then we also allow http://arctos.database.museum/guid/UAM:Bird:unknown to continue to exist and this is just another garbage can.

Make a relationship using whatever identifiers are available (generate a UUID if there's not something 'native' handy, that's why the type exists - and of course file an issue if that's at all complicated, it should not be), then file an issue for help in upgrading it once things are cataloged. "The bot" is but one of a potential swarm, this still looks like an easily avoidable (and mostly theoretical) situation, albeit one that's definitely capable of killing this idea.

campmlc commented 4 months ago

Another issue is that some of these that have re-directs should not be messed up, or the redirect will not function. See for example: http://arctos.database.museum/guid/MSB:Mamm:270088 which redirects to https://arctos.database.museum/guid/MSB:Mamm:274455

dustymc commented 4 months ago

re-directs should not be messed up,

I do not know what that means. https://arctos.database.museum/guid/MSB:Mamm:270088 just 404s because it doesn't exist

arctosprod@arctos>> select count(*) from flat where guid='MSB:Mamm:270088';
 count 
-------
     0

and there's nothing in redirects to suggest it should do anything else.

arctosprod@arctos>> select * from redirect where old_path ilike '%MSB:Mamm:270088%' or new_path  ilike '%MSB:Mamm:270088%';
 redirect_id | old_path | new_path 
-------------+----------+----------
(0 rows)

Perhaps a topic better addressed in another issue?

campmlc commented 4 months ago

consensus on 5-24-24 call: new ID type to be called "Arctos record GUID"

Jegelewicz commented 4 months ago

Plan to demo on June 6

Jegelewicz commented 4 months ago

Adjust things in the bulkloader to match this as possible.

dustymc commented 4 months ago

Working definition:

Arctos record identifiers or GUIDs when used as identifiers, primarily for the purposes of forming relationships. Only Arctos record identifiers may be used here; Arctos record identifiers may not be used in other identifier types, except Arctos:Entity when used as Organism ID. Automation will correct the "issued by" agent, and will attempt to guess (and leave remarks) if "Triplet" is provided.

and proposed update to https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcoll_other_id_type#identifier:

This type is proper for a wide range of identifiers that can be disambiguated by the agent that issued them. This identifier type is not indicative of low-quality data but allows for ease of identifier searches across specific uses for specific purposes. NOTE: Use "Arctos record GUID" for local record identifiers/relationships.

dustymc commented 4 months ago

Moved to https://github.com/ArctosDB/arctos/issues/7836

campmlc commented 4 months ago

Csv please?

dustymc commented 4 months ago

You can download CSV from the sheet linked in the comment directly above yours.

dustymc commented 4 months ago

-moved to top-

Jegelewicz commented 4 months ago

Any way to catch this and disallow it?

image

Should we? It does leave the door open for people to select the wrong type and make bad (or no) links, but it also seems like it would be a lot of computing to check for it.

dustymc commented 4 months ago

Any way to catch this and disallow it?

No, that's kinda the point of https://github.com/ArctosDB/arctos/discussions/5310. People who work with Arctos are probably going to make some assumptions about that, we've been reinforcing those assumptions by pretending the universe stops at the data we can control, but there's also no way to tell if that's a perfectly valid (local) identifier that didn't originate in Arctos. (That might be a bad example, but there are in fact at least three "UAM:Mamm"s out there.)

It's not a CPU thing, it's a "how good identifiers work" thing.

This type won't obviate any need for reading documentation. It will make it easy to get the details right (and hard to get them wrong) once you've gotten close, but it will absolutely not prevent someone from making huge messes.

(Moving on https://github.com/ArctosDB/arctos/issues/7808#issuecomment-2135899234 would prevent a very common flavor of those messes so - please, anyone?)

Jegelewicz commented 4 months ago

Data entry let me do this

image

which I'm guessing will fail when I load the record?

dustymc commented 4 months ago

guessing

Screenshot 2024-05-30 at 07 25 36

dustymc commented 4 months ago

EDIT: See https://github.com/ArctosDB/arctos/issues/7837 for this conversation, I can provide data there after this issue is implemented.


Going the other way, the collection agents are issuing all sorts of nonsense.


   c   |                                              issuedby                                              |                      collectionid                       | guid_prefix |              other_id_type               
-------+----------------------------------------------------------------------------------------------------+---------------------------------------------------------+-------------+------------------------------------------
     6 | Alabama Museum of Natural History Bird Collection                                                  | https://arctos.database.museum/collection/ALMNH:Bird    | ALMNH:Bird  | identifier
    22 | Alabama Museum of Natural History Geology Collection                                               | https://arctos.database.museum/collection/ALMNH:Geo     | ALMNH:Geo   | institutional catalog number
     1 | Alabama Museum of Natural History Mammal Collection                                                | https://arctos.database.museum/collection/ALMNH:Mamm    | ALMNH:Mamm  | identifier
     1 | Brigham Young University Life Science Museum Amphibian and Reptile Collection                      | https://arctos.database.museum/collection/BYU:Herp      | BYU:Herp    | identifier
   575 | Chicago Academy of Sciences Bird Collection                                                        | https://arctos.database.museum/collection/CHAS:Bird     | CHAS:Bird   | identifier
    61 | Chicago Academy of Sciences Bird Eggs Collection                                                   | https://arctos.database.museum/collection/CHAS:Egg      | CHAS:Egg    | identifier
     1 | Chicago Academy of Sciences Ethnology and History Artifacts Collection                             | https://arctos.database.museum/collection/CHAS:EH       | CHAS:EH     | identifier
     5 | Chicago Academy of Sciences Mollusc Collection                                                     | https://arctos.database.museum/collection/CHAS:Inv      | CHAS:EH     | identifier
    29 | Chicago Academy of Sciences Insect Collection                                                      | https://arctos.database.museum/collection/CHAS:Ento     | CHAS:Ento   | identifier
  1908 | Chicago Academy of Sciences Mollusc Collection                                                     | https://arctos.database.museum/collection/CHAS:Inv      | CHAS:Ento   | identifier
    10 | Chicago Academy of Sciences Teaching Collection                                                    | https://arctos.database.museum/collection/CHAS:Teach    | CHAS:Ento   | identifier
     2 | California Desert Studies Center Herbarium                                                         | https://arctos.database.museum/collection/CDSC:Herb     | CHAS:Herb   | identifier
   778 | Chicago Academy of Sciences Herbarium                                                              | https://arctos.database.museum/collection/CHAS:Herb     | CHAS:Herb   | identifier
    29 | Chicago Academy of Sciences Herbarium                                                              | https://arctos.database.museum/collection/CHAS:Herb     | CHAS:Herb   | institutional catalog number
    10 | Chicago Academy of Sciences Amphibian and Reptile Collection                                       | https://arctos.database.museum/collection/CHAS:Herp     | CHAS:Herp   | identifier
    82 | Chicago Academy of Sciences Mollusc Collection                                                     | https://arctos.database.museum/collection/CHAS:Inv      | CHAS:Inv    | identifier
     2 | Chicago Academy of Sciences Mammal Collection                                                      | https://arctos.database.museum/collection/CHAS:Mamm     | CHAS:Mamm   | identifier
    12 | Chicago Academy of Sciences Herbarium                                                              | https://arctos.database.museum/collection/CHAS:Herb     | CHAS:Teach  | identifier
     2 | Chicago Academy of Sciences Mollusc Collection                                                     | https://arctos.database.museum/collection/CHAS:Inv      | CHAS:Teach  | identifier
     1 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | DGR:Mamm    | identifier
     2 | Denver Museum of Nature and Science Parasite Collection                                            | https://arctos.database.museum/collection/DMNS:Para     | DMNS:Bird   | institutional catalog number
    10 | Denver Museum of Nature and Science Marine Invertebrate Collection                                 | https://arctos.database.museum/collection/DMNS:Inv      | DMNS:Inv    | identifier
    37 | Denver Museum of Nature and Science Mammal Collection                                              | https://arctos.database.museum/collection/DMNS:Mamm     | DMNS:Mamm   | DZTM: Denver Zoology Tissue Mammal
     1 | Denver Museum of Nature and Science Mammal Collection                                              | https://arctos.database.museum/collection/DMNS:Mamm     | DMNS:Mamm   | institutional catalog number
     1 | Denver Museum of Nature and Science Parasite Collection                                            | https://arctos.database.museum/collection/DMNS:Para     | DMNS:Mamm   | institutional catalog number
     1 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | DMNS:Mamm   | institutional catalog number
     8 | Denver Museum of Nature and Science Mammal Collection                                              | https://arctos.database.museum/collection/DMNS:Mamm     | DMNS:Para   | identifier
     4 | Denver Museum of Nature and Science Parasite Collection                                            | https://arctos.database.museum/collection/DMNS:Para     | DMNS:Para   | identifier
     4 | Kenai National Wildlife Refuge, Alaska Insect Collection                                           | https://arctos.database.museum/collection/KNWR:Ento     | KNWR:Env    | identifier
   635 | Kansas State University Biorepository Mammal Collection                                            | https://arctos.database.museum/collection/KSB:Mamm      | KSB:Mamm    | identifier
     1 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | KSB:Mamm    | identifier
  1184 | Kansas State University Biorepository Teaching Collection                                          | https://arctos.database.museum/collection/KSB:Teach     | KSB:Teach   | identifier
   110 | Bell Museum of Natural History Bird Collection                                                     | https://arctos.database.museum/collection/MMNH:Bird     | MMNH:Bird   | preparator number
   469 | Bell Museum of Natural History Education Collection                                                | https://arctos.database.museum/collection/MMNH:Edu      | MMNH:Mamm   | institutional catalog number
     1 | Bell Museum of Natural History Mammal Collection                                                   | https://arctos.database.museum/collection/MMNH:Mamm     | MMNH:Mamm   | institutional catalog number
     6 | Museum of Southwestern Biology, Divison of Birds                                                   | https://arctos.database.museum/collection/MSB:Bird      | MSB:Bird    | identifier
     3 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Bird    | identifier
     4 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Bird    | NK
     1 | University of Alaska Museum Bird Collection                                                        | https://arctos.database.museum/collection/UAM:Bird      | MSB:Bird    | identifier
    68 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Fish    | identifier
     2 | Museum of Southwestern Biology, Division of Fishes                                                 | https://arctos.database.museum/collection/MSB:Fish      | MSB:Fish    | identifier
     2 | Museum of Southwestern Biology, Division of Amphibians and Reptiles                                | https://arctos.database.museum/collection/MSB:Herp      | MSB:Herp    | identifier
     3 | Museum of Southwestern Biology Host (of parasite) Collection                                       | https://arctos.database.museum/collection/MSB:Host      | MSB:Host    | identifier
     4 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | MSB:Host    | identifier
    19 | Museum of Southwestern Biology, Division of Parasites                                              | https://arctos.database.museum/collection/MSB:Para      | MSB:Host    | identifier
    37 | Museum of Southwestern Biology, Division of Parasites                                              | https://arctos.database.museum/collection/MSB:Para      | MSB:Host    | institutional catalog number
     1 | U. S. National Parasite Collection                                                                 | https://arctos.database.museum/collection/USNPC:Para    | MSB:Host    | identifier
    24 | Arctos Entity Collection                                                                           | https://arctos.database.museum/collection/Arctos:Entity | MSB:Mamm    | Organism ID
    69 | Harold W. Manter Laboratory of Parasitology                                                        | https://arctos.database.museum/collection/HWML:Para     | MSB:Mamm    | identifier
     2 | Harold W. Manter Laboratory of Parasitology                                                        | https://arctos.database.museum/collection/HWML:Para     | MSB:Mamm    | institutional catalog number
     1 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Mamm    | DGR: Division of Genomic Resources (MSB)
    60 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Mamm    | identifier
    48 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | MSB:Mamm    | identifier
     2 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | MSB:Mamm    | institutional catalog number
     1 | Museum of Southwestern Biology Mammal Observations Collection                                      | https://arctos.database.museum/collection/MSBObs:Mamm   | MSB:Mamm    | identifier
     7 | Museum of Southwestern Biology, Division of Parasites                                              | https://arctos.database.museum/collection/MSB:Para      | MSB:Mamm    | identifier
     6 | Museum of Southwestern Biology, Divison of Birds                                                   | https://arctos.database.museum/collection/MSB:Bird      | MSB:Para    | institutional catalog number
     2 | Museum of Southwestern Biology, Division of Genomic Resources                                      | https://arctos.database.museum/collection/MSB:DGR       | MSB:Para    | identifier
     3 | Museum of Southwestern Biology, Division of Fishes                                                 | https://arctos.database.museum/collection/MSB:Fish      | MSB:Para    | identifier
    33 | Museum of Southwestern Biology, Division of Amphibians and Reptiles                                | https://arctos.database.museum/collection/MSB:Herp      | MSB:Para    | institutional catalog number
    24 | Museum of Southwestern Biology Host (of parasite) Collection                                       | https://arctos.database.museum/collection/MSB:Host      | MSB:Para    | identifier
   134 | Museum of Southwestern Biology Host (of parasite) Collection                                       | https://arctos.database.museum/collection/MSB:Host      | MSB:Para    | institutional catalog number
  2163 | Museum of Southwestern Biology, Division of Mammals                                                | https://arctos.database.museum/collection/MSB:Mamm      | MSB:Para    | institutional catalog number
    31 | Museum of Southwestern Biology, Division of Parasites                                              | https://arctos.database.museum/collection/MSB:Para      | MSB:Para    | identifier
    37 | Museum of Southwestern Biology, Division of Parasites                                              | https://arctos.database.museum/collection/MSB:Para      | MSB:Para    | institutional catalog number
   154 | University of Alaska Museum Insect Collection                                                      | https://arctos.database.museum/collection/UAM:Ento      | MSB:Para    | identifier
   850 | University of Alaska Museum Mammal Collection                                                      | https://arctos.database.museum/collection/UAM:Mamm      | MSB:Para    | institutional catalog number
     4 | U. S. National Parasite Collection                                                                 | https://arctos.database.museum/collection/USNPC:Para    | MSB:Para    | identifier
     6 | U. S. National Parasite Collection                                                                 | https://arctos.database.museum/collection/USNPC:Para    | MSB:Para    | institutional catalog number
     1 | Arctos Entity Collection                                                                           | https://arctos.database.museum/collection/Arctos:Entity | MVZ:Bird    | Organism ID
     4 | Harold W. Manter Laboratory of Parasitology                                                        | https://arctos.database.museum/collection/HWML:Para     | MVZ:Herp    | identifier
    80 | Harold W. Manter Laboratory of Parasitology                                                        | https://arctos.database.museum/collection/HWML:Para     | MVZ:Mamm    | identifier
     3 | MVZ Hildebrand Collection                                                                          | https://arctos.database.museum/collection/MVZ:Hild      | MVZ:Mamm    | identifier
     2 | Natural History Society of Maryland Archaeology Collection                                         | https://arctos.database.museum/collection/NHSM:Arc      | NHSM:Arc    | identifier
     2 | New Mexico Museum of Natural History and Science Paleontology Collection                           | https://arctos.database.museum/collection/NMMNH:Paleo   | NMMNH:Paleo | identifier
     6 | Northern Michigan University Mammal Collection                                                     | https://arctos.database.museum/collection/NMU:Mamm      | NMU:Mamm    | identifier
     8 | Northern Michigan University Parasite Collection                                                   | https://arctos.database.museum/collection/NMU:Para      | NMU:Mamm    | identifier
 35734 | Ocean Genome Legacy Genomics Collection                                                            | https://arctos.database.museum/collection/OGL:Genomic   | OGL:Genomic | identifier
 33809 | Ocean Genome Legacy Genomics Collection                                                            | https://arctos.database.museum/collection/OGL:Genomic   | OGL:Genomic | lot number
  1032 | Ocean Genome Legacy Genomics Collection                                                            | https://arctos.database.museum/collection/OGL:Genomic   | OGL:Genomic | processing number
     9 | Museum of Southwestern Biology, Divison of Birds                                                   | https://arctos.database.museum/collection/MSB:Bird      | OWU:Bird    | institutional catalog number
     4 | Ohio Wesleyan University Fish Collection                                                           | https://arctos.database.museum/collection/OWU:Fish      | OWU:Fish    | identifier
     1 | Ohio Wesleyan University Invertebrate Collection                                                   | https://arctos.database.museum/collection/OWU:Inv       | OWU:Inv     | identifier
     1 | Trinity College Dublin Geological Museum Paleontology Collection                                   | https://arctos.database.museum/collection/TCDGM:Paleo   | TCDGM:Paleo | identifier
    38 | University of Alaska Museum Bird Collection                                                        | https://arctos.database.museum/collection/UAM:Bird      | UAM:Bird    | identifier
    14 | University of Alaska Museum Ethnology and History Department                                       | https://arctos.database.museum/collection/UAM:EH        | UAM:EH      | identifier
     1 | University of Alaska Museum Mammal Collection                                                      | https://arctos.database.museum/collection/UAM:Mamm      | UAM:EH      | identifier
     4 | Kenai National Wildlife Refuge, Alaska Insect Collection                                           | https://arctos.database.museum/collection/KNWR:Ento     | UAM:Ento    | identifier
     1 | Kenai National Wildlife Refuge, Alaska Environmental Samples Collection                            | https://arctos.database.museum/collection/KNWR:Env      | UAM:Ento    | identifier
    55 | Kenelm W. Philip Lepidoptera Collection                                                            | https://arctos.database.museum/collection/KWP:Ento      | UAM:Ento    | identifier
     1 | University of Alaska Museum Bird Collection                                                        | https://arctos.database.museum/collection/UAM:Bird      | UAM:Ento    | identifier
    75 | University of Alaska Museum Insect Collection                                                      | https://arctos.database.museum/collection/UAM:Ento      | UAM:Ento    | identifier
     3 | University of Alaska Museum Insect Observations Collection                                         | https://arctos.database.museum/collection/UAMObs:Ento   | UAM:Ento    | identifier
     2 | University of Alaska Museum Mammal Observations Collection                                         | https://arctos.database.museum/collection/UAMObs:Mamm   | UAM:Ento    | identifier
     2 | University of Alaska Museum Herbarium                                                              | https://arctos.database.museum/collection/UAM:Herb      | UAM:Herb    | identifier
     2 | University of Alaska Museum Insect Collection                                                      | https://arctos.database.museum/collection/UAM:Ento      | UAM:Inv     | identifier
   132 | University of Alaska Museum Insect Collection                                                      | https://arctos.database.museum/collection/UAM:Ento      | UAM:Mamm    | identifier
     1 | University of Alaska Museum Mammal Observations Collection                                         | https://arctos.database.museum/collection/UAMObs:Mamm   | UAM:Mamm    | identifier
     2 | University of Colorado Museum of Natural History Bird Collection                                   | https://arctos.database.museum/collection/UCM:Bird      | UCM:Bird    | identifier
     1 | University of Colorado Museum of Natural History Amphibian and Reptile Collection                  | https://arctos.database.museum/collection/UCM:Herp      | UCM:Herp    | identifier
    19 | University of Colorado Museum of Natural History Mammal Collection                                 | https://arctos.database.museum/collection/UCM:Mamm      | UCM:Mamm    | identifier
     2 | Harold W. Manter Laboratory of Parasitology                                                        | https://arctos.database.museum/collection/HWML:Para     | UMZM:Bird   | identifier
    19 | University of Montana Philip L. Wright Zoological Museum Bird Collection                           | https://arctos.database.museum/collection/UMZM:Bird     | UMZM:Bird   | preparator number
     2 | University of Montana Philip L. Wright Zoological Museum Mammal Collection                         | https://arctos.database.museum/collection/UMZM:Mamm     | UMZM:Bird   | preparator number
     1 | University of Montana Philip L. Wright Zoological Museum Bird Collection                           | https://arctos.database.museum/collection/UMZM:Bird     | UMZM:Egg    | identifier
    26 | University of Montana Philip L. Wright Zoological Museum Bird Collection                           | https://arctos.database.museum/collection/UMZM:Bird     | UMZM:Egg    | institutional catalog number
     1 | University of Montana Philip L. Wright Zoological Museum Mammal Collection                         | https://arctos.database.museum/collection/UMZM:Mamm     | UMZM:Egg    | institutional catalog number
     2 | University of Montana Philip L. Wright Zoological Museum Bird Collection                           | https://arctos.database.museum/collection/UMZM:Bird     | UMZM:Mamm   | preparator number
     1 | University of Montana Philip L. Wright Zoological Museum Mammal Collection                         | https://arctos.database.museum/collection/UMZM:Mamm     | UMZM:Mamm   | preparator number
  3397 | University of New Mexico Geology Collection                                                        | https://arctos.database.museum/collection/UNM:Geol      | UNM:Geol    | identifier
    94 | University of Texas at El Paso Biodiversity Collections Earth Science Collection                   | https://arctos.database.museum/collection/UTEP:ES       | UTEP:ES     | identifier
     1 | University of Texas at El Paso Biodiversity Collections Herbarium                                  | https://arctos.database.museum/collection/UTEP:Herb     | UTEP:Herb   | identifier
     1 | University of Texas at El Paso Biodiversity Collections Amphibian and Reptile Collection           | https://arctos.database.museum/collection/UTEP:Herp     | UTEP:Herp   | identifier
     1 | University of Texas at El Paso Biodiversity Collections Amphibian and Reptile Osteology Collection | https://arctos.database.museum/collection/UTEP:HerpOS   | UTEP:Herp   | identifier
     1 | Burke Museum Amphibian and Reptile Collection                                                      | https://arctos.database.museum/collection/UWBM:Herp     | UWBM:Herp   | identifier
     1 | Burke Museum Mammal Collection                                                                     | https://arctos.database.museum/collection/UWBM:Mamm     | UWBM:Herp   | identifier
    48 | Burke Museum Invertebrate Paleontology Collection                                                  | https://arctos.database.museum/collection/UWBM:IP       | UWBM:VP     | identifier
     1 | Burke Museum Mammal Collection                                                                     | https://arctos.database.museum/collection/UWBM:Mamm     | UWBM:VP     | identifier
     4 | University of Wyoming Museum of Vertebrates Bird Collection                                        | https://arctos.database.museum/collection/UWYMV:Bird    | UWYMV:Egg   | identifier
     2 | University of Wyoming Museum of Vertebrates Amphibian and Reptile Collection                       | https://arctos.database.museum/collection/UWYMV:Herp    | UWYMV:Herp  | institutional catalog number
     2 | University of Wyoming Museum of Vertebrates Mammal Collection                                      | https://arctos.database.museum/collection/UWYMV:Mamm    | UWYMV:Mamm  | identifier
  1424 | Museo de Zoología de la Universidad San Francisco de Quito Amphibian and Reptile Collection        | https://arctos.database.museum/collection/ZSFQ:Herp     | ZSFQ:Herp   | identifier

That should be cleaned up (I don't know how; it probably won't be fun) and prevented, either as part of doing something more formal with the agent/collection link or as part of creating collections, if anyone is willing to deal with good and precise data. I'm having doubts at the moment.
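
As a rough way to see which rows in the table above cross collection lines, here is a minimal Python sketch. It assumes the third column is the URL of the collection agent that issued the identifier and the fourth column is the guid_prefix of the records carrying it; that reading is inferred from the rows, not confirmed. The names `Row`, `parse_row`, and `cross_collection` are hypothetical, and the sample rows are copied from the table.

```python
from typing import NamedTuple

COLLECTION_BASE = "https://arctos.database.museum/collection/"


class Row(NamedTuple):
    count: int
    issuer_name: str
    issuer_prefix: str  # from the collection URL (assumed: the issuing agent)
    record_prefix: str  # fourth column (assumed: guid_prefix of the records)
    id_type: str


def parse_row(line: str) -> Row:
    """Split one pipe-delimited result row into typed fields."""
    count, name, url, prefix, id_type = (cell.strip() for cell in line.split("|"))
    if not url.startswith(COLLECTION_BASE):
        raise ValueError(f"unexpected collection URL: {url}")
    return Row(int(count), name, url[len(COLLECTION_BASE):], prefix, id_type)


def cross_collection(rows: list[Row]) -> list[Row]:
    """Keep rows where the issuing collection differs from the record's collection."""
    return [r for r in rows if r.issuer_prefix != r.record_prefix]


sample = """\
     1 | Museum of Southwestern Biology, Division of Mammals | https://arctos.database.museum/collection/MSB:Mamm | KSB:Mamm | identifier
   635 | Kansas State University Biorepository Mammal Collection | https://arctos.database.museum/collection/KSB:Mamm | KSB:Mamm | identifier
   110 | Bell Museum of Natural History Bird Collection | https://arctos.database.museum/collection/MMNH:Bird | MMNH:Bird | preparator number
"""

rows = [parse_row(line) for line in sample.splitlines() if line.strip()]
flagged = cross_collection(rows)
for r in flagged:
    print(r.count, r.issuer_prefix, "->", r.record_prefix, r.id_type)
```

A mismatch is not necessarily an error (a host record can legitimately carry an identifier issued by a parasite collection), so this only narrows down what needs a human look.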

Jegelewicz commented 4 months ago

That should be cleaned up

I don't understand how to find those or what the actual problems are. Can you give me more information, such as which records these problems are associated with?

Jegelewicz commented 4 months ago

We must understand that these collection agents have issued things that are not Arctos record GUIDs, and those things should be able to be recorded as such. This includes things like old catalog numbers. All of these relationships should be self and the type should NOT be Arctos record GUID, so no problem?
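
To make the distinction concrete, here is a minimal Python sketch of a check that separates values shaped like an Arctos record GUID (as described in the original issue: the base URL followed by the triplet) from anything else a collection agent might have issued, such as an old catalog number. The function name and the exact triplet pattern are assumptions for illustration, not the actual Arctos validation.

```python
import re

GUID_BASE = "https://arctos.database.museum/guid/"

# Assumed triplet shape: three non-empty, colon-separated parts
# (institution, collection, catalog number). The real rule may be stricter.
TRIPLET = re.compile(r"[^:/\s]+:[^:/\s]+:[^/\s]+")


def is_arctos_record_guid(value: str) -> bool:
    """Return True if value looks like a full Arctos record GUID URL."""
    if not value.startswith(GUID_BASE):
        return False
    return TRIPLET.fullmatch(value[len(GUID_BASE):]) is not None


print(is_arctos_record_guid("https://arctos.database.museum/guid/MMNH:Bird:51088"))
print(is_arctos_record_guid("MMNH:Bird:51088"))
print(is_arctos_record_guid("51088"))
```

Under this sketch a bare triplet or an old catalog number does not qualify, so it would stay under a type other than Arctos record GUID.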

dustymc commented 4 months ago

collection agents have issued things that are not Arctos record GUIDs

If that's the case then the data are as good as anyone cares to make them and nothing else is required.

FWIW I don't think that's OK: https://arctos-test.tacc.utexas.edu/agent/21346749 is not capable of preparing https://arctos-test.tacc.utexas.edu/guid/MMNH:Bird:51088, https://arctos-test.tacc.utexas.edu/agent/21347747 exists to issue https://arctos-test.tacc.utexas.edu/info/ctDocumentation.cfm?table=ctcoll_other_id_type#dztm__denver_zoology_tissue_mammal, and https://arctos-test.tacc.utexas.edu/guid/DMNS:Mamm:14999 is a loss of information (caused by us being inevitably stuck in this weird limbo), etc. I can't see any reason collections should issue anything other than catalog numbers, but it's definitely not a hill I'm willing to die on if nobody cares. (Does make me wonder why we're here though....)

Jegelewicz commented 4 months ago

I can't see any reason collections should issue anything other than catalog numbers

Some of these are catalog numbers that have been replaced with new Arctos record GUIDs.