ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum

Part Attribute Lot Count Code Table Request #5393

Closed campmlc closed 2 months ago

campmlc commented 1 year ago

Goal Add Part Lot Count as part attribute

Context Lot counts as part attributes can be loaded after the fact, and can provide metadata to document change over time in part attribute value as lots are split.

Table. https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type

Proposed Value Part lot count

Proposed Definition The total number of parts or individuals contained in a lot

Collection type

Attribute data type Numeric

Attribute units https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcount_units - it is expected that "individuals" would be the selection?

Available for Public View Yes

Priority High, needed for bulkload @ccicero @DerekSikes

campmlc commented 1 year ago

#5390

Jegelewicz commented 1 year ago

If we do this - do we also keep the "part lot count" field in parts? This leads me to believe that parts should just be:

part name, determiner, date, remark, container (barcode)

and EVERYTHING else should be an attribute, including lot count, disposition, and condition.

BUT that means at least 5 attributes (if there is one preservation) for every part in a standard bulkload file. I think this would lead to better data over time and I would be in favor, but I know this will annoy some people.

dustymc commented 1 year ago

If we do this

We definitely won't while a critical piece of the request is missing:

Attribute units [ For number+units attributes, code table controlling units ]

I think the definition also needs a lot of work - what's the use case, why does this exist, why would I use it instead of the alternative?

I'd also like to see some sort of commitment to use, for all CT requests. We spend a lot of time creating things that don't get used, or much used, and seem like they'd have been equally useful in some remarks field.

do we also keep the "part lot count" field in parts?

I still don't think you'll convince the world that counting to almost-always-one requires that much overhead/metadata, but hopefully the previous request will clear things up.

EVERYTHING

Well @campmlc was asking this for part name the other day, so don't leave that out. (And I suspect most users most of the time are happy enough to believe femur without metadata there too.)

disposition

I sort of want to say there's value in having exactly one value here, an item is on loan or it's in the collection, it can't be both, and so having a structure that supports both can't make sense, but I've also been coming around to the idea that if 'both' isn't available users will just find some other way to do whatever crazy thing they're doing, and so maybe there's real value in allowing conflicting assertions: they can be found.

lead to better data over time

If everyone was properly trained and staffed, probably, but even then I think only for fringe-type cases. For most things "one skull" seems entirely sufficient.

ccicero commented 1 year ago

We can work on the definition etc., but my question mirrors what Teresa was saying. If we add 'part attribute lot count', then do we also need part lot count? What would go in that field? The downside of just having lot count for attribute is that not all parts have an attribute (e.g., skin, study). We have talked about moving the type of skin to preservation (study, tanned, flat, etc.) which makes sense, but only if we can upload preservation and associated metadata with the rest of the bulkloader data. But what about just plain 'skeleton' - no preservation or other attribute?

This clearly requires more discussion. Meanwhile, how should I handle the karyo part lot counts (per issue 5390)? I need to get this done asap.

Jegelewicz commented 1 year ago

sort of want to say there's value in having exactly one value here, an item is on loan or it's in the collection, it can't be both,

But it can be both over TIME. Knowing when things left and came back is good, so perhaps our dispositions need a begin and end date...

dustymc commented 1 year ago

But it can be both over TIME

See the other issue, I'm not convinced that any lots of count > 1 CAN be, much less are, adequately managed. I mean, it's better than just cataloging the planet and calling it good, I suppose, but something about pigs and lipstick.....

If you really want to manage parts, provide them the structure so that you can do so - which I believe may inevitably lead to lot_count=1.

campmlc commented 1 year ago

We currently have parts with lot counts that change over time, as lots are sorted, ID'd, and split for loans etc. Theoretical issues aside, we need ways to track this. I propose we add the attribute for those of us who need it to use it, and those who don't are not required to do so. That way we can track information that is currently being lost or stuck in remarks in different formats. I personally believe that all part data are assertions that need metadata, but I'm not going to force that on anyone. But please add this attribute now for those few of us that have to deal with lots. I'm on the road on my phone, so if I missed something in the definition etc. let me know so I can add it. Numeric field, no units.

campmlc commented 1 year ago

And saying lots above count =1 can't be managed in Arctos is a good way to effectively exclude fish, herp, and invert collections, which I don't believe we want to do. As someone who is managing such data, I am requesting this attribute as a means of doing so.

dustymc commented 1 year ago

can't be managed in Arctos

Indistinguishable bits can't effectively be managed by humans, and no technology can change that. This does not provide an identifier for the individual bits, which means those individual bits can't enter into relationships with other nodes, such as loans.

That doesn't mean this can't proceed, but it does need a solid definition and the additional information requested by the template before it can be evaluated.

ccicero commented 1 year ago

@dustymc @campmlc @DerekSikes @Jegelewicz Here's an attempt to finish this request so we can move forward.

Goal Add 'part attribute lot count' as a part attribute

Context Lot counts as part attributes can be loaded after the fact, and can provide metadata to document change over time as part lots are split. Part attribute lot counts also better accommodate multiple representations (part attribute type = 'representation') for a single part type, e.g., different images taken of a part.

Table. https://arctos.database.museum/info/ctDocumentation.cfm?table=ctspecpart_attribute_type

Proposed Value Part attribute lot count

Proposed Definition The total number of individual units contained in a part attribute. Examples include (1) part lots that have different attribute metadata, such as preservations done at different times and/or by different people, (2) multiple representations for a single part type, such as images representing views of the part. Part attribute lot count is different from part lot count, which refers to the total number of individuals for a specific part type such as 'whole organism' whereby all individuals share the same part attribute data (most typically used for fish, herp, and invertebrate collections).

Collection type

Attribute data type Numeric

Attribute units https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcount_units - it is expected that "individuals" would be the selection

Available for Public View Yes

Priority High, needed for bulkload @ccicero @DerekSikes

dustymc commented 1 year ago

No real objections, but I still don't entirely like your definition and so I'm not sure why I'd use this, or how I'd use this in conjunction with lot count, or something.

part lot count, which refers to the total number of individuals

Definitely not. "12 mostly-interchangeable ribs" is the core intended usage.

all individuals share the same part attribute data

Different side of the same coin I think, but this seems like it would also be handy for "12 ribs" (and provides a place for 'according to you on date using fuzzy image').

ccicero commented 1 year ago

@dustymc @Jegelewicz @campmlc Returning to this issue. I think what Dusty is saying is, why do we need both? I'm not sure either. Given this thread, I'm thinking maybe it makes more sense to move lot count from parts to part attribute.

part attribute type = lot count value = numeric

Definition: The total number of individual items in a part or associated attribute. Examples include (1) the number of individual organisms collected together as a lot (typically used for herp, fish, and invertebrate collections), (2) the number of individual units contained in a part, such as 12 ribs, (3) the number of different images that are representations of a specific part.

???

campmlc commented 1 year ago

Yes. We currently can't distinguish between the number of objects in a container vs the number of total objects in a lot. The total would be the original number of lot items sampled from nature, e.g., as a catalog record attribute. Each part, however, may have some subset of this total, and that may change over time as things get moved to new containers or split into different catalog records. We need the part attribute for this, and we could do what we do now with the preservation attribute. Happy also to have a separate attribute for "original lot count" as well.

campmlc commented 1 year ago

@DerekSikes

dustymc commented 1 year ago

why do we need both?

That's always a core concern, good database design demands not making users guess which of the various paths you might have taken.

makes more sense to move lot count from parts to part attribute

Works for me. Reality might be that the two states of lot count are "1" and "it's complicated" and this provides a structurally-appropriate place for explaining WHY you've done something that doesn't let you properly track individual chunks. (And lets the ~99%-of-the-time use case do nothing - unless someone wants to - which might be nice.)

total number of ...

Whatever the user selects as units.

items in a part or associated attribute.

The first yes, the second - not sure what you're trying to do, but attributes can modify only parts, not other attributes.

(3) the number of different images that are representations of a specific part.

Unrelated to attributifying count, I still think maybe you're taking a really complicated path to not-so-great data. (Or maybe you're just delaying the complication, which I do see as a valid use of lots - when someone tries to DO SOMETHING with IMG2 you'll rearrange things so that it can be identified, until then there's no reason to introduce the complexity??)

Happy also to have a separate attribute for "original lot count" as well.

Not unless it's somehow functionally different than all the rest.

DerekSikes commented 1 year ago

If moving lot count to attributes solves problems then I've got nothing against it. However, I still remain perplexed at why we have attribute "individual count" as a cat record attribute and "lot count" as a part attribute, which for my collection means I have to enter the same # in two different fields (and forever keep them identical, manually).

Jegelewicz commented 1 year ago

I still remain perplexed at why we have attribute "individual count" as a cat record attribute and "lot count" as a part attribute

Because in some collections (bivalves) the lot count can be twice the individual count.

dustymc commented 1 year ago

solves problems

That's the question! I'm not sure I'm seeing any of either.

lot count can be twice the individual count.

It's not that subtle, it's just completely unrelated. 987 shell fragments could be one individual, one component of multiple parts that add up to 76543 individuals, or anything in between. MAYBE that's simpler within a collection, and MAYBE I could deploy a bot to keep things synced up if so - that would need careful explaining in a dedicated Issue.

ccicero commented 1 year ago

@DerekSikes We also use attribute 'individual count' for observational records where there is no part (e.g., # birds seen on a point count).

@campmlc I don't think we need a different part attribute of original lot count. Just add a lot count with the original date/determiner, then if split create a new lot count with a different date/determiner. ???
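The suggestion above (one lot count per determination, with a new one added when the lot is split) can be sketched roughly as follows. This is only an illustration in Python, not Arctos code: the `LotCountAttribute` class, its field names, and the sample values are all hypothetical, and it assumes the most recently dated determination is treated as the current count.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LotCountAttribute:
    """One dated 'lot count' assertion on a part (hypothetical structure)."""
    value: int
    determiner: str
    determined_date: date
    remark: str = ""


def current_lot_count(attributes):
    """Return the most recently determined count, or None if there are none."""
    if not attributes:
        return None
    return max(attributes, key=lambda a: a.determined_date).value


# Hypothetical part: counted once originally, then recounted after a split.
history = [
    LotCountAttribute(12, "original cataloger", date(2019, 6, 1), "original count"),
    LotCountAttribute(9, "collection manager", date(2023, 1, 18), "3 split off"),
]

print(current_lot_count(history))
```

Under this sketch the original count is never overwritten; it simply stops being the latest assertion, which is why a separate "original lot count" attribute would not be needed.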

@dustymc Definition revised again:

The total number of individual items in a part or associated part attribute. Examples include (1) the number of individual organisms collected together as a lot (typically used for herp, fish, and invertebrate collections), (2) the number of individual units contained in a part, such as 12 ribs, (3) the number of different images that are representations of a specific part.

For #3, it makes more sense to me to add one karyotype part (JLP data) with multiple different representations, vs multiple karyotype parts that each have one representation. An organism has one karyotype.

If we move lot count to part attribute, then we'll need to be able to bulkload that with the rest of the data.

@Jegelewicz Can we put this on the agenda to discuss at today's CT meeting? This requires discussion and CT committee is the most appropriate to do so.

dustymc commented 1 year ago

An organism has one karyotype.

And one liver, but that doesn't mean we refuse to track which sample went on a loan!

bulkload that with the rest of the data.

https://github.com/ArctosDB/arctos/issues/5193

Jegelewicz commented 1 year ago

or associated part attribute

This won't work? part lot count is also an attribute and if I have three other attributes and three part lot count attributes - which goes with what?

I am OK with lot count being moved to part attributes, but I don't think it is going to help with the issue @ccicero is describing.

Jegelewicz commented 1 year ago

@ccicero sorry we did not get to this in @ArctosDB/arctos-code-table-administrators today, however, I think this goes beyond code tables since we are essentially asking for a structural change in the data. I think we need some process for these kinds of asks as they happen and then get hung up (see #5120 - where we now have both a field and an attribute, creating not-so-normal data?). I am adding this to the AWG agenda, because I think this is first an issue of community process. What are the steps needed to handle these types of requests?

dustymc commented 1 year ago

What are the steps needed to handle these types of requests?

I'd say first a reality-based justification that even I can understand.

campmlc commented 1 year ago

Here is a parasite use case: we need someplace to put that "x number of this species were collected from a host". Part lot count or part attribute lot count doesn't work for the total lot count because 3 worms might be put in a vial with 95% ethanol, one part of one worm might be frozen for DNA, and the remaining 10 worms could be in a larger vial of 80% ethanol for morphology. For this total count, we need the catalog record attribute. But we also need a part attribute to record the count of parts or worms in a container and how this might change over time as one or more worms are withdrawn from containers, moved to a new container (e.g., for a genetic voucher), or recataloged.
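As a rough illustration of the per-container tracking described in this use case, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Part` class, the `transfer` helper, and the vial names are hypothetical and do not reflect the Arctos schema. It assumes counts are appended in chronological order, so the last entry is the current one, and it leaves the total collected from the host as a separate catalog record value rather than deriving it from the parts.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Part:
    """A part in a container with dated count assertions (hypothetical)."""
    name: str
    counts: list = field(default_factory=list)  # (date, count), oldest first

    def record(self, when, count):
        self.counts.append((when, count))

    def latest(self):
        # Assumes entries were appended in chronological order.
        return self.counts[-1][1] if self.counts else 0


def transfer(source, target, n, when):
    """Move n items between parts by adding new dated counts to both."""
    if n > source.latest():
        raise ValueError("cannot withdraw more items than the part holds")
    source.record(when, source.latest() - n)
    target.record(when, target.latest() + n)


# The two ethanol vials from the use case; dates are made up.
vial_95 = Part("whole organism (95% ethanol)")
vial_95.record(date(2022, 7, 1), 3)
vial_80 = Part("whole organism (80% ethanol)")
vial_80.record(date(2022, 7, 1), 10)

# Later, two worms are withdrawn from the larger vial for a genetic voucher.
voucher = Part("whole organism (genetic voucher)")
transfer(vial_80, voucher, 2, date(2023, 1, 19))

print(vial_80.counts)
print(voucher.latest())
```

Each withdrawal leaves the earlier count in place and adds a new dated one, so the history of a container stays visible instead of ending up in remarks.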

ccicero commented 1 year ago

@campmlc that is a good use case. The question is (I think?), do we also keep lot count for the part or move lot count to attributes so we don't have two ways of entering lot count. The latter gives more flexibility to accommodate the parasite use case and others like my multiple representations for a karyotype. @dustymc if we ever end up loaning a particular representation (e.g., 35 mm negative) or a particular parasite vial in Mariel's case, then we could just create a new part to add to the loan.

Before bringing this to the AWG, can those of us who are interested (including @DerekSikes) meet to hash this out so we can get it resolved? Discussions like these are easier in person versus the back-and-forth on github. Work is being held up until we come to a resolution. We can then bring it to the broader AWG for discussion.

campmlc commented 1 year ago

Agree, I'd be happy to meet and discuss. My preference is to move lot count to part attributes, for the metadata over time. But that has data entry implications.

DerekSikes commented 1 year ago

happy to meet to discuss

ccicero commented 1 year ago

The AWG is meeting next Thursday, so can we try for Monday? See if any of these times work: https://www.when2meet.com/?18393677-AArtt

ccicero commented 1 year ago

@campmlc @DerekSikes @dustymc @Jegelewicz It looks like Monday 10am PST works for Mariel, Derek, and me. Let's plan for that. I just sent you all an invitation, it's on the Arctos calendar.

ccicero commented 1 year ago

Brief summary of discussion today with @DerekSikes @dustymc @campmlc

@DerekSikes needs a way of distinguishing whether counts are estimated or not. Solution that we discussed is to add a part attribute value of 'estimated lot count' with value of yes/no. If you want to use this part attribute, select 'yes' if estimated and add relevant metadata (who/when). If the same lot is actually counted, add a new attribute with value of 'no' and those metadata. New issue added: https://github.com/ArctosDB/arctos/issues/5521

@DerekSikes has an issue with keeping lot count and 'individual count' (attribute of cataloged record) in sync. @dustymc may be able to create a bot to magic those values. Note that 'individual count' is not required.

@campmlc still wants a separate part attribute to keep track of different stages in processing parasite specimens such as splitting lots etc. It's a bit unclear why just creating a new catalog record or part doesn't work, but if we create a separate part attribute for that, call it something other than 'lot count' as that will be confusing. One term that was suggested is 'supplemental part count' but open to other suggestions ('processed part count' ?). Need a clear definition that is distinct from lot count (https://handbook.arctosdb.org/documentation/parts.html). This is mostly an issue for parasites it seems. KEEP lot count where it is.

@ccicero realized that either way, she still needs to add multiple karyotype parts, each of which has its own count and representation.
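The 'estimated lot count' convention from the summary above (a yes/no value plus who/when, with a new 'no' added once the lot is actually counted) might look something like this in practice. This is a hedged Python sketch only: the `EstimatedLotCount` class, its field names, and the sample people and dates are hypothetical, and it assumes the most recently dated assertion is the one that applies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EstimatedLotCount:
    """A dated yes/no 'estimated lot count' assertion (hypothetical)."""
    estimated: bool  # True = 'yes', False = 'no'
    determiner: str
    determined_date: date


def is_count_estimated(assertions):
    """Most recent assertion wins; None means nobody has said either way."""
    if not assertions:
        return None
    return max(assertions, key=lambda a: a.determined_date).estimated


# A lot that was estimated at first and actually counted later.
assertions = [
    EstimatedLotCount(True, "field collector", date(2020, 8, 3)),
    EstimatedLotCount(False, "collection manager", date(2023, 1, 23)),
]

print(is_count_estimated(assertions))
```

Keeping both assertions, rather than flipping a single flag, preserves the record that the count was once an estimate and who later replaced it with a real count.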

mkoo commented 1 year ago

Thanks for the summary-- can you all make the AWG meeting? We can add to the agenda for discussion as clearly this is used and means different things to different collections.

Next AWG: this Thursday 1/26 10 AK/ 11 am PT/ 12 MT/ 1 pm CT Please let me know if you are planning to attend

ebraker commented 1 year ago

2023-06-15 code table committee suggests auxiliary_count. Does that work for folks?

Jegelewicz commented 1 year ago

I think we need a summary of what was discussed. Are we still talking about a part attribute or has the placement changed to catalog record attribute?

My gut reaction is that auxiliary count (no underscores in code table terms!) seems odd and this feels like inventory count to me....

Jegelewicz commented 4 months ago

@ccicero this seems to have gone off the rails. Can you summarize in the first comment or close and start over?

dustymc commented 2 months ago

tabling as inactionable