ArctosDB / arctos

Arctos is a museum collections management system
https://arctos.database.museum
Code Table Request - New attribute: individual count #4032

Closed Jegelewicz closed 2 years ago

Jegelewicz commented 3 years ago

Goal: Accurately describe the number of individuals that participated in an occurrence per dwc:individualCount in order to pass appropriate information to aggregators.

Context: https://github.com/ArctosDB/arctos/issues/3908#issuecomment-949698521

Table: https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type

Value: individual count

Definition: The number of individuals represented by this catalog record.

Attribute data type: number+units

Attribute value: integers

Attribute units: individuals

Priority [ Please choose a priority-label to the right. ]

Jegelewicz commented 2 years ago

Added ctcount_units

dustymc commented 2 years ago

This should be caching properly, and filtering out to things that use the cache, now.

Weird data - eg, multiple determinations - will break the cache, so there's a status on /guid/ pages in next release.

Screen Shot 2022-02-17 at 3 47 06 PM

The cache is just pulling the attribute values, so using anything other than 'individuals' for units, or any non-integer value with any units, will also do something interestingly fatal.

There is no default; not providing this attribute will result in individualcount=NULL being sent out with DWC.
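For anyone preparing values before a bulkload, here is a minimal sketch of a pre-check that mirrors the constraints described above: units must be 'individuals', the value must be an integer, and a missing attribute stays NULL. The function name `check_individual_count` is hypothetical and is not part of Arctos; this only illustrates the stated rules, not how the cache itself is built.

```python
def check_individual_count(attribute_value, attribute_units):
    """Return the count as an int, or None when the attribute is absent.

    Hypothetical helper, not Arctos code. Raises ValueError for values
    that the comments above say would break the cache.
    """
    if attribute_value is None:
        # No attribute provided: there is no default, so the count stays NULL.
        return None
    if attribute_units != "individuals":
        raise ValueError(f"units must be 'individuals', got {attribute_units!r}")
    text = str(attribute_value).strip()
    # Accept plain ASCII digits only; rejects '2.5', '-1', '' and similar.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"value must be an integer, got {attribute_value!r}")
    return int(text)
```

Running every row of a bulkload file through a check like this before upload would surface bad units or fractional values early.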

I can help bulkload initial data if necessary, just let me know how to calculate this.

@sharpphyl your collection had nothing, if you'll let me know how to get the initial values I can magic them in.

For collection_cde=Ento collections, the old code was using max(lot_count)

@mlbowser @leet1984 @campmlc @dssikes @mvzhuang @wellerjes @terrymcglynn @lin-fred @Jegelewicz @droberts49 @jrpletch @lmtabak

For collection_cde=Fish collections, the old code was using

 sum(lot_count) 
      where 
        part_name like '%whole%' and
        coll_obj_disposition  not in ('discarded','used up','deaccessioned','missing','transfer of custody') 

@ebraker @byuherpetology @leet1984 @mkoo @ewommack @campmlc @ccicero @mvzhuang @wellerjes @lin-fred @Jegelewicz @gradyjt @jandreslopez @droberts49 @zmsch @jrpletch @lmtabak
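The two legacy calculations quoted above can be sketched in Python for comparison. This is an illustration under assumptions, not the old Arctos code: each part is represented as a plain dict with `part_name`, `lot_count` and `coll_obj_disposition` keys, and the function names are hypothetical.

```python
# Dispositions excluded by the old Fish calculation, as listed above.
EXCLUDED_DISPOSITIONS = {
    "discarded",
    "used up",
    "deaccessioned",
    "missing",
    "transfer of custody",
}


def ento_individual_count(parts):
    """Old Ento approach: the largest lot_count among the record's parts."""
    counts = [p["lot_count"] for p in parts]
    return max(counts) if counts else None


def fish_individual_count(parts):
    """Old Fish approach: sum lot_count over 'whole' parts still on hand."""
    counts = [
        p["lot_count"]
        for p in parts
        if "whole" in p["part_name"]  # same idea as: like '%whole%'
        # SQL "not in" drops rows with a NULL disposition, so do the same here.
        and p["coll_obj_disposition"] is not None
        and p["coll_obj_disposition"] not in EXCLUDED_DISPOSITIONS
    ]
    # Like SQL sum(), return None (NULL) when no part qualifies.
    return sum(counts) if counts else None
```

Both functions return `None` when nothing qualifies, which corresponds to individualcount=NULL going out with DWC.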

sharpphyl commented 2 years ago

@sharpphyl your collection had nothing, if you'll let me know how to get the initial values I can magic them in.

The number of individuals in a catalog record for DMNS:Inv is in the field Qty under Parts.

Catalog Record Count

If a record has both a shell and an operculum, each part will appear separately but the number of individuals doesn't increase.

Screen Shot 2022-02-18 at 8 11 54 AM

Thanks for magicing them in.

Is this where we add the individual count attribute during data entry?

Screen Shot 2022-02-18 at 8 16 59 AM

I think we still add the count in the Qty field too so it shows as a "shell" or other part. Does that mess up anything?

dustymc commented 2 years ago

@sharpphyl I think that means you want sum(lot_count): 1 in your first screenshot, 2 in the second?

You should definitely continue to provide lot count - it's a completely different thing (and much more important, in my view).

Yep that's one place to edit Attributes.

@Jegelewicz the frontmatter on the parts doc page seems to be mangled and it's claiming you edited - fix?

lin-fred commented 2 years ago

Thanks all for updating this. We don't have a good inventory yet of our specimens, it's all legacy numbers at the moment. I will keep this in mind for when we do an inventory!

dustymc commented 2 years ago

@sharpphyl some data for your review:


create table temp_dmnsinvic as select
    guid,
    'individual count' as attribute,
    sum(lot_count)::text as attribute_value,
    'individuals' as attribute_units,
    'Phyllis Sharp' as determiner
from
    flat
    inner join specimen_part on flat.collection_object_id=specimen_part.derived_from_cat_item
    inner join coll_object on specimen_part.collection_object_id=coll_object.collection_object_id
where
    flat.guid_prefix='DMNS:Inv'
group by
    guid
;

temp_dmnsinvic.csv.zip

Let me know if it is as expected.

sharpphyl commented 2 years ago

@sharpphyl I think that means you want sum(lot_count) - 1 in your first screenshot, 2 in the second?

No, both of these records have only one organism. The first has only the shell and the second has both the shell and its operculum. There may be a few records where these aren't the same Qty, but I can adjust them manually. So if they are the same, that's the number of individuals in the record.

I looked at your csv and found specimens (e.g. https://arctos.database.museum/guid/DMNS:Inv:25570) that show 2 organisms where there is only one. Perhaps we should only count the number of shells if there is both a shell and an operculum and they have the same Qty. If they are different, we would use the larger quantity as the number of organisms.

I checked a few records where the part name is exoskeleton, test or whole organism and I didn't find any issues.

dustymc commented 2 years ago

@sharpphyl I can't quite figure out how to interpret that.

Maybe just max(lot_count) rather than sum(lot_count) works for a first pass? That seems to work for the few examples so far.

Or below are your unique part name combos - maybe we can set this up as

when partaggregate='shell|whole organism' then do_some_thing
when partaggregate='operculum|shell|whole organism' then do_something_else
when.....

??
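To make the sum versus max difference concrete, here is a small sketch using a hypothetical record: one organism catalogued as a shell plus its operculum. It also shows one way a part-name key like `partaggregate` could be built; sorting the names and joining them with `|` is an assumption based on how the combinations are displayed in this thread.

```python
def part_aggregate(part_names):
    """Join sorted part names with '|', keeping duplicates."""
    return "|".join(sorted(part_names))


# Hypothetical record: (part name, Qty) pairs for a single organism.
parts = [("shell", 1), ("operculum", 1)]

key = part_aggregate(name for name, _ in parts)
total = sum(qty for _, qty in parts)    # counts the operculum as a second individual
largest = max(qty for _, qty in parts)  # matches the one organism actually present

print(key, total, largest)  # operculum|shell 2 1
```

Max gives the expected answer for this shell plus operculum case, but it would undercount a record holding two separately catalogued shells, which is why per-combination handling is the alternative.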

Note that these are determinations, you can adjust them as necessary, and both bulk loaders and unloaders are available.

               p                
--------------------------------
 shell|whole organism
 shell
 test
 operculum|shell|shell
 shell|shell
 test|whole organism
 whole organism
 shell|test
 operculum|shell|whole organism
 operculum|shell
 egg
 exoskeleton|shell
 egg case|operculum|shell
 operculum
 egg case|shell
 exoskeleton
 egg case
(17 rows)

Jegelewicz commented 2 years ago

the frontmatter on the parts doc page seems to be mangled and it's claiming you edited - fix?

Give me a pointer so I can figure out where to go look?

dustymc commented 2 years ago

There's no sidenav thingee

Jegelewicz commented 2 years ago

No sidenav thingee where?

dustymc commented 2 years ago

https://handbook.arctosdb.org/documentation/parts.html

sharpphyl commented 2 years ago

Let's see if this helps.

Rule 1 - if there is only one part, use the value in Part Qty
Rule 2 - if there is a shell and an operculum, use the value in Part Qty for shell only. Do not add the operculum Qty.
Rule 3 - sum certain part Qty values as listed below

shell - use Qty
test - use Qty
exoskeleton - use Qty
egg case - use Qty
egg - I changed this to egg case
whole organism - use Qty
operculum - use Qty

operculum|shell - use the shell Qty only
operculum|shell|shell - sum the shell Qtys only
operculum|shell|whole organism - sum the shell and whole organism Qtys only

egg case|operculum|shell - use the shell Qty only - I only found one record for this - https://arctos.database.museum/guid/DMNS:Inv:22493 Are there others?

shell|whole organism - sum the Qtys
shell|shell - sum the Qtys
test|whole organism - sum the Qtys
shell|test - sum the Qtys
exoskeleton|shell - sum the Qtys
egg case|shell - sum the Qtys
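The rules above can be written down as a lookup table, sketched here in Python. This is an illustration, not the query that was actually run: the names `COUNTED_PARTS`, `part_aggregate` and `individual_count` are hypothetical, parts are assumed to arrive as (part name, Qty) pairs, and `egg` is omitted because those parts were changed to `egg case`. Combinations not listed return `None` so they can be reviewed by hand.

```python
# Part-name combination -> part names whose Qty counts toward individuals.
COUNTED_PARTS = {
    "shell": {"shell"},
    "test": {"test"},
    "exoskeleton": {"exoskeleton"},
    "egg case": {"egg case"},
    "whole organism": {"whole organism"},
    "operculum": {"operculum"},
    "operculum|shell": {"shell"},
    "operculum|shell|shell": {"shell"},
    "operculum|shell|whole organism": {"shell", "whole organism"},
    "egg case|operculum|shell": {"shell"},
    "shell|whole organism": {"shell", "whole organism"},
    "shell|shell": {"shell"},
    "test|whole organism": {"test", "whole organism"},
    "shell|test": {"shell", "test"},
    "exoskeleton|shell": {"exoskeleton", "shell"},
    "egg case|shell": {"egg case", "shell"},
}


def part_aggregate(parts):
    """Build the combination key: sorted part names joined by '|'."""
    return "|".join(sorted(name for name, _ in parts))


def individual_count(parts):
    """Sum Qty over the counted parts; None for combinations with no rule."""
    counted = COUNTED_PARTS.get(part_aggregate(parts))
    if counted is None:
        return None
    return sum(qty for name, qty in parts if name in counted)
```

For example, `individual_count([("shell", 1), ("operculum", 1)])` gives 1, while `individual_count([("egg case", 4), ("shell", 2)])` gives 6. Keeping each combination explicit avoids guessing a general rule, since egg case is summed with shell in one combination and ignored in another.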

campmlc commented 2 years ago

Apologies for not being able to follow this more closely and coming in with a stupid question, but I thought that individual count could be a field to use to record all the individuals in a lot, regardless of number of parts? And ideally could be used for other taxa, eg fish, tadpoles, parasites?

On Sat, Feb 19, 2022, 8:51 AM Phyllis Sharp @.***> wrote:

Let's see if this helps.

Rule 1 - if there is only one part, use the value in Part Qty
Rule 2 - if there is a shell and an operculum, use the value in Part Qty for "shell" as the default. Do not add the operculum Qty.
Rule 3 - sum certain part Qty values as listed below

shell - use Qty
test - use Qty
exoskeleton - use Qty
egg case - use Qty
egg - I changed this to egg case
whole organism - use Qty
operculum - use Qty

operculum|shell - use the shell Qty only
operculum|shell|shell - sum the shell Qtys only
operculum|shell|whole organism - sum the shell and whole organism Qtys only
egg case|operculum|shell - sum the egg case and shell Qtys only

shell|whole organism - sum the Qtys
shell|shell - sum the Qtys
test|whole organism - sum the Qtys
shell|test - sum the Qtys
exoskeleton|shell - sum the Qtys
egg case|shell - sum the Qtys

dustymc commented 2 years ago

@sharpphyl how's this?

temp_dmnsinvic.csv.zip

sharpphyl commented 2 years ago

I corrected the organism count on five odd records that didn't fit my "rules." Highlighted in yellow. I also added 10 records at the very bottom uploaded since you ran this report. If it looks ok to you, let's do it.

temp_dmnsinvic-2 - PMS edits.csv

Thanks for making magic.

Jegelewicz commented 2 years ago

I thought that individual count could be a field to use to record all the individuals in a lot, regardless of number of parts? And ideally could be used for other taxa, eg fish, tadpoles, parasites?

That's exactly what it is.

leet1984 commented 2 years ago

Hi Teresa,

Attached is the mammal catalog. I am sending it to you because of the misnaming of ACUNHC 1.

All the best Tom

On Mon, Feb 21, 2022 at 10:46 AM Teresa Mayfield-Meyer < @.***> wrote:

I thought that individual count could be a field to use to record all the individuals in a lot, regardless of number of parts? And ideally could be used for other taxa, eg fish, tadpoles, parasites?

That's exactly what it is.
