Closed Jegelewicz closed 2 years ago
@acdoll @sharpphyl you may want to weigh in.
No real objections, but the documentation would need to be clear on what this is (some random thing that's never going to get updated?!) and what it can do (within Arctos: nothing that I can see).
We are definitely in favor of this.
what it can do
Currently, the number of individual organisms in a lot is captured in 'lot count' - this is not passed on to the aggregators (nor should it be; per documentation lot count can describe the number of vertebrae in a box). E.g., https://arctos.database.museum/guid/DMNS:Inv:10020 has two shells in the lot (i.e. 2 individuals). But the GBIF record only reports 1 individual:
I agree it's a useful concept, I just don't think this is a suitable place for it.
and/or
I don't think either one of those scenarios is approachable by itself, much less in the combinations that would come to exist in an active collection.
I think this would be much better as a catalog record attribute, even if that's not fully capable of dealing with the data in some fringe cases. (And it's pretty easy to avoid those situations if this kind of information is important.)
record has 17 events for some reason (eg really great georeferencing history) you find another individual hiding in the back of the drawer, so you need to go update all 17 events
this might get used here - #4033
Given that events end up as occurrences, I think this makes sense here. It is either that or as part of "specimen event". It is NOT in any way related to what parts are currently in, or have been in, the collection.
I think this would be much better as a catalog record attribute, even if that's not fully capable of dealing with the data in some fringe cases. (And it's pretty easy to avoid those situations if this kind of information is important.)
No because some records include actual distinct events that may or may not be about the same number of individuals.
you have 20 lots from the same space/time you have to manage 20 events because the lots all contain a different number of individuals
If you have 20 lots with a different number of individuals from the same event then they all participated in the event and adding one count of individuals that is the sum of all the lot counts to that event should suffice?
The thing is - no one HAS to use this and if it isn't there, we can just pass "1" as a default to dwc:individualCount. That seems potentially less worse than what we are doing now?
some records include actual distinct events that may or may not be about the same number of individuals.
I'm not sure that anyone who's dumping stuff into a lot over time is going to much care about this....
pass "1" as a default...less worse
"We don't have that information" is kinda always a defensible position. "... and so we've made something up!", not so much.
But we are making stuff up now!
Andy has already described the problem for our collection. We do not have multiple collecting events in one record, so I can't speak to that. The difference between one individual and two individuals that Andy pointed out could be meaningful to a researcher. A stronger case can be made for micromollusks, which can occur in large numbers; those counts could be important for assessing the health of the population, etc. DMNS:Inv:29549 of Caecum bipartitum has 276 shells (in a tiny gel cap).
GBIF shows one individual.
As long as the data flows to GBIF as "Individual Count" it doesn't matter to me where I put the number of specimens in the Arctos catalog record.
We support Teresa's recommendation to include DWC Individual Count so that the aggregator records reflect the number of individuals found at that collecting event.
@Jegelewicz Just checking when the Code Table Management team will meet to discuss this. I don't want this issue to drift into oblivion.
If you have 20 lots with a different number of individuals from the same event then they all participated in the event and adding one count of individuals that is the sum of all the lot counts to that event should suffice?
That is not reflective of how the data are structured.
@dustymc is there a solution you can suggest? We do need this resolved.
https://github.com/ArctosDB/arctos/issues/4032#issuecomment-949750557
catalog record attribute
conceptually count doesn't belong in the collecting event. I agree with Dusty that it is an attribute of the cataloged record.
If it's not getting passed on in the DwCA, then that's a mapping issue, not a CT or new thing for collecting event (which is location+date:time)
If it's not getting passed on in the DwCA, then that's a mapping issue
We don't actually record this in a meaningful way anywhere, "part lot count" is not a usable value since we may have 3 parts from a single individual in a given catalog record.
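To illustrate the point with made-up numbers, here is a minimal sketch, assuming a hypothetical catalog record with three parts taken from one individual. The `Part` type and the part names are illustrative only and are not the Arctos schema.

```python
from dataclasses import dataclass


@dataclass
class Part:
    name: str
    lot_count: int


# Hypothetical catalog record: one individual, three parts.
parts = [Part("skull", 1), Part("skin", 1), Part("tissue", 1)]

# Summing part lot counts treats every part as a separate individual.
summed_lot_count = sum(p.lot_count for p in parts)

# The number actually wanted has to be recorded separately.
recorded_individuals = 1

print(summed_lot_count, recorded_individuals)
```

The sum counts parts, not organisms, which is why it cannot stand in for an individual count.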
conceptually count doesn't belong in the collecting event
Probably not - since multiple taxa can share a collection event, but it also does not belong as part of the catalog record either. The individual count expected at the aggregators is "The number of individuals present at the time of the Occurrence." What we are passing as "occurrences" are actually "specimen" events (please see https://github.com/ArctosDB/arctos/issues/4036 because our terminology is all over the place and is also problematic). As discussed recently, using "specimen" events as an occurrence is problematic because we end up reporting two occurrences when there is only one. Here is an example:
https://arctos.database.museum/guid/DMNS:Mamm:12344 is from the same individual/collection event as https://arctos.database.museum/guid/MSB:Mamm:233616
BUT they are passed to the aggregators as separate events/individuals
https://www.gbif.org/occurrence/1145096812 and https://www.gbif.org/occurrence/1145267756
Careful consideration of associated occurrences and organism ID will suss this out, but it is a shame that we pass different organism IDs for each of these records. Even if we cleaned up our act and got them into the same collecting event, we would still be sending conflicting information.
Anyhoo. It is probably true that we have no good way to say how many individuals of a particular taxon took part in any given OCCURRENCE (collecting or observation event). Ideas are welcome because sending 1 when there are 276 is a bit misleading.
We don't actually record this in a meaningful way anywhere,
Correct - I magic it (poorly, probably) for some special circumstances, and there's some legacy not-quite-data from previous attempts of that hanging around. If we want to pass something meaningful on then we need to record it. (And I can magic - probably still poorly! - the initial values if needed.)
What we are passing as "occurrences" are actually "specimen" events
No, we are splitting catalog records at collecting events in an attempt to magick Occurrences out of the aether. What we are passing as Occurrences does not exist in Arctos; that's just not what gets cataloged.
What we are passing as Occurrences does not exist in Arctos; that's just not what gets cataloged.
Mostly - but I think some records with observation type events are pretty close.
we are splitting catalog records at collecting events
I think we are splitting them at "specimen" events - thus the seid?
Honestly the quoted statement is true for all physical collections in the data aggregators, but after looking at this, I do think there are some things we could be doing better.
So I guess I can go along with making this a collection object attribute even though it isn't really going to solve the whole problem. See updated request.
some records ... are pretty close.
Most are.
them at "specimen" events
Same thing from the perspective of a single catalog record.
think there are some things we could be doing better
Always.
isn't really going to solve the whole problem
Nope, there are some ragged edges, but I think it does what the collections who seem to care about this want done without adding too much complexity or being too hard to understand in a decade or so.
The number of individuals represented by this catalog record.
That doesn't seem quite right, or complete, or something, but I'm struggling to come up with anything better. @sharpphyl help??
I fully support adding this as a collection object attribute. Is there some way we can represent count = unknown in a way that GBIF would ingest correctly?
@dustymc how does "INDIVIDUALCOUNT" get calculated?
INDIVIDUALCOUNT individualCount,
OK - so we basically pass 1 for everything except Fish collections?
No, I'm not sure where you're seeing that? (And this issue exists because whatever we're doing doesn't work so I'm not sure why it matters?)
MSB Para has lot count, and I use this field; default value of unknown = 1, but that's not great. I would hope we are passing values to GBIF if they differ from 1.
I'm not sure where you're seeing that?
So all I know is this:
for fish, INDIVIDUALCOUNT is sum of select part's lot count
update flat set (INDIVIDUALCOUNT)=(
    select
        sum(lot_count)
    from
        specimen_part
        inner join coll_object on specimen_part.collection_object_id=coll_object.collection_object_id
    where
        part_name like '%whole%' and
        coll_obj_disposition not in ('discarded','used up','deaccessioned','missing','transfer of custody') and
        specimen_part.derived_from_cat_item=cid
) where collection_cde='Fish' and collection_object_id=cid;
What do we do for everything else?
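For readers who do not read SQL, the fish rule quoted above can be sketched in Python. This only illustrates the logic, assuming parts are available as plain dictionaries with string part names; the function name, the dictionary keys, and the sample dispositions are hypothetical, and the real update runs in the database.

```python
# Dispositions excluded by the quoted SQL.
EXCLUDED_DISPOSITIONS = {
    "discarded",
    "used up",
    "deaccessioned",
    "missing",
    "transfer of custody",
}


def fish_individual_count(parts):
    """Sum lot_count over 'whole' parts that are still held.

    Returns None when no part qualifies, mirroring SQL SUM over zero rows.
    """
    counts = [
        p["lot_count"]
        for p in parts
        if "whole" in p["part_name"]
        # SQL NOT IN is not true for NULL, so a missing disposition is skipped.
        and p["disposition"] is not None
        and p["disposition"] not in EXCLUDED_DISPOSITIONS
        # SQL SUM ignores NULL values.
        and p["lot_count"] is not None
    ]
    return sum(counts) if counts else None


# Hypothetical sample parts for one fish catalog record.
sample_parts = [
    {"part_name": "whole organism", "lot_count": 5, "disposition": "in collection"},
    {"part_name": "whole organism", "lot_count": 2, "disposition": "used up"},
    {"part_name": "tissue", "lot_count": 3, "disposition": "in collection"},
]
print(fish_individual_count(sample_parts))
```

Note that the result changes when a part is marked used up, which is the behavior questioned later in the thread.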
I'm struggling to come up with anything better. @sharpphyl help??
I know our records and individual counts are quite simple in comparison to other collections. For us the number of individuals in the catalog record and the number of individuals in the collecting event are the same. Some records do have multiple parts (shell and opercula) and each part is entered in Parts and the count in Qty. For our purposes, the location of a new attribute for the number of individuals represented in the record could be part of the occurrence or the catalog record as long as it maps to dwc:individualCount in GBIF and other aggregators.
Am I answering the right question, @dustymc ?
I'm looking for an Arctos definition - "The number of individuals represented by this catalog record." is the current winner yet seems somehow lacking.
For us the number of individuals in the catalog record and the number of individuals in the collecting event are the same.
Nope....
part of the occurrence
For the sake of clarity: There is no such thing in Arctos.
I think that the number of individuals represented by this catalog record is the only defensible definition if this is a catalog record attribute.
only defensible definition
But that's not at all in line with the DWC definition ("there were 10,000 individuals present at the time of the Occurrence, we caught three"), nor what the only user of this in Arctos has done ("...and two of those have since been used up, so 1").
https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type#abundance seems to fit the DWC idea, but DWC seems to want an incompatible datatype.
But that's not at all in line with the DWC definition ("there were 10,000 individuals present at the time of the Occurrence, we caught three"),
No argument from me there - but we don't really send occurrences do we?
nor what the only user of this in Arctos has done ("...and two of those have since been used up, so 1").
That is part of what this term is meant to correct - so even if a part is removed or "used up" the original number from our interpretation of "occurrence" is still there.
We don't have to send anything if we don't know or aren't sure.
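One way to read this, sketched under assumptions: when building an export row, leave `individualCount` out entirely unless a value was recorded, rather than filling in 1. The function name and the `guid` key are hypothetical; only `individualCount` is a Darwin Core term.

```python
def export_row(guid, individual_count=None):
    """Build a dict for one exported record (hypothetical shape)."""
    row = {"guid": guid}
    # Only emit the term when a count was actually recorded.
    if individual_count is not None:
        row["individualCount"] = individual_count
    return row


# A record with a recorded count, using the 276 shells mentioned earlier.
known = export_row("DMNS:Inv:29549", 276)

# A record with no recorded count: the term is simply absent.
unknown = export_row("EXAMPLE:Inv:1")

print(known, unknown)
```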
I like the suggestion in the tdwg thread to use organism quantity and quantity type. https://dwc.tdwg.org/list/#dwc_organismQuantity
@Nicole-Ridgwell-NMMNHS I have been searching all over for that comment! How would you suggest we implement the organism/type thing?
Also leaving the link to the comment here - https://github.com/tdwg/dwc/issues/285#issuecomment-965733231
Also, for some reason the links to the main DwC site never seem to work - so here for reference
https://dwc.tdwg.org/terms/#dwc:organismQuantity
A number or enumeration value for the quantity of organisms.
https://dwc.tdwg.org/terms/#dwc:organismQuantityType
The type of quantification system used for the quantity of organisms.
Examples: 27 (organismQuantity) with individuals (organismQuantityType). 12.5 (organismQuantity) with %biomass (organismQuantityType). r (organismQuantity) with BraunBlanquetScale (organismQuantityType).
If organism quantity is added as a specimen attribute, could we map quantity type to attribute units and create a new units table?
And attribute remark could help differentiate when the identification on a record is A and B.
I was looking at changing this issue or creating a new one, but I feel like what we have here is fine, except instead of passing the value in this term to dwc:individualCount, we would pass it to dwc:organismQuantity and we would pass the units value to dwc:organismQuantityType.
Does this sound like a good solution?
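A minimal sketch of that mapping, assuming the attribute is stored as a value plus units. The function name and argument names are hypothetical; the two output keys are the Darwin Core terms named above.

```python
def to_dwc_quantity(attribute_value, attribute_units):
    """Map an attribute value and its units onto the two DwC terms."""
    return {
        "organismQuantity": attribute_value,
        "organismQuantityType": attribute_units,
    }


# Mirrors the DwC example quoted above: 27 with individuals.
mapped = to_dwc_quantity(27, "individuals")
print(mapped)
```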
If we want to call it individual count, would we default the quantity type/units value to individuals?
Organism quantity seems like a more flexible term - do we need that flexibility or are things like %biomass covered by other attributes?
new units table
https://arctos.database.museum/info/ctDocumentation.cfm?table=ctcount_units
default
Nothing currently has a default, but "just UI" (ish, probably).
other attributes
https://github.com/ArctosDB/arctos/issues/4032#issuecomment-965381448 - if those aren't the same, the definitions would need to reflect the differences.
instead of passing the value in this term to dwc:individualCount, we would pass it to dwc:organismQuantity and we would pass the units value to dwc:organismQuantityType
I'm not sure what the difference is between individual count and organism quantity, but if that solves the problem for our invert collection and makes sense for everyone else, I'm fine with it. Thanks!
I think that the number of individuals represented by this catalog record is the only defensible definition if this is a catalog record attribute.
Yes, we have no idea how many other individuals were present at the collecting event. We only know how many (shells) were collected for our catalog record.
instead of passing the value in this term to dwc:individualCount, we would pass it to dwc:organismQuantity and we would pass the units value to dwc:organismQuantityType
Is this something that can be done for an individual (probably invertebrate) collection or are we waiting for AWG approval or an update or something else? I'd like to take action so that the number of organisms represented in our catalog records (and recorded under Qty) shows up in GBIF instead of always being 1.
What do we need to do to keep this issue moving forward?
keep this issue moving forward
Just a focused discussion of how to store the data. Here's a shot, mostly copy-pasta of the edited original, please edit/replace/whatever as necessary.
Goal: Record number of individuals cataloged.
Context:
Table: https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type
Value: individual count
Definition: The number of individuals represented by the catalog record.
Attribute data type: number+units
Attribute value: integers
Attribute units: ctcount_units
(Then, we can talk about DWC Mapping, but we should also make sure the data we set up are compatible with the target so a little chicken-n-eggy, I'm still thinking https://dwc.tdwg.org/terms/#dwc:individualCount.)
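The proposed attribute (integer value, units drawn from ctcount_units) could be checked along these lines. This is a sketch under assumptions: the class name is hypothetical, `ALLOWED_UNITS` is a stand-in for whatever ctcount_units actually contains, and rejecting negative numbers is an assumption not stated in the proposal.

```python
from dataclasses import dataclass

# Assumed stand-in for the values in ctcount_units; the real table may differ.
ALLOWED_UNITS = {"individuals"}


@dataclass(frozen=True)
class IndividualCount:
    value: int
    units: str

    def __post_init__(self):
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(self.value, bool) or not isinstance(self.value, int):
            raise TypeError("value must be an integer")
        # Assumption: a count of individuals cannot be negative.
        if self.value < 0:
            raise ValueError("value must not be negative")
        if self.units not in ALLOWED_UNITS:
            raise ValueError(f"unknown units: {self.units!r}")


count = IndividualCount(5, "individuals")
print(count)
```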
@dustymc So you would add ctcount_units as an attribute in the code table attribute_type for Inv and other collections as needed. For example, we would enter 5 as the Attribute value and select "individuals" as the Attribute units.
Would you be able to magic the number currently in the Qty field into this new attribute? It sounds like we would still have the Qty field for the parts (shell, operculum, etc.). We do have some records with a different quantity of two parts which we could enter manually.
As for the DWC mapping, my only concern is that it map to GBIF and iDigBio "Individual Count" which is the field that currently always shows 1.
Sounds good for our collection. @acdoll Any concerns?
@Jegelewicz Will the Code Table Committee be able to take up this topic at their next meeting?
Table: https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type
Value: individual count
Definition: The number of individuals represented by the catalog record.
Attribute data type: number+units
Attribute value: integers
Attribute units: ctcount_units
Definition: The number of individual organisms represented by the catalog record.
Above from AWG Issues meeting. Are we ready to add this with the definition above or is there still "lot" confusion? @acdoll @mkoo @sharpphyl @ccicero @campmlc
Sounds good to me.
Code Table Committee says add!
Need to map to individualCount for aggregators.
Added to code table.
Goal: Accurately describe the number of individuals that participated in an occurrence per dwc:individualCount in order to pass appropriate information to aggregators.
Context: https://github.com/ArctosDB/arctos/issues/3908#issuecomment-949698521
Table: https://arctos.database.museum/info/ctDocumentation.cfm?table=ctattribute_type
Value: individual count
Definition: The number of individuals represented by this catalog record.
Attribute data type: number+units
Attribute value: integers
Attribute units: individuals