jackba / arctos

Automatically exported from code.google.com/p/arctos

modifications to Nature of ID and Basis of Citation #515

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Distilled from a long email thread 18 Jan 2012:

Rename "Basis of Citation" to "Citation/Type Status" on Specimen Search page.

Remove "type specimen" as a value from Nature of ID.

Modify Nature of ID so values represent something about HOW a specimen was 
ID'd, e.g., morphology, molecules, plumage, song, etc. Figure out what to do 
with existing values that don't fit into new categories.

Original issue reported on code.google.com by carla...@gmail.com on 19 Jan 2012 at 1:16

GoogleCodeExporter commented 9 years ago
How do we do the last part? Read all the papers?  How WAS a type specimen IDed?

Original comment by gordon.jarrell on 19 Jan 2012 at 3:50

GoogleCodeExporter commented 9 years ago
I have no idea. We may need to have a category like "unknown" and use that for 
legacy data as well as new specimens where we don't know the nature. Actually, 
it's going to be hard to know how something was ID'd to begin with. E.g., field 
id - presumably that's based on phenotype? So maybe we need categories like:

phenotype (presumably includes field IDs)
molecular data
vocalization
geographic distribution
unknown

other???

This requires discussion with the broader Arctos group.

Original comment by carla...@gmail.com on 19 Jan 2012 at 3:33

GoogleCodeExporter commented 9 years ago
There are 2509 NoID="type specimen" specimens in 590 publications. 1160 of them 
were published before 1980 and are almost certainly morphology. All of the 
plants are almost certainly morphology. Curators are likely to have an idea of 
the content of post-1980 type-declaring publications. We keep insisting this 
stuff is important, so maybe we should spend the day tracking down and reading 
the few remaining publications. We'll make the ID _sensu_ the publication, so 
it shouldn't be very hard for anyone who might care to let us know when we get 
it wrong.

There are 105 specimens of NoID="type specimen" and no citation. I have no idea 
how to handle these. Change them to "unknown" I suppose.

COLLECTION                      COUNT(*)
MSB Parasites                   10
MVZ Herps                       61
HWML Para                       6
UAM Fishes                      1
UAM Invertebrates               5
UAM Herbarium Vascular Plants   16
MSB Mammals                     1
DMNS Mammals                    5
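
As a quick check on the figures above, the per-collection rows can be summed to confirm that they account for all 105 uncited specimens. This is only an arithmetic sketch: the variable names are invented for illustration, and nothing here queries Arctos.

```python
# Counts copied from the table above: NoID="type specimen" with no citation.
uncited_by_collection = {
    "MSB Parasites": 10,
    "MVZ Herps": 61,
    "HWML Para": 6,
    "UAM Fishes": 1,
    "UAM Invertebrates": 5,
    "UAM Herbarium Vascular Plants": 16,
    "MSB Mammals": 1,
    "DMNS Mammals": 5,
}

# Total uncited specimens across all collections.
total_uncited = sum(uncited_by_collection.values())

# Figures quoted in the comment: 2509 cited type-specimen NoIDs,
# of which 1160 were published before 1980.
cited_type_specimens = 2509
published_before_1980 = 1160

# Cited specimens that the "pre-1980 means morphology" rule does not cover.
remaining_cited = cited_type_specimens - published_before_1980

print(total_uncited, remaining_cited)
```

The remainder includes the plants, which the comment also treats as morphology, so the number of publications that would actually need reading is presumably smaller.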

If we are resolved on this, we should document it and publish a warning to the 
group to no longer use "type specimen" as NoID.

If we are to consider other values while we're here, these are my suggestions:

ID of kin-->retain
ID to species group-->unknown (There is no value here - a taxon name is a 
species group or not. Anything past that is inherent in the identification.)
erroneous citation-->retain
expert-->unknown
field-->retain
geographic distribution-->I have absolutely no idea. The circularity of museums 
using this should be far too embarrassing for anyone to use, but they do anyway.
legacy-->unknown
molecular data-->retain
published referral-->unknown
student-->unknown
taxonomic revision-->retain
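
The suggestions above amount to a lookup table from old values to new ones. A minimal sketch of that table follows, assuming the migration would be a simple one-to-one value replacement; `PROPOSED_MIGRATION` and `migrate_noid` are hypothetical names, not anything that exists in Arctos, and "geographic distribution" is left undecided because no target is proposed for it.

```python
from typing import Optional

# Proposed migration of existing Nature of ID values, as listed above.
# A value mapped to itself is retained; None marks a case with no decision yet.
PROPOSED_MIGRATION: dict[str, Optional[str]] = {
    "ID of kin": "ID of kin",
    "ID to species group": "unknown",
    "erroneous citation": "erroneous citation",
    "expert": "unknown",
    "field": "field",
    "geographic distribution": None,  # no target proposed
    "legacy": "unknown",
    "molecular data": "molecular data",
    "published referral": "unknown",
    "student": "unknown",
    "taxonomic revision": "taxonomic revision",
}


def migrate_noid(old_value: str) -> str:
    """Return the proposed new NoID, raising if the value is unlisted or undecided."""
    if old_value not in PROPOSED_MIGRATION:
        raise KeyError(f"no migration rule for {old_value!r}")
    new_value = PROPOSED_MIGRATION[old_value]
    if new_value is None:
        raise ValueError(f"migration for {old_value!r} is still undecided")
    return new_value
```

Raising on the undecided case, rather than silently defaulting to "unknown", keeps an unresolved question from being settled by accident during a bulk update.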

Original comment by dust...@gmail.com on 19 Jan 2012 at 3:41

GoogleCodeExporter commented 9 years ago
See http://goo.gl/PSfCO

Original comment by dust...@gmail.com on 19 Jan 2012 at 5:05

GoogleCodeExporter commented 9 years ago
I still like the idea of the following sorts of categories for nature of id 
(revised a bit from above suggestion):

phenotype
genotype
karyotype
vocalization
geographic distribution
ID of kin
unknown

re: taxonomic revision - that should be based on something, right? phenotype or 
genotype or both? Can we have >1 nature of Id for a specimen, e.g., phenotype 
AND vocalization? phenotype AND genotype, etc?

re: field ID: again, that should be based on something like phenotype or 
vocalization. something character-based. Realistically, though, it would be 
hard to know the characters on which an ID is based because that's usually not 
documented in fieldnotes. 

re: geographic distribution: for breeding birds, that is the main way that we 
ID specimens to subspecies. Too hard for many to Id from phenotype since 
characters are often clinal, so we go with named subspecies and their published 
distributions, using locality data to narrow down subspecies ids. 

ID of kin-->retain
ID to species group-->unknown
erroneous citation-->retain
expert-->unknown
field-->retain ??? i think it's better to have as phenotype etc., but how do we 
deal with existing values? 
geographic distribution-->retain
legacy-->unknown
molecular data-->retain but change to "genotype"
published referral-->unknown
student-->unknown
taxonomic revision-->retain
phenotype ---> add
karyotype ---> add
vocalization ---> add

???

Original comment by carla...@gmail.com on 19 Jan 2012 at 5:05

GoogleCodeExporter commented 9 years ago
Coming to this late (thanks for issue-posting, Carla); I'm fine with the first 
part but this:

"Modify Nature of ID so values represent something about HOW a specimen was 
ID'd, e.g., morphology, molecules, plumage, song, etc. Figure out what to do 
with existing values that don't fit into new categories."

doesn't address the original issues, and it is impractical, as Gordon points 
out (no one is going to read all the papers, and often there are multiple 
criteria for an ID, e.g., both phenotypic and morphological analyses plus some 
other eco patterns), so a simple "based on reference" works. As for the other 
types of ID, I think a distinction between ID in the field and ID in the lab 
(which may be morpho. or molec. or other!) is useful (that is, I believe, what 
is really at the heart of "expert").

Original comment by koomap...@gmail.com on 19 Jan 2012 at 5:06

GoogleCodeExporter commented 9 years ago
Spreadsheet updated. 

I do not think we want >1 type of ID for an identification. We could have a 
value "see publication" used in combination with _sensu_, or we could add 
multiple IDs varying only by NoID.

This does address the original issue by removing from NoID information which 
does not belong here.

Original comment by dust...@gmail.com on 19 Jan 2012 at 5:16

GoogleCodeExporter commented 9 years ago
What about:

ID of kin-->retain
ID to species group-->unknown
erroneous citation-->retain
expert-->unknown
field-->retain
geographic distribution-->retain
legacy-->unknown
molecular data-->retain
published referral-->unknown
student-->unknown
taxonomic revision-->retain
museum--->add ???

Peter Pyle just came to MVZ and re-identified some snipe in our collection. 
None of the above (except "museum") covers that type of ID, which is equivalent 
to "expert." 

A specimen could be identified in the field (phenotype, vocalization, behavior, 
etc.), in the museum (morphology, plumage, etc.), or with molecular data.

It still seems like we're mixing concepts - where an ID is made (field, 
museum), kind of data used for an ID (molecular data, geog distribution), 
publication (taxonomic revision)...
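
One way to make the mixing visible is to tag each value with the concept it actually describes and then group by that tag. The sketch below uses only the examples given in the paragraph above; the axis labels and the names `AXIS_OF_VALUE` and `values_by_axis` are illustrative assumptions, not an agreed classification.

```python
from collections import defaultdict

# Each value paired with the concept it describes, per the paragraph above.
AXIS_OF_VALUE = {
    "field": "where the ID was made",
    "museum": "where the ID was made",
    "molecular data": "kind of data used",
    "geographic distribution": "kind of data used",
    "taxonomic revision": "publication",
}


def values_by_axis(axis_of_value: dict[str, str]) -> dict[str, list[str]]:
    """Group values by the concept (axis) each one describes."""
    grouped: defaultdict[str, list[str]] = defaultdict(list)
    for value, axis in axis_of_value.items():
        grouped[axis].append(value)
    return dict(grouped)


for axis, values in values_by_axis(AXIS_OF_VALUE).items():
    print(f"{axis}: {', '.join(values)}")
```

A single controlled vocabulary that spans three axes forces a choice among answers that are not mutually exclusive, which is the problem being described.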

Maybe we need a better definition of what we mean by "Nature of ID" ???

Original comment by carla...@gmail.com on 21 Jan 2012 at 10:40

GoogleCodeExporter commented 9 years ago
http://goo.gl/PSfCO

You've described a field ID.

Are you really going to try to defend geographic distribution?

Original comment by dust...@gmail.com on 22 Jan 2012 at 3:26

GoogleCodeExporter commented 9 years ago
No, an ID in the museum comparing specimens is not the same as a field ID. In 
the case of the snipes, they were probably identified in the field as one 
thing, then Pyle came to the MVZ and compared them to others and identified 
them as something else. We need a  NoID to cover that case, which is what we 
used to think of as "expert."

Yes, we do use geographic distribution for subspecies of birds routinely. If 
they are breeding, that is often the best way - if you believe in subspecies. 

Original comment by carla...@gmail.com on 22 Jan 2012 at 4:17

GoogleCodeExporter commented 9 years ago
Did he measure something, or did he make a "determination made without access 
to specialized equipment"?

"Looks like a snipe" or even "looks a lot like THAT snipe" (which was 
identified because it looks like THAT OTHER snipe which...ooo, getting dizzy 
again!) - right? He wasn't lugging calipers or a sequencer or a 
spectroradiometer or something of the sort around? No? Then it's a field ID, a 
concept which may or may not need a better name, and certainly needs a better 
definition.

I'm not questioning the value of taxa, I'm just pointing out the insanity of 
using geographic distribution to assign things to a taxon, and then using those 
same things to define the geographic distribution of the taxon. That can't 
possibly be how it works elsewhere? Can it?

Original comment by dust...@gmail.com on 22 Jan 2012 at 4:38

GoogleCodeExporter commented 9 years ago
I think that we need to keep in mind that we're keeping track of what people do, 
not what we would have them do.  Lots of specimens are determined on the basis of 
geography.  There's a difference between saying I caught this thing in North 
America, so it's the American subspecies, and having Dan Gibson sit down with a 
tray of 100 Asian specimens, and a tray of 100 American specimens, and saying 
it's American when it came from St. Lawrence Island.  A "museum" ID is clearly 
not the same as a field ID.  And, whether or not the term is used correctly, 
there is such a thing as an expert ID... and, it's a term people use.

Original comment by gordon.jarrell on 23 Jan 2012 at 4:18

GoogleCodeExporter commented 9 years ago
Ai-ya! (And happy Lunar New Year!)

I'm not interested in keeping track of the strange things museums do, I'm 
interested in keeping track of how specimens are identified in ways that might 
be useful to users. Maybe that includes "we blindly trusted these things with 
wings to stay where we said they're supposed to be," but I'm still going to 
make fun of such a ridiculous practice. It's precision that we claim to have 
but absolutely do not. It discourages finding the (perhaps interesting) truth. 
Why would I want to go look at those things when MVZ asserts they already know 
what they have?

There may be such a thing as expert ID, but we suck at correctly defining/using 
such a thing. Obvious experts, if you know something about their technique, are 
self-evident. It matters whether the "expert" examined the teeth under a 
dissecting scope, or saw it scurry across the floor, or was over-extended to a 
new, unfamiliar taxon (http://arctos.database.museum/guid/UAM:Mamm:102236). 

I'm aiming towards a list of quantifiable things, plus exactly one "it just is" 
entry to cover the vast majority of identifications which do not come about 
from a quantifiable or repeatable process. And I suppose we're stuck with 
"unknown."

What Carla described is very much a "field ID." Maybe we've got the label 
entirely wrong - it's also the thing we used to call a "student ID." I don't 
think "looks like an X" fits in the field, but that's what we're trying to say. 
It's an ID made on something like general appearance (that'll fit in the field 
- shall we?) without quantifying anything. Doesn't matter where it's made or 
who's doing it (that's why we have determiner) or if they have specimens or 
field guides or just claim to know what they're talking about, it's all the 
same non-quantifiable "field ID" pile. Anything else is going to lead to 
arbitrary decisions that just confuse users in the end.

Spreadsheet updated again.

Original comment by dust...@gmail.com on 23 Jan 2012 at 5:03

GoogleCodeExporter commented 9 years ago
I added a new sheet "Terms" to the spreadsheet. Hopefully it will help us focus 
on defining terms without the distraction of the migration path.

Original comment by dust...@gmail.com on 23 Jan 2012 at 5:43

GoogleCodeExporter commented 9 years ago
Which spreadsheet are you referring to? Can you send a link?

I agree (and it's what I've been saying) that how something is identified 
(morphology, plumage, vocalization, molecular data, karyotype, etc.) is what's 
important. But that's probably not practical because it's often not 
documented. A "field" id is often more tentative and less complete than an ID 
based on comparing specimens in the museum, analyzing genes, using geographic 
distribution to ID subspecies of birds (yes, a common practice), etc. So I 
don't think "field" id is the same as an "expert" coming into the museum and 
re-identifying something based on his/her knowledge of the morphology or 
plumage or molt pattern or whatever.

Let's get a good definition of "Nature of ID" first. Then let's come up with 
terms that are informative and practical. Then let's figure out the migration 
path.

Original comment by carla...@gmail.com on 23 Jan 2012 at 9:13

GoogleCodeExporter commented 9 years ago
See comments 4 and 9 or http://goo.gl/PSfCO for the spreadsheet.

"Person X looked at this specimen in a museum and determined that it's Species 
Z. They didn't use any sort of analytical methodology, or didn't bother telling 
us what it was if they did, and you can't replicate their work or understand 
what sort of a taxon concept they might have been thinking of from these data."

"Person X looked at this specimen somewhere not in a museum and determined that 
it's Species Z. They didn't use any sort of analytical methodology, or didn't 
bother telling us what it was if they did, and you can't replicate their work 
or understand what sort of a taxon concept they might have been thinking of 
from these data."

You're saying that, to someone questioning the usefulness of an Identification 
or trying to sort out what Species Z meant to Person X, those statements have 
different value. I don't think they do in any circumstances. In fact, I'd 
submerge them both into "unknown," seeing exactly no added value in either one, 
but I'm cleverly avoiding mentioning that in a pathetic attempt to not seem 
overly radical.

You're also implying that data entry personnel could unambiguously choose one 
of "field" or "museum" ID. It's at best a subjective, non-mutually-exclusive 
distinction. (What if I had pictures of specimens? What if I have a better 
memory than you suspect? What if....)

Those things you mentioned - morphology, plumage, vocalization, molecular data, 
karyotype, etc. - do add value to an ID. I can decide to look at a specimen 
because I don't buy into the plumage-based taxonomy, or know even in the 
absence of a "sensu" publication that you're talking about some taxon concept 
that uses vocalization. Those things add value to a specimen data record, and 
arguably to the specimen itself. The rest is essentially flavors of "we said 
so," and I'm strongly in favor of limiting the non-useful options.

FYI, there are currently 1,582,344 identifications in Arctos. 4783 are 
NoID="molecular data," and the rest are not methods-based. About 99.7% of our 
identifications are "we say so," or in the case of geographic distribution "it 
lived (never mind our georeferencing error rate, and please ignore the few 
thousand pelagic chipmunks we've fixed over the years) where the thing we say 
it is is supposed to live, possibly because we earlier said that the thing 
lives there." Ai-ya!
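
The share of identifications that are not methods-based can be checked directly from the two counts quoted above. This is plain arithmetic on those figures; the variable names are invented for illustration.

```python
# Counts quoted in the comment above.
total_identifications = 1_582_344
molecular_data = 4_783

# Fraction of identifications that are methods-based (molecular data),
# and the complement: everything else.
molecular_share = molecular_data / total_identifications
non_methods_share = 1 - molecular_share

print(f"molecular data:  {molecular_share:.2%}")    # 0.30%
print(f"everything else: {non_methods_share:.2%}")  # 99.70%
```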

Here's your definition: NoID is the methodology used to assign an 
Identification (http://goo.gl/HPTa9) to a cataloged item.

Original comment by dust...@gmail.com on 23 Jan 2012 at 9:53

GoogleCodeExporter commented 9 years ago
Issue 512 has been merged into this issue.

Original comment by dust...@gmail.com on 1 Feb 2012 at 5:12

GoogleCodeExporter commented 9 years ago
This is a long thread that I think needs to be discussed generally with Arctos 
group. Let's get accession terminology finalized, get moved to TACC, then 
tackle this one.

Meanwhile, I still think it's fine to get type status out of NoID.

Original comment by carla...@gmail.com on 1 Feb 2012 at 11:36