xjsachs / applecore

Automatically exported from code.google.com/p/applecore

Should catalogNumber contain a triplet? #39

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
On the codes and numbers page of the wiki, 
https://code.google.com/p/applecore/wiki/CodesAndNumbers there is a section on 
SpecimenNumber that describes the use of the catalogNumber term.

The guidance and examples seem to suggest that the catalog number should 
include the Herbarium Acronym, the subcollection or catalog number series if 
such is needed to disambiguate numbers within a herbarium, and the actual 
catalog number of the specimen itself.  

The discussion of subcollections, however, provides two examples that appear to 
use only the subcollection and the catalog number, and not include the 
herbarium acronym.  

I'd suggest adding another example to the pair at the bottom of the page which 
includes a subcollection, and clarifying the discussion of subcollections. I'd 
suggest doing so by adding a herbarium acronym to the two examples, BRYO-10001 
and VASC-10001, changing them to MT-BRYO-10001 and MT-VASC-10001.

Original issue reported on code.google.com by mole@morris.net on 2 Feb 2012 at 7:45

GoogleCodeExporter commented 8 years ago
You're right, the page is not completely consistent. And I have a hard time 
summarizing my thoughts about it.

1) "In combination with the collectionCode it [the catalogNumber] ideally 
creates a unique identifier for each specimen record in the dataset." I think 
this sentence should be changed, as we want the catalogNumber to be a unique 
identifier on its own, within the dataset.

2) Adding or not adding the herbarium acronym will not disambiguate numbers 
within that herbarium. MT-10001 is as unique as 10001. Of course, once you 
start aggregating datasets or when there are multiple herbaria combined in one 
dataset, the catalogNumber will no longer be unique on its own.

3) If there are two subcollections with their own numbering system, we advise 
adding the subcollection to the catalogNumber. That way we can differentiate 
between 123 and 123 as BRYO-123 and VASC-123. Of course, this is only necessary 
if those subcollections use their own numbering system.
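
To make point 3 concrete, here is a minimal Python sketch of the prefixing rule. The function names and the sample records are made up for illustration; they are not taken from the wiki page or from any collection database.

```python
from collections import Counter


def catalog_number(number, subcollection=None):
    """Prefix the number with the subcollection code only when one is given."""
    return f"{subcollection}-{number}" if subcollection else str(number)


def duplicates(values):
    """Return, sorted, the values that occur more than once."""
    return sorted(v for v, n in Counter(values).items() if n > 1)


# Hypothetical records: two subcollections, each with its own number series.
records = [("BRYO", 123), ("VASC", 123), ("VASC", 124)]

bare = [catalog_number(num) for _, num in records]
prefixed = [catalog_number(num, sub) for sub, num in records]

print(duplicates(bare))      # ['123']  (123 and 123 collide)
print(duplicates(prefixed))  # []       (BRYO-123 and VASC-123 do not)
```

If the subcollections shared a single number series, the bare numbers would already be unique and the prefix would add nothing.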

Using MT-10001 & MT-VASC-10001:
+ catalogNumber is unique within the dataset
+ catalogNumber has a reasonable chance to be unique outside dataset
+ catalogNumber is closer to the way the specimen should be cited
- catalogNumber might be different from number used on specimen
- specimen database probably used 10001: if so, number needs to be concatenated 
with the collectionCode, which is a burden for some collection databases

Using 10001 & VASC-10001:
+ catalogNumber is unique within the dataset
+ specimen database probably used 10001, so catalogNumber can be used as such 
(no concatenation)
- catalogNumber might be different from number used on specimen
- catalogNumber is not unique outside dataset
- catalogNumber is not how the specimen should be cited
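
The "unique outside dataset" trade-off between the two options can be sketched roughly as follows. This is only an illustration: "XX" is a made-up acronym for a second herbarium, and the record layout and function names are assumptions, not an actual export format.

```python
def short_form(record):
    """Option '10001 & VASC-10001': subcollection prefix only, when present."""
    sub, num = record["subcollection"], record["number"]
    return f"{sub}-{num}" if sub else str(num)


def long_form(record):
    """Option 'MT-10001 & MT-VASC-10001': herbarium acronym comes first."""
    return f"{record['herbarium']}-{short_form(record)}"


def is_unique(values):
    """True when no value appears more than once."""
    return len(set(values)) == len(values)


# Hypothetical aggregate of two datasets; "XX" is an invented second herbarium.
aggregated = [
    {"herbarium": "MT", "subcollection": None, "number": 10001},
    {"herbarium": "MT", "subcollection": "VASC", "number": 10001},
    {"herbarium": "XX", "subcollection": None, "number": 10001},
]
mt_only = [r for r in aggregated if r["herbarium"] == "MT"]

print(is_unique([short_form(r) for r in mt_only]))     # True: fine within one dataset
print(is_unique([short_form(r) for r in aggregated]))  # False: 10001 appears twice
print(is_unique([long_form(r) for r in aggregated]))   # True
```

The long form only helps once datasets are combined; within the MT dataset alone, both forms are equally unique.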

So what is the best option?

Original comment by peter.de...@gmail.com on 2 Feb 2012 at 9:54

GoogleCodeExporter commented 8 years ago
The TDWG Darwin Core definition might provide some guidance: "An identifier (preferably 
unique) for the record within the data set or collection."  This suggests we 
should use strategy 3: VASC-10001 rather than MT-VASC-10001, and include only 
an identifier for the subcollection/catalog number series in Catalog Number.  

There is perhaps a relationship in the guidance here to the use of 
occurrenceID.  If we advocate the use of occurrenceID to uniquely identify 
occurrences, proxied by unique identifiers for specimens that voucher that 
occurrence, but will likely need to be grouped as duplicates later, then 
we have less need to use catalogNumber alone as a unique identifier for a specimen.

Original comment by mole@morris.net on 3 Feb 2012 at 8:05

GoogleCodeExporter commented 8 years ago
VASC-10001 it is. I'll update the guidelines when I find a moment.

Original comment by peter.de...@gmail.com on 10 Feb 2012 at 2:58

GoogleCodeExporter commented 8 years ago
http://code.google.com/p/applecore/wiki/CodesAndNumbers has been updated with 
a new real example and info regarding institutionID, datasetID and occurrenceID.

Original comment by peter.de...@gmail.com on 22 Feb 2012 at 2:46