Closed GoogleCodeExporter closed 9 years ago
You're right, the page is not completely consistent. And I have a hard time
summarizing my thoughts about it.
1) "In combination with the collectionCode it [the catalogNumber] ideally
creates a unique identifier for each specimen record in the dataset." I think
this sentence should be changed, as we want the catalogNumber to be a unique
identifier on its own, within the dataset.
2) Adding or not adding the herbarium acronym will not disambiguate numbers
within that herbarium. MT-10001 is as unique as 10001. Of course, once you
start aggregating datasets or when there are multiple herbaria combined in one
dataset, the catalogNumber will no longer be unique on its own.
3) If there are two subcollections with their own numbering system, we advise
adding the subcollection to the catalogNumber. That way we can differentiate
between 123 and 123 as BRYO-123 and VASC-123. Of course, this is only necessary
if those subcollections use their own numbering system.
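To make point 3 concrete, here is a minimal Python sketch, not a prescribed implementation. The helper names (`build_catalog_number`, `duplicates`) and the sample specimens are hypothetical; only the BRYO/VASC prefixes and the numbers come from the examples above.

```python
from collections import Counter


def build_catalog_number(number, subcollection=None):
    """Prefix the number with its subcollection only when one is given."""
    return f"{subcollection}-{number}" if subcollection else str(number)


def duplicates(catalog_numbers):
    """Return the catalog numbers that occur more than once, sorted."""
    counts = Counter(catalog_numbers)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical specimens: (subcollection, number used on the specimen)
specimens = [("BRYO", 123), ("VASC", 123), ("VASC", 10001)]

bare = [build_catalog_number(n) for _, n in specimens]
prefixed = [build_catalog_number(n, sub) for sub, n in specimens]

print(duplicates(bare))      # ['123']
print(duplicates(prefixed))  # []
```

Without the prefix the two 123 records collide; with it, every catalogNumber in this small dataset is unique.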
Using MT-10001 & MT-VASC-10001:
+ catalogNumber is unique within the dataset
+ catalogNumber has a reasonable chance to be unique outside the dataset
+ catalogNumber is closer to the way the specimen should be cited
- catalogNumber might be different from number used on specimen
- specimen database probably used 10001: if so, number needs to be concatenated
with the collectionCode, which is a burden for some collection databases
Using 10001 & VASC-10001:
+ catalogNumber is unique within the dataset
+ specimen database probably used 10001, so catalogNumber can be used as such
(no concatenation)
- catalogNumber might be different from number used on specimen
- catalogNumber is not unique outside the dataset
- catalogNumber is not how the specimen should be cited
So what is the best option?
Original comment by peter.de...@gmail.com
on 2 Feb 2012 at 9:54
The TDWG DarwinCore might provide some guidance: "An identifier (preferably
unique) for the record within the data set or collection." This suggests we
should use strategy 3: VASC-10001 rather than MT-VASC-10001, and include only
an identifier for the subcollection/catalog number series in Catalog Number.
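As a rough sketch of what this strategy could look like in practice: the record keeps VASC-10001 as catalogNumber, and the longer MT-VASC-10001 form can still be derived when it is needed, for example after aggregation. The function name `qualified_number` is hypothetical, and treating MT as the collectionCode is an assumption based on the discussion above.

```python
def qualified_number(record):
    """Join collectionCode and catalogNumber into one longer identifier."""
    return f"{record['collectionCode']}-{record['catalogNumber']}"


# Assumption: the herbarium acronym MT is stored in collectionCode.
record = {"collectionCode": "MT", "catalogNumber": "VASC-10001"}

print(qualified_number(record))  # MT-VASC-10001
```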
There may be a relationship between this guidance and the use of occurrenceID.
If we advocate using occurrenceID to uniquely identify occurrences (proxied by
unique identifiers for the specimens that voucher each occurrence, which will
likely need to be grouped as duplicates later), then we have less need for
catalogNumber alone to serve as a unique identifier for a specimen.
Original comment by mole@morris.net
on 3 Feb 2012 at 8:05
VASC-10001 it is. I'll update the guidelines when I find a moment.
Original comment by peter.de...@gmail.com
on 10 Feb 2012 at 2:58
http://code.google.com/p/applecore/wiki/CodesAndNumbers has been updated with
a new real example and info regarding institutionID, datasetID and occurrenceID.
Original comment by peter.de...@gmail.com
on 22 Feb 2012 at 2:46
Original issue reported on code.google.com by
mole@morris.net
on 2 Feb 2012 at 7:45