xjsachs / applecore

Automatically exported from code.google.com/p/applecore

Duplicates at - URGENT #44

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Please describe your question as clearly as possible. Include links if
possible.

I now know of two herbaria (MT, TRTE) that record which other herbaria hold 
duplicates/replicates of a specimen. There are probably other herbaria doing 
the same. The field lists the herbarium acronyms concatenated and separated 
by semicolons:

duplicatesAt: NFLD; UWO; DAO

Questions:

1. Is this useful information to share in Darwin Core? For regular users, 
FilteredPush users?

2. How can we share this in Darwin Core? No term seems appropriate.

The closest one might be 
http://rs.tdwg.org/dwc/terms/index.htm#associatedOccurrences, but that should 
list the IDs for the associated specimens, not the collections where they are 
deposited. Is there a way we can interpret this definition in a broader sense?

The other option is of course 
http://rs.tdwg.org/dwc/terms/index.htm#dynamicProperties. Since the key value 
pairs in this term are generally separated by ";", just dumping duplicatesAt 
(containing ";") will probably cause problems. Are any of these useable?

A. duplicatesAt=NFLD,UWO,DAO; otherDynamicProperty=...

B. duplicateAt=NFLD; duplicateAt=UWO; duplicateAt=DAO; otherDynamicProperty=...
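
To make the separator problem concrete, here is a minimal Python sketch. It assumes a consumer that splits dynamicProperties on ";" and then on "=", which is only one plausible way of reading the field. The function name `parse_dynamic_properties` and the value "x" for otherDynamicProperty are hypothetical placeholders, not part of any Darwin Core tooling.

```python
def parse_dynamic_properties(value):
    """Split a 'key=value; key=value' string into (key, value) pairs.

    Assumption: pairs are separated by ';' and keys from values by '='.
    Segments without '=' get the key None so they are easy to spot.
    """
    pairs = []
    for segment in value.split(";"):
        segment = segment.strip()
        if not segment:
            continue
        if "=" in segment:
            key, _, val = segment.partition("=")
            pairs.append((key.strip(), val.strip()))
        else:
            # An orphaned value: the ';' inside the list was read as a pair separator.
            pairs.append((None, segment))
    return pairs


# The field dumped as-is, with ';' inside the list of acronyms.
raw = "duplicatesAt=NFLD; UWO; DAO; otherDynamicProperty=x"
# Option A: commas inside the value, ';' only between pairs.
option_a = "duplicatesAt=NFLD,UWO,DAO; otherDynamicProperty=x"

raw_pairs = parse_dynamic_properties(raw)
a_pairs = parse_dynamic_properties(option_a)

print(raw_pairs)
print(a_pairs)
```

Under this assumption, the raw dump leaves UWO and DAO as segments with no key, while option A keeps the three acronyms together under duplicatesAt.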

Any advice would be welcome. Based on the feedback here, TRTE will either 
publish or not publish this information before Thursday, March 29th, after 
which it will be difficult to change it.

Original issue reported on code.google.com by peter.de...@gmail.com on 26 Mar 2012 at 1:16

GoogleCodeExporter commented 8 years ago
Hi, sorry for a year of silence.

BRIT is using associatedOccurrences for this.  If we were strictly following the 
rules, I think we would represent it this way:
Duplicate:sheet:BRIT:BRIT24235; Duplicate:carpological:Duplicate:sheet:NY; 
Duplicate:sheet:MO.

I don't think dynamicProperties is the place to put this.  I also don't think 
many herbaria bother to record this.  When they do, a series of intended places 
for duplicate distribution often goes on the label, but that's no guarantee that 
all those herbaria actually received a duplicate.  It might still be sitting in 
a box for future distribution!

I think the more likely future need for this will be a later curatorial 
record of duplicates at other institutions that were eventually located by FP.
A

Original comment by amanda.n...@gmail.com on 28 Mar 2012 at 7:25

GoogleCodeExporter commented 8 years ago
This raises an issue as to whether TDWG DarwinCore is complete enough for 
botanical purposes.  associatedOccurrences really highlights the issue.  The 
members of a duplicate set come from only one occurrence and should share the 
same occurrenceID: they aren't associated occurrences, they are the same 
occurrence (and much of the time, the same biological individual).  However, the 
grouping of the duplicate sets is, to some varying extent, a matter of 
inference.  One potential use of a duplicatesAt property would be to present 
data from a herbarium sheet about where it asserts duplicates should be found.  
Another potential use of a duplicatesAt property would be to assert a full list 
of places where duplicates are believed to be found, by all means of inference.  
We do capture some assertions about where duplicates are believed to be found 
at Harvard, but we currently aren't doing it as structured data.  

Our normal guidance in other cases of concatenated lists of multiple values in 
a single property has been to use a pipe character as a separator.  

A duplicatesAt property containing a pipe-separated list of herbarium acronyms 
representing herbaria where duplicates are believed to have been sent seems a 
reasonable option that could be used in flat Darwin Core.  
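
As a rough illustration of the pipe-separated option, the Python sketch below writes and reads back one flat row. Note that duplicatesAt is the term proposed in this thread, not an existing Darwin Core term, and the helper names and the catalogNumber value "EXAMPLE-1" are hypothetical.

```python
import csv
import io


def format_duplicates_at(acronyms):
    """Join herbarium acronyms into a single pipe-separated value."""
    return "|".join(a.strip() for a in acronyms)


def parse_duplicates_at(value):
    """Split a pipe-separated value back into a list of acronyms."""
    return [part.strip() for part in value.split("|") if part.strip()]


# Write one flat record with a single duplicatesAt column.
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=["catalogNumber", "duplicatesAt"])
writer.writeheader()
writer.writerow({
    "catalogNumber": "EXAMPLE-1",  # hypothetical identifier
    "duplicatesAt": format_duplicates_at(["NFLD", "UWO", "DAO"]),
})

# Read it back and recover the list of herbaria.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
herbaria = parse_duplicates_at(rows[0]["duplicatesAt"])
print(rows[0]["duplicatesAt"], herbaria)
```

Because the whole list sits in one cell, the record stays flat, and the pipe does not collide with the ";" used between key-value pairs elsewhere.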

In FilteredPush, we will likely be making assertions about duplicate 
relationships between particular specimen records, so transport of known places 
where duplicates are expected to occur could be of value in helping to make 
those assertions.

I think we need to propose a collectionObjectID/specimenID/voucherID term to 
hold a GUID for the database record for a particular specimen as an addition to 
TDWG DarwinCore.

I don't think we should use associatedOccurrences in this case, as we are 
talking about multiple vouchers of the same occurrence, not associated other 
occurrences.  

Either approach A or approach B looks like it should work, with A supporting 
flat Darwin Core and B not, thus perhaps favoring A.
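
One way to read the difference, sketched in Python under the assumption that a flat record maps each key to exactly one value: approach A survives that mapping, while the repeated key in approach B collapses unless the consumer groups values into lists. The variable names are illustrative only.

```python
from collections import defaultdict

# Key-value pairs as they would come out of approaches A and B.
option_a_pairs = [("duplicatesAt", "NFLD,UWO,DAO")]
option_b_pairs = [
    ("duplicateAt", "NFLD"),
    ("duplicateAt", "UWO"),
    ("duplicateAt", "DAO"),
]

# Flat mapping: one value per key, so a repeated key overwrites earlier values.
flat_a = dict(option_a_pairs)
flat_b = dict(option_b_pairs)

# Grouped mapping: keeps every value, but is no longer one value per column.
grouped_b = defaultdict(list)
for key, value in option_b_pairs:
    grouped_b[key].append(value)

print(flat_a)
print(flat_b)
print(dict(grouped_b))
```

In this sketch, only the last duplicateAt value is left in the flat mapping for B, whereas A keeps all three acronyms in a single value that can be split on the comma afterwards.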

Original comment by mole@morris.net on 29 Mar 2012 at 1:34