Closed afuchs1 closed 8 years ago
couldn't find previous documentation - maybe I missed it??? @nielsklazenga - is there something somewhere?
how about something like this using JSON format? (only included three of the fields in the example, but any of the identification group could be added?) ... (note: JSON format used in DWC for dynamicProperties)
[ { "identificationID": "47FDEB45-1949-4501-8CA3-F74CE478F7D5", "verbatimIdentification": "Acacia aneura", "identifiedBy": "John Smith" }, { "identificationID": "0890D9A4-1F04-4869-B4A7-AB1A4448C1E1", "verbatimIdentification": "Acacia sp.", "identifiedBy": "Dave Simpson" } ]
previousIdentifications is a plain text string, not a JSON array (I fixed the JSON above, by the way). Using JSON defeats the purpose of having previousIdentifications, as you'll just be replicating the Identification History row source. We have been using previousIdentifications in AVH for more than three years, so it shouldn't be that hard to get an example.
Not so easy to find nice complete examples in our database actually. Note the examples above were not delivered as previousIdentifications, but as Identification History. I just concatenate them all together, as ALA can't handle Identification History.
so do we have a set order for fields? Which fields are included (all from identification history)? What happens with missing values? These appear to be silently dropped? Must make parsing a pain.... :-(
my understanding of this field is that it does duplicate the previousIdentifications when that has to be provided as a single field.... personally I would like to see something more strongly delimited than a CSV-type format - I assume that if you have commas in any field then it would need to be enclosed in quotes...
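To make the quoting concern concrete, here is a minimal sketch of what happens to embedded commas with and without CSV-style quoting. The record values and the `csvLine()` helper are invented for illustration and are not part of HISPID or AVH; only `str_getcsv()` is a standard PHP function.

```php
<?php
// Hypothetical record: name, determiner, date, notes (values invented for illustration).
$fields = array('Acacia aneura', 'Smith, J.', '1998-03-05', 'leaf material only, sterile');

// Naive join: commas inside values cannot be told apart from separators.
$naive = implode(',', $fields);
$naiveCount = count(explode(',', $naive)); // 6 pieces come back instead of 4 fields

// Hypothetical helper: wrap every field in double quotes and double any embedded quotes.
function csvLine(array $fields) {
    $quoted = array();
    foreach ($fields as $field) {
        $quoted[] = '"' . str_replace('"', '""', $field) . '"';
    }
    return implode(',', $quoted);
}

// Quoted fields survive a round trip through a CSV parser.
$line = csvLine($fields);
$parsed = str_getcsv($line, ',', '"', '\\');
var_dump($parsed === $fields); // bool(true)
```

The quoted form is machine-readable, but every consumer then has to run a CSV parser over a field that is otherwise shown to people as plain text.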
This is what I do with AVH data:
private function previousIdentifications($unit) {
$dets = array();
$date = array();
$list = $unit->getElementsByTagName('Identification');
if ($list->length > 1) { // This skips all Units that have a single Identification
// (which is assumed to be the current identification).
foreach ($list as $item) {
$preferredflag = $item->getElementsByTagName('PreferredFlag');
if ($preferredflag->length>0 && in_array($preferredflag->item(0)->nodeValue, array('0', 'FALSE', 'false'))) {
// There is a preferred flag and it resolves to FALSE.
$det = array();
$nlist = $item->getElementsByTagName('FullScientificNameString');
$det['FullScientificNameString'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$nlist = $item->getElementsByTagName('IdentificationQualifier');
if ($nlist->length > 0) {
$det['IdentificationQualifier'] = $nlist->item(0)->nodeValue;
$det['IdentificationQualifierInsertionPoint'] = $nlist->item(0)->getAttribute('insertionpoint');
}
else {
$det['IdentificationQualifier'] = FALSE;
$det['IdentificationQualifierInsertionPoint'] = FALSE;
}
$nlist = $item->getElementsByTagName('NameAddendum');
$det['NameAddendum'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$nlist = $item->getElementsByTagName('HybridFlag');
if ($nlist->length > 0) {
$det['HybridFlag'] = $nlist->item(0)->nodeValue;
$det['HybridFlagInsertionPoint'] = $nlist->item(0)->getAttribute('insertionpoint');
}
else {
$det['HybridFlag'] = FALSE;
$det['HybridFlagInsertionPoint'] = FALSE;
}
$nlist = $item->getElementsByTagName('IdentifierRole');
$det['IdentifierRole'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$nlist = $item->getElementsByTagName('IdentifiersText');
$det['IdentifiersText'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$nlist = $item->getElementsByTagName('ISODateTimeBegin');
$det['IdentificationDate'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$date[] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : 'ZZZZ';
$nlist = $item->getElementsByTagName('Notes');
$det['IdentificationNotes'] = ($nlist->length > 0) ? $nlist->item(0)->nodeValue : FALSE;
$dets[] = $det;
}
}
// previous identifications are sorted by identification date
array_multisort($date, SORT_ASC, $dets);
$previousDets = array();
foreach ($dets as $index => $det) {
$prev = '';
// Scientific name
$sciname = $det['FullScientificNameString'];
if ($det['HybridFlag'] && $det['HybridFlagInsertionPoint']) {
$scinameBits = explode(' ', $sciname);
$scinameBits[$det['HybridFlagInsertionPoint']-1] = Encoding::toUTF8('×') . $scinameBits[$det['HybridFlagInsertionPoint']-1];
$sciname = implode(' ', $scinameBits);
}
if ($det['IdentificationQualifier'] && $det['IdentificationQualifierInsertionPoint']) {
$scinameBits = explode(' ', $sciname);
if ($det['IdentificationQualifierInsertionPoint'] > count($scinameBits))
$det['IdentificationQualifierInsertionPoint'] = count($scinameBits);
$spacer = ($det['IdentificationQualifier'] == '?') ? '' : ' ';
$scinameBits[$det['IdentificationQualifierInsertionPoint']-1] = $det['IdentificationQualifier'] . $spacer . $scinameBits[$det['IdentificationQualifierInsertionPoint']-1];
$sciname = implode(' ', $scinameBits);
}
if ($det['NameAddendum']) $sciname .= ' ' . $det['NameAddendum'];
$prev .= $sciname;
// Determiner
if ($det['IdentifiersText']) {
$prev .= ', ';
$prev .= ($det['IdentifierRole'] == 'conf.') ? 'conf. ' : 'det. ';
$identifiers = explode(';', $det['IdentifiersText']);
$identifier = explode(',', $identifiers[0]);
$prev .= (count($identifier) > 1) ? trim($identifier[1]) . ' ' . trim($identifier[0]) : trim($identifier[0]);
if (count($identifiers) == 2) {
$identifier = explode(',', $identifiers[1]);
$prev .= ' & ';
$prev .= (count($identifier) > 1) ? trim($identifier[1]) . ' ' . trim($identifier[0]) : trim($identifier[0]);
}
elseif (count($identifiers) > 2)
$prev .= ' et al.';
}
// Determination date
if ($det['IdentificationDate']) {
// ISO date parts: year[-month[-day]]
$dateBits = explode('-', $det['IdentificationDate']);
// Months are written as lower-case Roman numerals
$romanMonths = array(
'01' => 'i', '02' => 'ii', '03' => 'iii', '04' => 'iv',
'05' => 'v', '06' => 'vi', '07' => 'vii', '08' => 'viii',
'09' => 'ix', '10' => 'x', '11' => 'xi', '12' => 'xii',
);
$year = $dateBits[0];
$month = (isset($dateBits[1]) && isset($romanMonths[$dateBits[1]])) ? $romanMonths[$dateBits[1]] : FALSE;
$day = (isset($dateBits[2])) ? $dateBits[2] : FALSE;
// $detDate rather than $date, so the $date sort array above is not overwritten
if ($day)
$detDate = "$day.$month.$year";
elseif ($month)
$detDate = "$month.$year";
else
$detDate = $year;
$prev .= ', ' . $detDate;
}
if ($det['IdentificationNotes']) {
$prev .= ' (' . $det['IdentificationNotes'] . ')';
}
$previousDets[] = $prev;
}
$previousDets = implode('; ', $previousDets);
// Terminate with a full stop, unless the string is empty or already ends in one
if ($previousDets !== '' && substr($previousDets, -1) != '.')
$previousDets .= '.';
$ret = array (
'column' => 'previousIdentifications',
'value' => $previousDets,
);
return $ret;
}
}
This is, of course, only used when multiple Identifications are provided and previousIdentifications itself is not provided (or empty). When I asked HISCOM at the time, the only feedback I got was from Alison, and we went for readability.
previousIdentifications may have similar semantics to Identification History, but it does not duplicate its syntax or implementation. The content of this element is not meant to be parsed or to be easily parseable. I think it is not so much that it has to be provided in a single field as that it can only be provided that way; that has to do with the capabilities of the provider, not those of the consumer, and so you cannot make any requirements as to syntax.
ok - then the definition needs to clearly state this is for human consumption, but I think we should still provide a recommended order of fields and records (following your code above).
We could always use an "identificationHistory" field for a structured concatenation when we want the data to be provided for easy machine parsing. Do we add provision for this now?
Yes, we've got the Identification class.
You are confounding definition with implementation. previousIdentifications is used when the information is delivered as an (unparseable) string. If you deliver the information as individual Identifications with properties, you deliver Identification History. It doesn't matter to the standard whether this is in a separate file or all concatenated into a single field.
Since we pretty much all deliver normalised Identifications to AVH (if we deliver more than the current Identification in the first place), we can agree on a format for previousIdentifications for display in AVH (which we did years ago), but that is AVH, not HISPID.
true!
Example required for output of multiple identifications.
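As a starting point for such an example, here is a simplified sketch of the AVH-style concatenation discussed in this thread: name, then det./conf. and determiner, then the date with a Roman-numeral month, then notes in parentheses, ordered by date. The function name `formatPreviousIdentifications()`, the array keys, the dates and the note are all invented for illustration. It leaves out qualifiers, hybrid flags and multiple determiners, so it is not the AVH code itself.

```php
<?php
// Hypothetical, simplified formatter; each $det uses made-up keys:
// 'name', 'identifier' ("Surname, Given"), 'role', 'date' (ISO), 'notes'.
function formatPreviousIdentifications(array $dets) {
    $roman = array(1 => 'i', 'ii', 'iii', 'iv', 'v', 'vi', 'vii', 'viii', 'ix', 'x', 'xi', 'xii');

    // Oldest first; records without a date sort last.
    usort($dets, function ($a, $b) {
        $da = !empty($a['date']) ? $a['date'] : 'ZZZZ';
        $db = !empty($b['date']) ? $b['date'] : 'ZZZZ';
        return strcmp($da, $db);
    });

    $out = array();
    foreach ($dets as $det) {
        $prev = $det['name'];
        if (!empty($det['identifier'])) {
            // "Surname, Given" becomes "Given Surname"
            $parts = array_map('trim', explode(',', $det['identifier']));
            $who = (count($parts) > 1) ? $parts[1] . ' ' . $parts[0] : $parts[0];
            $role = (isset($det['role']) && $det['role'] == 'conf.') ? 'conf. ' : 'det. ';
            $prev .= ', ' . $role . $who;
        }
        if (!empty($det['date'])) {
            $bits = explode('-', $det['date']);
            $date = $bits[0];
            if (isset($bits[1]) && isset($roman[(int) $bits[1]])) {
                $date = $roman[(int) $bits[1]] . '.' . $date;
                if (isset($bits[2])) {
                    $date = $bits[2] . '.' . $date;
                }
            }
            $prev .= ', ' . $date;
        }
        if (!empty($det['notes'])) {
            $prev .= ' (' . $det['notes'] . ')';
        }
        $out[] = $prev;
    }
    $text = implode('; ', $out);
    if ($text !== '' && substr($text, -1) != '.') {
        $text .= '.';
    }
    return $text;
}

// Names reuse the JSON example earlier in the thread; dates and notes are invented.
$result = formatPreviousIdentifications(array(
    array('name' => 'Acacia aneura', 'identifier' => 'Smith, John', 'date' => '2001-11'),
    array('name' => 'Acacia sp.', 'identifier' => 'Simpson, Dave', 'date' => '1998-03-05', 'notes' => 'sterile'),
));
echo $result, "\n";
// Acacia sp., det. Dave Simpson, 05.iii.1998 (sterile); Acacia aneura, det. John Smith, xi.2001.
```

Missing values simply drop out of the string, which keeps it readable but is also why the result should be treated as display text rather than something to parse.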