Open muttcg opened 3 years ago
The preferred way to share these values will be through the MeasurementOrFact extension. This should be implemented first, and then the VertNet way (extracting from dynamicProperties) routed to it. (There are around 9 million occurrences with a MeasurementOrFact extension.)
We will need a vocabulary and parser for mof:measurementType (to begin with, with at least the values required for VertNet), mof:measurementUnit (grams and metres? kilograms and metres? g and mm? Whichever, we'll need to handle a wide range of decimal values), and mof:measurementValue (interpreted according to the unit).
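As a rough illustration of what "vocabulary plus unit-aware value" could mean, here is a minimal Python sketch. The vocabulary entries, the concept names (`TOTAL_LENGTH` etc.), the choice of millimetres and grams as canonical units, and the `interpret` function are all assumptions made for the example, not an agreed design.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical vocabulary: verbatim measurementType -> controlled concept.
TYPE_VOCABULARY = {
    "total length": "TOTAL_LENGTH",
    "snout-vent length": "SNOUT_VENT_LENGTH",
    "total weight": "TOTAL_WEIGHT",
    "weight": "TOTAL_WEIGHT",
}

# Assumed canonical units: millimetres for length, grams for mass.
UNIT_FACTORS = {
    "mm": ("mm", Decimal("1")),
    "cm": ("mm", Decimal("10")),
    "m": ("mm", Decimal("1000")),
    "metres": ("mm", Decimal("1000")),
    "g": ("g", Decimal("1")),
    "grams": ("g", Decimal("1")),
    "kg": ("g", Decimal("1000")),
}


def interpret(measurement_type, value, unit):
    """Return (concept, canonical value, canonical unit).

    Any part that cannot be interpreted is returned as None, so the caller
    can still fall back to the verbatim fields.
    """
    concept = TYPE_VOCABULARY.get(measurement_type.strip().lower())
    canonical = UNIT_FACTORS.get(unit.strip().lower())
    if canonical is None:
        return concept, None, None
    canonical_unit, factor = canonical
    try:
        # Decimal avoids float rounding across a wide range of magnitudes.
        return concept, Decimal(value) * factor, canonical_unit
    except InvalidOperation:
        return concept, None, None
```

With these assumed tables, `interpret("Total Length", "1.4", "metres")` gives the `TOTAL_LENGTH` concept with a value of 1400 mm.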
If the MeasurementOrFact extension is not present, we can look into dwc:dynamicProperties for data.
This is a larger task. We'll need an additional extension in DWCA downloads, and a way to specify query parameters/predicates using the extension (e.g. "parameter":"MEASUREMENT_OR_FACT:MEASUREMENT_TYPE" or ...search?MeasurementOrFact:MeasurementType=LENGTH, TBD).
We need to decide what to do with values we can't parse (e.g. we can convert "5 inches" to cm/mm even if we don't know what it's measuring, and unknown measurement types probably still deserve to be shown on an occurrence page).
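One possible policy, sketched in Python: always keep the verbatim value, convert it when the unit is recognised, and leave the measurement type as unknown rather than dropping the record. The regular expression, the unit table and the `parse_length` helper are hypothetical and only illustrate the idea.

```python
import re
from decimal import Decimal

# Assumed conversion factors to millimetres (1 inch = 25.4 mm exactly).
MM_PER_UNIT = {
    "mm": Decimal("1"),
    "cm": Decimal("10"),
    "m": Decimal("1000"),
    "in": Decimal("25.4"),
    "inch": Decimal("25.4"),
    "inches": Decimal("25.4"),
}

# A number followed by a unit word, e.g. "5 inches" or "12.5cm".
VALUE_WITH_UNIT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*$")


def parse_length(verbatim, measurement_type=None):
    """Keep the verbatim text; add a value in mm only when it can be parsed."""
    record = {
        "measurementType": measurement_type,  # None means "unknown type"
        "verbatim": verbatim,
        "valueMm": None,
    }
    match = VALUE_WITH_UNIT.match(verbatim)
    if match and match.group(2).lower() in MM_PER_UNIT:
        factor = MM_PER_UNIT[match.group(2).lower()]
        record["valueMm"] = Decimal(match.group(1)) * factor
    return record
```

Under this sketch, "5 inches" is converted to 127 mm even though its type is unknown, while something like "about a hand" keeps only its verbatim text and can still be shown on the occurrence page.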
PR #477 already adds some support, including retrieving values from dynamicProperties and routing them to a MeasurementOrFact extension, but we need to decide on the API and general technical approach to interpreting additional extensions before implementing any more than this:
- length and hasLength (parse DwcTerm.dynamicProperties → convert values to MeasurementAndFacts extension → add MeasurementAndFacts (3 fields: DwcTerm.measurementType, DwcTerm.measurementValue, DwcTerm.measurementUnit) array into index/Avro):
A length query (non-API) can then be done with MeasurementAndFacts.measurementType IN ("total length", "head-body length", "fork length", "standard length", "snout-vent length") AND MeasurementAndFacts.measurementValue == QUERY VALUE
A hasLength query is the same, but without the value.
- mass and hasWeight (parse DwcTerm.dynamicProperties → convert values to MeasurementAndFacts extension → add MeasurementAndFacts (3 fields: DwcTerm.measurementType, DwcTerm.measurementValue, DwcTerm.measurementUnit) array into index/Avro):
The query is similar, but uses the type "total weight".
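The two bullets above can be sketched in Python as follows. This is not the code in PR #477: the JSON shape assumed for dynamicProperties (`{"total length": "120 mm", ...}`) and the helper names are invented for illustration, and only the list of length types comes from the query described above.

```python
import json

# Length types taken from the query described above.
LENGTH_TYPES = {
    "total length",
    "head-body length",
    "fork length",
    "standard length",
    "snout-vent length",
}


def to_measurements(dynamic_properties):
    """Convert an (assumed) JSON dynamicProperties string into
    measurementType / measurementValue / measurementUnit records."""
    try:
        properties = json.loads(dynamic_properties)
    except (TypeError, ValueError):
        return []  # not JSON: nothing to route
    if not isinstance(properties, dict):
        return []
    records = []
    for key, raw in properties.items():
        # Split "120 mm" into value "120" and unit "mm" (unit may be empty).
        value, _, unit = str(raw).strip().partition(" ")
        records.append({
            "measurementType": key.strip().lower(),
            "measurementValue": value,
            "measurementUnit": unit.strip(),
        })
    return records


def has_length(records):
    """hasLength: any record with one of the length types."""
    return any(r["measurementType"] in LENGTH_TYPES for r in records)


def length_equals(records, query_value):
    """length: same as has_length, but the value must also match."""
    return any(
        r["measurementType"] in LENGTH_TYPES
        and r["measurementValue"] == query_value
        for r in records
    )
```

The mass/hasWeight variant would be the same functions with the type set replaced by "total weight".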
Querying using these parameters would get quite complicated:
| gbifid | measurementType | measurementValue | measurementUnit |
|---|---|---|---|
| 1 | TotalLength | 1.4 | metres |
| 1 | LegLength | 0.4 | metres |
| 2 | TotalLength | 1.0 | metres |
| 2 | LegLength | 0.4 | metres |
It's not obvious how to query for things with TotalLength>1.2m, LegLength<0.6m etc. MeasurementOrFact:MeasurementType=LENGTH doesn't work, MeasurementOrFact:TotalLength=1.2 might.
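To make the difficulty concrete: each predicate has to be evaluated per measurement type, and then all predicates have to hold for the same occurrence. A small Python sketch over the rows in the table, assuming all values are already in metres; the `matching_ids` helper and the shape of the predicate map are hypothetical, not a proposed API.

```python
from collections import defaultdict

# The rows from the table above: (gbifid, type, value, unit).
ROWS = [
    (1, "TotalLength", 1.4, "metres"),
    (1, "LegLength", 0.4, "metres"),
    (2, "TotalLength", 1.0, "metres"),
    (2, "LegLength", 0.4, "metres"),
]


def matching_ids(rows, predicates):
    """Return gbifids for which every typed predicate holds.

    `predicates` maps a measurementType to a test on its value. Units are
    assumed to be normalised already (all metres here).
    """
    by_occurrence = defaultdict(dict)
    for gbif_id, measurement_type, value, _unit in rows:
        by_occurrence[gbif_id][measurement_type] = value
    return sorted(
        gbif_id
        for gbif_id, values in by_occurrence.items()
        if all(
            measurement_type in values and test(values[measurement_type])
            for measurement_type, test in predicates.items()
        )
    )


result = matching_ids(
    ROWS,
    {"TotalLength": lambda v: v > 1.2, "LegLength": lambda v: v < 0.6},
)
```

Here `result` contains only gbifid 1: occurrence 2 satisfies the LegLength test but its TotalLength of 1.0 fails the first predicate. A single MeasurementType=LENGTH parameter cannot express this, because the comparison is tied to a specific type.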
> If the MeasurementOrFact extension is not present, we can look into dwc:dynamicProperties for data.
I think you need to look into dwc:dynamicProperties (and occurrenceRemarks, and fieldNotes) even if there is a MeasurementOrFact extension. The MeasurementOrFact extension might be included for reasons other than pulling out the kind of measurements we parsed for VertNet.
It may be of interest that the parser implemented for VertNet has been expanded greatly to extract a much broader set of trait data under the FuTRES project. Though VertNet does not have those capabilities in its production code base, it may be of great interest to pursue these broader trait extraction capabilities.
Thanks, @tucotuco @MattBlissett
Actually, the current version I made uses both the MeasurementOrFact extension and dwc:dynamicProperties: dwc:dynamicProperties is parsed and rerouted to the MeasurementOrFact extension. This part is done, but I don't interpret the values in the MeasurementOrFact extension afterwards; as a first step I just store the raw values.
I'm looking through a broad set of issues around the extraction of traits. What is the current status of accessing records with extracted traits in data downloads, snapshots, search API or gbif.org?
They are shown verbatim on occurrence records only. Search can be done for records having measurementsOrFacts. They aren't included in downloads at the moment.
Thanks @timrobertson100 .
As part of the VertNet feature we need to interpret fields:
And add them into index and hdfs schemas
Use VertNet feature branch