tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
https://dwc.tdwg.org
Creative Commons Attribution 4.0 International

New Term - parentMeasurementID #362

Closed guillaumebody closed 1 year ago

guillaumebody commented 3 years ago

New term : parentMeasurementID

Proposed attributes of the new term:

tucotuco commented 3 years ago

This looks like a valuable generic way to extend MeasurementOrFacts.

The definition suggests ("group this and potentially other Measurements or fact") that the term might be used in ways other than the use case described in the Efficacy Justification (measurements of measurements). Do you envision other uses? And can you give examples?

2021-07-27 I retract the following opinion based on this commentary. - JRW

I am a bit concerned about the note. As implemented in Darwin Core Archives, the basisOfRecord term is only usable in Occurrence Core records, and it has a recommended vocabulary. There does not seem to be a viable way to use basisOfRecord here. However, "statistical estimation" might be plausible as part of the vocabulary used in dwc:measurementType. I say "part of" because it would not be sufficient on its own; it would have to be "statistical estimation of something".

dr-shorthair commented 3 years ago

Are measurements that share a common parent all siblings?

guillaumebody commented 3 years ago

This term would allow recording sibling measurements of a parent measurement. For instance, in a roe deer density estimation one could record:

Event 1: the area of the study
Occurrence 1: the species and the time period, and the basisOfRecord "statistical estimation"
Measurement 1: measurementType = density ; measurementValue = 15 ; measurementUnit = individual per kilometer square
Measurement 1-1: measurementType = standard deviation ; measurementValue = 3.2 ; measurementUnit = individual per kilometer square
Measurement 1-2: measurementType = distribution ; measurementValue = gaussian
Measurement 1-3: measurementType = confidence interval ; measurementValue = 9|21 ; measurementUnit = individual per kilometer square
Measurement 1-3-1: measurementType = confidence level ; measurementValue = 95 ; measurementUnit = percentage

Measurements 1-1, 1-2 and 1-3 are indeed siblings, and they describe their parent, the density estimate itself. If you remove the measurements introduced by this new term, you are left with what Darwin Core currently allows.
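As a rough illustration only, the roe deer example above could be held as flat rows and queried through the proposed link. This is a minimal sketch: `parentMeasurementID` is the term proposed in this issue (not a ratified Darwin Core term), the measurement identifiers reuse the labels from the example, and the helper names `children_of` and `top_level` are hypothetical.

```python
# Flat measurement rows for the roe deer density example.
# "parentMeasurementID" is the PROPOSED term from this issue, not a ratified one.
UNIT = "individual per kilometer square"

measurements = [
    {"measurementID": "1", "parentMeasurementID": None,
     "measurementType": "density", "measurementValue": "15", "measurementUnit": UNIT},
    {"measurementID": "1-1", "parentMeasurementID": "1",
     "measurementType": "standard deviation", "measurementValue": "3.2", "measurementUnit": UNIT},
    {"measurementID": "1-2", "parentMeasurementID": "1",
     "measurementType": "distribution", "measurementValue": "gaussian", "measurementUnit": None},
    {"measurementID": "1-3", "parentMeasurementID": "1",
     "measurementType": "confidence interval", "measurementValue": "9|21", "measurementUnit": UNIT},
    {"measurementID": "1-3-1", "parentMeasurementID": "1-3",
     "measurementType": "confidence level", "measurementValue": "95", "measurementUnit": "percentage"},
]


def children_of(rows, parent_id):
    """Return the IDs of the measurements whose parent is parent_id (siblings of each other)."""
    return [r["measurementID"] for r in rows if r["parentMeasurementID"] == parent_id]


def top_level(rows):
    """Return the IDs of measurements with no parent, i.e. what Darwin Core can already express."""
    return [r["measurementID"] for r in rows if r["parentMeasurementID"] is None]


print(children_of(measurements, "1"))    # siblings describing the density estimate
print(children_of(measurements, "1-3"))  # measurement describing the confidence interval
print(top_level(measurements))
```

Dropping every row that has a parent leaves only the density value itself, which matches the remark that the existing standard is recovered when the new term is not used.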

The definition is very similar to the definition of parentEventID, and the use is indeed similar, except that it applies to a measurement or fact instead of an Event. In this density estimation dataset, neither a human nor a machine has directly observed a roe deer. Those observations would be found in the raw data dataset. Here, the "presence" of roe deer is only the result of running statistical software. This is even clearer if you think about a dataset based on "probability of presence", such as the results of a habitat suitability statistical procedure. It also makes it possible to differentiate "expert knowledge" of density, which is a "human observation", from statistical estimation, without changing the measurementType: "density".

tucotuco commented 3 years ago

Thank you for this example @guillaumebody. Now that I see better what you are trying to do, I retract my comment. The Occurrence records in the Occurrence extension can each bear a basisOfRecord, so the remaining issue would be to create a new class term proposal for something like StatisticalEstimation to accompany the existing Occurrence types (PreservedSpecimen, LivingSpecimen, FossilSpecimen, MachineObservation, HumanObservation, MaterialCitation).

albenson-usgs commented 1 year ago

The OBIS Secretariat and nodes have reviewed the proposal and while we do not have an immediate use case to apply it to, we can see it being a valuable addition to the MoF extensions. If ratified as a new term, OBIS will ensure it's added to the extended measurement or fact extension.

pieterprovoost commented 1 year ago

Hi all, I would like to bring to your attention https://github.com/gbif/rs.gbif.org/issues/103 which proposes to add dwc:relatedResourceID (or rather dwc:resourceID) to the ExtendedMeasurementOrFact extension. As @albenson-usgs pointed out, adding this term to the MeasurementOrFact extension would probably address the parent measurement issue discussed here as well.

guillaumebody commented 1 year ago

Hi Pieter, this term would indeed technically do the job. In my view, however, there is a clear difference between a "relatedID" and a "parentID".

The parentID (whether for Event, Occurrence, Measurement, ...) is a clear indication of nested records, a "within" term. Through relatedResourceID, you can link very different pieces of information that share very different relationships. Merging the two will inevitably end in confusion.

For instance, you could have estimates of population density obtained through two methods: method 1 giving 10 (95% CI 8-12) and method 2 giving 12 (95% CI 9-15).

| MeasurementID | parentMeasurementID | relatedResourceID | measurementType | measurementValue | measurementUnit |
| --- | --- | --- | --- | --- | --- |
| uuid_1 | | uuid_2 | density | 10 | individual per kilometer square |
| uuid_11 | uuid_1 | | x_0.025 | 8 | individual per kilometer square |
| uuid_12 | uuid_1 | | x_0.975 | 12 | individual per kilometer square |
| uuid_2 | | uuid_1 | density | 12 | individual per kilometer square |
| uuid_21 | uuid_2 | | x_0.025 | 9 | individual per kilometer square |
| uuid_22 | uuid_2 | | x_0.975 | 15 | individual per kilometer square |

If needed, you can add a reciprocal relatedResourceID between uuid_1 and uuid_2 to indicate that they are estimates of the same quantity, or a relatedResourceID pointing to the probability density plot of each estimate, without mixing this with the structure of your data. Of course, a generic parentResourceID would work well alongside a generic relatedResourceID.
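To make the distinction concrete, here is a minimal sketch that reads the two kinds of link from the six rows of the example separately. It assumes that, in the rows for uuid_1 and uuid_2, the second identifier belongs to relatedResourceID, and that in the other rows it belongs to parentMeasurementID. The function names `build_tree` and `related_pairs` are hypothetical, and `parentMeasurementID` is the proposed term, not a ratified one.

```python
from collections import defaultdict

# (measurementID, parentMeasurementID, relatedResourceID, measurementType, measurementValue)
# Column assignment is an assumption based on the example rows in this comment.
rows = [
    ("uuid_1",  None,     "uuid_2", "density", 10),
    ("uuid_11", "uuid_1", None,     "x_0.025", 8),
    ("uuid_12", "uuid_1", None,     "x_0.975", 12),
    ("uuid_2",  None,     "uuid_1", "density", 12),
    ("uuid_21", "uuid_2", None,     "x_0.025", 9),
    ("uuid_22", "uuid_2", None,     "x_0.975", 15),
]


def build_tree(rows):
    """Map each parent ID to its nested children, using only the parent link."""
    children = defaultdict(list)
    for measurement_id, parent_id, _related, _type, _value in rows:
        if parent_id is not None:
            children[parent_id].append(measurement_id)
    return dict(children)


def related_pairs(rows):
    """Collect unordered pairs linked by relatedResourceID, ignoring nesting entirely."""
    pairs = set()
    for measurement_id, _parent, related_id, _type, _value in rows:
        if related_id is not None:
            pairs.add(tuple(sorted((measurement_id, related_id))))
    return sorted(pairs)


print(build_tree(rows))     # nesting: each density estimate with its interval bounds
print(related_pairs(rows))  # cross-links: the two estimates of the same quantity
```

Keeping the two lookups separate mirrors the argument above: the parent link carries the "within" structure, while the related link can carry any other relationship without disturbing it.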