Open lhmarsden opened 1 year ago
Thanks for opening this, @lhmarsden. I'll ping the OBIS group to comment.
One other thing to consider: if we make this change to EMoF, should we also make the same change to MoF?
Haven't thought it through, just thinking out loud, would this be a workable solution for this request https://github.com/tdwg/dwc/issues/362 ?
This request [tdwg/dwc#362] has passed public review and is being prepared for an Executive decision.
This sounds like a generalization of `occurrenceID` in ExtendedMeasurementOrFact. If we add this, then maybe `occurrenceID` should be retired or deprecated? I agree with @albenson-usgs regarding https://github.com/tdwg/dwc/issues/362 but I'm not sure how to reconcile the two proposals.
If there are doubts, I urge you to jump in and question https://github.com/tdwg/dwc/issues/362 before it goes to the ratification process. In the Unified Model, we are proposing to allow Assertions on anything by declaring both the type of thing the Assertion is about (which "table") and the key for the record of that type (the equivalent of relatedResourceID).
AntOBIS supports this proposal.
We have some use cases. For example, the stomach content of a predator in an Occurrence is assessed to determine the fraction of the predator's diet that a prey type made up (by weight). Having this term in eMoF would let us establish the predator-prey relationship more easily, by using the occurrenceID of the prey as `relatedResourceID` for the Measurement of the predator. So, we might still need occurrenceID for eMoF here (I think), unless the dataset has to be published as Occurrence core.
edit: after talking to @pieterprovoost, the relationship probably should be established at Occurrence level (e.g. associatedTaxa or associatedOccurrence)
In dwc:ResourceRelationship, dwc:resourceID is the subject and dwc:relatedResourceID is the object.
Would the resource in a measurement (or fact) be the subject or the object? Would the eMoF document something the resource is doing or something done to the resource?
I have been thinking of the occurrenceID resource of the eMoF as the subject of the measurement and thus better replaced by adding the resourceID term to the eMoF extension?
For the AntOBIS example:
So `resourceID`/`occurrenceID` is the predator's occurrence, because it is the measurement of its stomach content. `relatedResourceID` is the prey's occurrence. I think it is nicer to specify it here than to put a list of prey occurrences under the predator's occurrence (`associatedOccurrence`). And of course, alternatively, we can use the resource relationship extension.
I hope our example makes sense?
Hi,
Do you know how long it is likely to be before I can use (if accepted) resourceID in the emof extension? I have some data to publish, and am wondering if I should proceed with a resourceRelationship extension instead.
Thanks!
@ymgan Would you mind writing out an example? I'm not clear on how the predator/prey problem relates to this proposal. This is how I interpret the current proposal:
record type | subject | predicate | object |
---|---|---|---|
ResourceRelationship | resourceID | relationshipOfResourceID | relatedResourceID |
eMoF | occurrenceID | measurementTypeID | measurementValueID |
eMoF change proposal | resourceID | measurementTypeID | measurementValueID |
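The interpretation in the table can be sketched in a few lines of Python. This is only an illustration: the field names come from the table above, while the `ROLES` mapping, the `as_triple` helper, and the placeholder values `type_x` and `value_y` are hypothetical and not part of any Darwin Core tooling.

```python
# Which field plays which role, per the table above.
# ROLES and as_triple are hypothetical helpers for illustration only.
ROLES = {
    "ResourceRelationship": ("resourceID", "relationshipOfResourceID", "relatedResourceID"),
    "eMoF": ("occurrenceID", "measurementTypeID", "measurementValueID"),
    "eMoF change proposal": ("resourceID", "measurementTypeID", "measurementValueID"),
}


def as_triple(record_type, record):
    """Read a record as a (subject, predicate, object) triple."""
    subject_field, predicate_field, object_field = ROLES[record_type]
    return (
        record.get(subject_field),
        record.get(predicate_field),
        record.get(object_field),
    )


# Placeholder identifiers, not real vocabulary terms.
current = as_triple(
    "eMoF",
    {"occurrenceID": "occ_001", "measurementTypeID": "type_x", "measurementValueID": "value_y"},
)
proposed = as_triple(
    "eMoF change proposal",
    {"resourceID": "occ_001", "measurementTypeID": "type_x", "measurementValueID": "value_y"},
)
```

Under this reading, the change proposal only renames the subject field: both calls yield the same triple.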
I think your interpretation of the proposal is correct, @pieterprovoost. A way of recording measurements related to a materialSample or any other resource.
occurrence

occurrenceID | scientificName | associatedOccurrences |
---|---|---|
occ_001 | Pachyptila belcheri | "predator of" : ["occ_002", "occ_003"] |
occ_002 | Crustacea | |
occ_003 | Euphausia vallentini | |

eMoF

occurrenceID | relatedResourceID | measurementType | measurementValue |
---|---|---|---|
occ_001 | occ_002 | fraction diet by prey items based on regurgitate content | 0.997 |
occ_001 | occ_003 | fraction diet by prey items based on regurgitate content | 0.002 |
It is the measurement of the stomach content of the predator (occ_001), so I think the eMoF records should point to occ_001. Without the `relatedResourceID`, the information about the prey established from the stomach content of the bird is lost unless I use the resourceRelationship extension.
occurrenceID | measurementType | measurementValue |
---|---|---|
occ_001 | fraction diet by prey items based on regurgitate content | 0.997 |
occ_001 | fraction diet by prey items based on regurgitate content | 0.002 |
That is how I look at it, but please correct me if my understanding is wrong.
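To make the point concrete, here is a minimal Python sketch using the rows from the tables above. The `diet_by_predator` helper and the `"unknown prey"` label are hypothetical; the sketch assumes `relatedResourceID` holds the prey's occurrenceID, as in my example.

```python
# Rows copied from the occurrence and eMoF tables above.
occurrences = {
    "occ_001": "Pachyptila belcheri",
    "occ_002": "Crustacea",
    "occ_003": "Euphausia vallentini",
}

emof = [
    {"occurrenceID": "occ_001", "relatedResourceID": "occ_002", "measurementValue": 0.997},
    {"occurrenceID": "occ_001", "relatedResourceID": "occ_003", "measurementValue": 0.002},
]


def diet_by_predator(rows, names):
    """Group diet fractions by predator, naming the prey when it is linked."""
    diet = {}
    for row in rows:
        predator = names[row["occurrenceID"]]
        # Without relatedResourceID there is nothing to look the prey up by.
        prey = names.get(row.get("relatedResourceID"), "unknown prey")
        diet.setdefault(predator, []).append((prey, row["measurementValue"]))
    return diet


with_link = diet_by_predator(emof, occurrences)

# The same rows with the relatedResourceID column dropped.
stripped = [{k: v for k, v in row.items() if k != "relatedResourceID"} for row in emof]
without_link = diet_by_predator(stripped, occurrences)
```

With the link, each fraction is attributed to a prey taxon; without it, the two values remain but can no longer be told apart by prey.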
Edit: looking at this after thinking a little more based on Guillaume's comment:
occurrenceID | scientificName | basisOfRecord | preparations | associatedOccurrences |
---|---|---|---|---|
occ_001 | Pachyptila belcheri | HumanObservation | | "predator of" : ["occ_002", "occ_003"] |
occ_002 | Crustacea | MaterialSample | regurgitate content | |
occ_003 | Euphausia vallentini | MaterialSample | regurgitate content | |

occurrenceID | relatedResourceID | measurementType | measurementValue |
---|---|---|---|
occ_002 | occ_001 | fraction diet based on regurgitate content | 0.997 |
occ_003 | occ_001 | fraction diet based on regurgitate content | 0.002 |
@lhmarsden Replacing `occurrenceID` with `resourceID` has considerable impact on our indexing and is not something that can be achieved in the short term. What we could do is add `resourceID`, keep `occurrenceID` for now, and keep indexing as we do now, taking only `occurrenceID` into account.
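As a rough sketch of what that short-term option could look like to a data consumer (this is not our actual indexing code): rows carry both columns, and only `occurrenceID` is used to attach a measurement to an occurrence. The `occurrence_key` helper, the `sample_001` identifier, and the measurement types are all made up for illustration.

```python
def occurrence_key(row):
    """Return the occurrenceID used for linking, ignoring resourceID entirely."""
    return row.get("occurrenceID") or None


# Hypothetical eMoF rows with both columns present.
rows = [
    {"occurrenceID": "occ_001", "resourceID": "occ_001", "measurementType": "length"},
    {"occurrenceID": "", "resourceID": "sample_001", "measurementType": "weight"},
]

# Rows that can be attached to an occurrence under the current approach.
linked = [row for row in rows if occurrence_key(row)]

# Rows that only carry resourceID stay in the data but are not
# attached to an occurrence by this sketch.
not_linked = [row for row in rows if not occurrence_key(row)]
```

In this sketch, a measurement that only has a `resourceID` is still published, it is just not tied to an occurrence by an `occurrenceID`-only lookup.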
I appreciate that replacing occurrenceID with resourceID would be a big change. Adding resourceID as an extra would be a suitable short term solution in my opinion.
Luke
Hi,
We actually proposed another way to deal with such an example. We had similar issues while identifying pathogens within another species. (Applying Darwin core data standard to wildlife disease – advancements toward a new data model). See also #413.
Using this parentOccurenceID term, it would result in
occurrenceID | parentOccurenceID | scientificName | basisOfRecord | preparation |
---|---|---|---|---|
occ_001 | | Pachyptila belcheri | human observation | |
occ_002 | occ_001 | Crustacea | material sample | regurgitate content |
occ_003 | occ_001 | Euphausia vallentini | material sample | regurgitate content |

measurementID | occurrenceID | measurementType | measurementValue |
---|---|---|---|
mea_001 | occ_002 | fraction diet | 0.997 |
mea_002 | occ_003 | fraction diet | 0.002 |
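A minimal Python sketch of how a consumer could recover the predator-prey fractions from these two tables. It assumes the proposed parentOccurenceID term (see #413), and the `predator_prey_fractions` helper is hypothetical, written only to illustrate the lookup.

```python
# Rows copied from the two tables above; parentOccurenceID is a proposed term.
occurrences = {
    "occ_001": {"parentOccurenceID": None, "scientificName": "Pachyptila belcheri"},
    "occ_002": {"parentOccurenceID": "occ_001", "scientificName": "Crustacea"},
    "occ_003": {"parentOccurenceID": "occ_001", "scientificName": "Euphausia vallentini"},
}

measurements = [
    {"measurementID": "mea_001", "occurrenceID": "occ_002",
     "measurementType": "fraction diet", "measurementValue": 0.997},
    {"measurementID": "mea_002", "occurrenceID": "occ_003",
     "measurementType": "fraction diet", "measurementValue": 0.002},
]


def predator_prey_fractions(measurements, occurrences):
    """Follow measurement -> prey occurrence -> parent (predator) occurrence."""
    result = []
    for measurement in measurements:
        prey = occurrences[measurement["occurrenceID"]]
        parent_id = prey["parentOccurenceID"]
        predator = occurrences[parent_id]["scientificName"] if parent_id else None
        result.append((predator, prey["scientificName"], measurement["measurementValue"]))
    return result


fractions = predator_prey_fractions(measurements, occurrences)
```

Here the measurement points at the prey occurrence and the predator is reached through the parent link, so no extra column is needed in the eMoF table.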
That seems to work!! Thank you very much for taking your time to write this down @guillaumebody !! I appreciate it!
You're welcome @ymgan , but please, indicate that this is a relevant solution for your situation in #413. The parentOccurenceID is currently not an accepted term of DwC.
@tucotuco this situation pleads for having a "parentAssertionID" alongside the "relatedAssertionID" concept in the new GBIF model.
The extendedMeasurementOrFacts extension is a very useful way to record measurements or facts related to an occurrence or event in a standardised, potentially machine-readable way.
However, one might have measurements or facts related to a range of different things. For example, I work with many biologists who take measurements of materials or samples they are logging in a Material Sample Extension.
In this pull request, I am suggesting that the relatedResourceID term is added to the extendedMeasurementOrFact extension - after some discussion with @dagendresen. https://github.com/gbif/rs.gbif.org/pull/102
One could use this to record measurements related not only to material samples, but anything else, without the need of a resourceRelationship extension.