Open iDigBioBot opened 6 years ago
TestField | Value |
---|---|
GUID | 96667a0a-ae59-446a-bbb0-b7f2b0ca6cf5 |
Label | AMENDMENT_OCCURRENCESTATUS_ASSUMEDDEFAULT |
Description | Proposes an amendment of the value of dwc:occurrenceStatus to the default parameter value if dwc:occurrenceStatus, dwc:individualCount and dwc:organismQuantity are empty. |
TestType | Amendment |
Darwin Core Class | dwc:Occurrence |
Information Elements ActedUpon | dwc:occurrenceStatus |
Information Elements Consulted | dwc:individualCount, dwc:organismQuantity |
Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:occurrenceStatus is bdq:NotEmpty; FILLED_IN the value of dwc:occurrenceStatus using the bdq:defaultOccurrenceStatus Parameter value if dwc:occurrenceStatus, dwc:individualCount and dwc:organismQuantity are bdq:Empty; otherwise NOT_AMENDED |
Data Quality Dimension | Completeness |
Term-Actions | OCCURRENCESTATUS_ASSUMEDDEFAULT |
Parameter(s) | bdq:defaultOccurrenceStatus |
Source Authority | bdq:defaultOccurrenceStatus default = "present" |
Specification Last Updated | 2024-11-13 |
Examples | [dwc:occurrenceStatus="", dwc:individualCount="", dwc:organismQuantity="": Response.status=FILLED_IN, Response.result=dwc:occurrenceStatus="present", Response.comment="dwc:occurrenceStatus is bdq:Empty; assumed "Present""] [dwc:occurrenceStatus="X", dwc:individualCount="10", dwc:organismQuantity="": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:occurrenceStatus is bdq:NotEmpty"] |
Source | ALA |
References | |
Example Implementations (Mechanisms) | |
Link to Specification Source Code | |
Notes | There is currently a mismatch between https://dwc.tdwg.org/terms/#dwc:occurrenceStatus recommended values and the vocabulary at bdq:sourceAuthority that we are using (https://api.gbif.org/v1/vocabularies/OccurrenceStatus/concepts) |
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Widespread assumption in vouchered occurrence data. Likely to be important when aggregating with any data with 'absent' values. However, this is an amendment: a value of "present" is being provided for dwc:occurrenceStatus when dwc:occurrenceStatus is either empty or not present.
Why could this not be incorporated under https://github.com/tdwg/bdq/issues/115 AMENDMENT_OCCURRENCESTATUS_STANDARDIZED?
That I would tend to agree with as we could easily add this special case. Comments from others?
In retrospect, if we are going to effectively treat EMPTY or an uninterpretable value as "present" then it is indeed an amendment. Sigh.
Agreed at TDWG 2018 DQIG meeting that this amendment can only be applied if the value of dwc:occurrenceStatus is empty.
I wonder if we should change the name of this test to AMENDMENT_OCCURRENCESTATUS_ASSUMEDDEFAULT to parallel #102. Any comments? I realise the default has only the one possible value - i.e. "Present" but I am attempting to reduce the things we have to define.
After reviewing all, I'd agree. Changed accordingly
An inconsistency snuck in somewhere in the editing history: this is now clearly labeled as an amendment, but retains an output type of Notification. Changing this to Amendment for consistency.
Changed "AMENDED" to "FILLED_IN" in accordance with discussions April 16. I also moved the INTERNAL_PREREQUISITES_NOT_MET test into the FILLED_IN part as this aligns with similar amendments.
Edited Example 2 as there is no "INTERNAL_PREREQUISITES_NOT_MET". There was an error in the test data, now fixed.
[dwc:occurrenceStatus="X": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:occurrenceStatus is not EMPTY"]
Hey, out of curiosity, may I know why the amendment of occurrenceStatus (to default = "present") does not consider the value of individualCount or organismQuantity and organismQuantityType?
I am thinking that it is possible to have a situation where occurrenceStatus is empty but individualCount = 0. Please see:
Thank you!!
Edit: sorry, I noticed that I made a mistake in this comment
@ymgan I think you are absolutely right. The two terms individualCount and organismQuantity should be taken into account.
Surely it doesn't matter if dwc:occurrenceStatus is EMPTY and dwc:individualCount or dwc:organismQuantity or dwc:organismQuantityType have values? dwc:occurrenceStatus will still be set to "present", which is correct.
It does, because if there is already something in the field then you'd not do anything (interestingly if the dwc:occurrenceStatus says "absent" and you have something in the other fields, then there is a problem!)
This is possibly a test that needs revisiting and expanding - because if there is other stuff in that field then it probably needs to be AMENDED - e.g. if it has a count (5) then it probably should be changed to "present" etc. or do we have another test for that? - if we don't, perhaps we should - or modify this one.
On Sun, 26 Feb 2023 16:21:20 -0800, Arthur Chapman wrote:
> This is possibly a test that needs revisiting and expanding - because if there is other stuff in that field then it probably needs to be AMENDED - e.g. if it has a count (5) then it probably should be changed to "present" etc. or do we have another test for that? - if we don't, perhaps we should - or modify this one.
Also need a validation to compare occurrenceStatus with dwc:individualCount and dwc:organismQuantity, and a separate amendment to amend occurrenceStatus from dwc:individualCount and dwc:organismQuantity.
Not sure to what extent AMENDMENT_OCCURRENCESTATUS_ASSUMEDDEFAULT should examine other fields, our usual pattern for assumeddefault does not entail other fields.
@chicoreus - the only reason here for the other fields is that if there is nothing in those fields you cannot default to "present", because in that case it could be "absent".
Thank you everyone! Apologies, as I realized I made a mistake in the comment, which is now corrected. What I was referring to was a scenario like this:
individualCount | occurrenceStatus | inferred occurrenceStatus | flag |
---|---|---|---|
0 | NULL | ABSENT | OCCURRENCE_STATUS_INFERRED_FROM_INDIVIDUAL_COUNT |
It is from the comment in https://github.com/gbif/pipelines/issues/268#issuecomment-624755278. Under such a condition, this test in its current state will amend occurrenceStatus to present, which is perhaps undesirable.
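The scenario above can be reproduced with a minimal sketch. The function name `assumed_default_status_only` is hypothetical and only mimics the behaviour described in this comment (checking dwc:occurrenceStatus alone); it is not taken from any existing implementation.

```python
def assumed_default_status_only(record, default="present"):
    """Hypothetical sketch of the test as it stood at this point in the thread:
    only dwc:occurrenceStatus is inspected before proposing the default."""
    status = record.get("dwc:occurrenceStatus")
    if status is None or status.strip() == "":
        return {"dwc:occurrenceStatus": default}
    return None  # nothing proposed when a value is already present


# The record from the table above: a count of zero with no occurrenceStatus.
record = {"dwc:occurrenceStatus": "", "dwc:individualCount": "0"}
proposal = assumed_default_status_only(record)
# 'proposal' asserts presence even though individualCount is 0.
```

Because the count is never consulted, the sketch proposes "present" for a record that GBIF's interpretation would treat as an absence.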
@Tasilee This looks like an issue that we need to put on agenda. As in @chicoreus comment - do we need new tests and from @ymgan comment - this one may not work as is.
> Also need a validation to compare occurrenceStatus with dwc:individualCount and dwc:organismQuantity, and a separate amendment to amend occurrenceStatus from dwc:individualCount and dwc:organismQuantity.
> Not sure to what extent AMENDMENT_OCCURRENCESTATUS_ASSUMEDDEFAULT should examine other fields, our usual pattern for assumeddefault does not entail other fields.
I don't think an amendment can be made without considering the consistency of the rest of the terms that affect the assertion of absence of detection. The test as it stands is basically for the one case where all of those other fields are EMPTY.
Following up on @tucotuco comment - we have to reword the ER and add more Information Elements to either
On Sat, 11 Mar 2023 15:32:08 -0800, Arthur Chapman wrote:
> - more complicated to take into account one or other of dwc:individualCount or dwc:organismQuantity having values - if both are EMPTY, or either has a value >0 then dwc:occurrenceStatus is amended to "present", but if either dwc:individualCount or dwc:organismQuantity has a value of "0" then dwc:occurrenceStatus is amended to "absent" BUT what if one of those = 0 and one is +ve ? would that make an INTERNAL_PREREQUISITES_NOT_MET?
This is phrasing a new and different amendment. Assumed default takes an empty value and stamps a default value into it. The most complex this can be and retain its intent is to check if occurrence status, individual count and organism quantity are all empty, and if so propose an amendment to occurrence status of present.
Any more complex logic, and we are talking about a test in the form amendment_occurrencestatus_fromquantity or something like that.
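For contrast, a rough sketch of what such a separate from-quantity test might look like is given below. Everything here is hypothetical: the function names, the decision to treat non-numeric quantities as uninformative, and the choice to make no proposal when one quantity is zero and the other positive (the thread leaves that case unresolved). The both-empty case is deliberately left to the assumed-default test.

```python
from typing import Optional


def _parse_quantity(value: Optional[str]) -> Optional[float]:
    """Return the numeric value, or None when empty or non-numeric (an assumption)."""
    if value is None or value.strip() == "":
        return None
    try:
        return float(value)
    except ValueError:
        return None


def occurrencestatus_fromquantity(
    individual_count: Optional[str], organism_quantity: Optional[str]
) -> Optional[str]:
    """Hypothetical: propose 'present' or 'absent' from the quantities, else None."""
    parsed = (_parse_quantity(individual_count), _parse_quantity(organism_quantity))
    quantities = [q for q in parsed if q is not None]
    has_zero = any(q == 0 for q in quantities)
    has_positive = any(q > 0 for q in quantities)
    if has_zero and has_positive:
        return None  # conflicting evidence: unresolved in the discussion
    if has_zero:
        return "absent"
    if has_positive:
        return "present"
    return None  # nothing usable to infer from
```

Keeping this logic out of the assumed-default test preserves the usual pattern for that family of tests: an empty value is stamped with a default and nothing else is inferred.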
I agree with @chicoreus about qualifying the current test to check if both dwc:individualCount and dwc:organismQuantity are EMPTY. From @tucotuco's comment https://github.com/tdwg/bdq/issues/75#issuecomment-1464802733, I assume we have agreement?
Agreed - this simplifies it. @ymgan does this satisfy your issues?
Amended the ER to
FILLED_IN the value of dwc:occurrenceStatus using the Parameter value if dwc:occurrence.Status, dwc:individualCount and dwc:organismQuantity are EMPTY; otherwise NOT_AMENDED
> Agreed - this simplifies it. @ymgan does this satisfy your issues?
Yes, thank you very much for your hard work here! I really appreciate it!
I have updated the Description and the Examples accordingly and will amend the test data.
I have added dwc:individualCount and dwc:organismQuantity to the Information Elements.
Restructured Parameter(s) and Source authority
Change sourceAuthority from "dwc:occurrenceStatus = "present"" to "dwc:occurrenceStatus default = "present""
Changed all Information Elements to "ActedUpon" as per Paul's Java Code.
@chicoreus: You will need to amend your code to include dwc:individualCount and dwc:organismQuantity ?
The parameter can't be the same as an information element.
Propose changing the parameter from dwc:occurrenceStatus to bdq:defaultOccurrenceStatus
Thanks @chicoreus - that seems a reasonable solution to me. Amending.
Changed Expected Response from
FILLED_IN the value of dwc:occurrenceStatus using the Parameter value if dwc:occurrence.Status, dwc:individualCount and dwc:organismQuantity are EMPTY; otherwise NOT_AMENDED
to
FILLED_IN the value of dwc:occurrenceStatus using the Parameter value if dwc:occurrenceStatus, dwc:individualCount and dwc:organismQuantity are EMPTY; otherwise NOT_AMENDED
May I know if we need a VALIDATION_ORGANISMQUANTITY_NOTEMPTY please? We already have
Thanks @ymgan - #232 is Supplementary at this stage and another test for VALIDATION_ORGANISMQUANTITY_NOTEMPTY could be valuable for some, but at this stage we don't think it is widely applicable. But it is certainly one worth considering in the future if required. There are quite a few tests in a similar position that we don't believe are CORE.
Thanks @ArthurChapman !! Good morning :D To make sure that I understand: even if this test is core and its prerequisites include individualCount and organismQuantity being empty, it does not mean we need the notempty tests for individualCount and organismQuantity. Am I correct?
On Mon, 19 Aug 2024 07:10:09 -0700, Yi-Ming Gan wrote:
> Thanks @ArthurChapman !! Good morning :D To make sure that I understand, even if this test is core and its prerequisite include individualCount and organismQuantity are empty, it does not mean we need the notempty tests for individualCount and organismQuantity. Am I correct?
Non-empty tests for individualCount and organismQuantity would be aspirational at this point.
If we adopted them we would be asserting that these terms would be important enough for everyone to try to put in the effort to populate them. For natural science collections data at least, this would be non-trivial: collections may know how many parts they have for some specimen, but not be readily able to work out how many individuals those represent.
So, yes, these do make natural related tests, but not really within the scope of what we want to accomplish. Others, for whom quality in this portion of the data matters, can easily propose a use case and suite of tests.
got it, thanks @chicoreus !
This and #102 are "AssumedDefault" tests for which non-empty values aren't preventing execution of the test, that should probably both have the internal prerequisites clause removed and be able to reach the NOT_AMENDED clause:
FILLED_IN the value of dwc:occurrenceStatus using the Parameter value if dwc:occurrenceStatus, dwc:individualCount and dwc:organismQuantity are bdq:Empty; otherwise NOT_AMENDED
I agree with removing the INTERNAL_PREREQUISITES_NOT_MET phrase on this Test.
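The Expected Response as worded in this comment can be sketched as follows. This is an illustration under stated assumptions, not the reference implementation: the function name, the `Response` class, and the `is_empty` helper are hypothetical, and treating whitespace-only strings as bdq:Empty is an assumption of the sketch. The default of "present" follows the Source Authority row of the specification table.

```python
from dataclasses import dataclass, field
from typing import Optional


def is_empty(value: Optional[str]) -> bool:
    """Assumption: None and whitespace-only strings count as bdq:Empty."""
    return value is None or value.strip() == ""


@dataclass
class Response:
    status: str
    result: dict = field(default_factory=dict)
    comment: str = ""


def amendment_occurrencestatus_assumeddefault(
    occurrence_status: Optional[str],
    individual_count: Optional[str],
    organism_quantity: Optional[str],
    default_occurrence_status: str = "present",  # bdq:defaultOccurrenceStatus
) -> Response:
    # FILLED_IN only when all three terms are empty; otherwise NOT_AMENDED.
    if all(is_empty(v) for v in (occurrence_status, individual_count, organism_quantity)):
        return Response(
            "FILLED_IN",
            {"dwc:occurrenceStatus": default_occurrence_status},
            "dwc:occurrenceStatus, dwc:individualCount and dwc:organismQuantity are bdq:Empty",
        )
    return Response("NOT_AMENDED", {}, "at least one consulted term is bdq:NotEmpty")
```

Called with the two examples from the specification table, the sketch returns FILLED_IN for the all-empty record and NOT_AMENDED for `occurrence_status="X"`, `individual_count="10"`. A record with an empty status but `individual_count="0"` is also left unamended, which addresses the scenario raised earlier in the thread.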
Do we change default to "Present" for now as "present" won't currently validate against the GBIF vocabulary?
On Mon, 11 Nov 2024 18:48:40 -0800, Lee Belbin wrote:
> Do we change default to "Present" for now as "present" won't currently validate against the GBIF vocabulary?
Yes. Good catch. Probably worth adding a note as well.
Added to Notes: "There is currently a mismatch between https://dwc.tdwg.org/terms/#dwc:occurrenceStatus recommended values and the vocabulary at bdq:sourceAuthority that we are using (https://api.gbif.org/v1/vocabularies/OccurrenceStatus/concepts)"
Corrected the parameter namespace for bdq:defaultOccurrenceStatus from dwc to bdq.