Closed by Tasilee 9 months ago
TestField | Value |
---|---|
GUID | e17918fc-25ca-4a3a-828b-4502432b98c4 |
Label | VALIDATION_MODIFIED_NOTEMPTY |
Description | Is there a value in dcterms:modified? |
TestType | Validation |
Darwin Core Class | dcterms |
Information Elements ActedUpon | dcterms:modified |
Information Elements Consulted | |
Expected Response | COMPLIANT if dcterms:modified is bdq:NotEmpty; otherwise NOT_COMPLIANT |
Data Quality Dimension | Completeness |
Term-Actions | DCTERMSMODIFIED_NOTEMPTY |
Parameter(s) | |
Source Authority | |
Specification Last Updated | 2024-01-29 |
Examples | [dcterms:modified="2022-01-02": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dcterms:modified is bdq:NotEmpty"], [dcterms:modified="[null]": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dcterms:modified is bdq:Empty"] |
Source | TG2 |
References | |
Example Implementations (Mechanisms) | |
Link to Specification Source Code | |
Notes | This bdq:Supplementary test is not regarded as CORE (cf. bdq:CORE) because of one or more of the following reasons: not being widely applicable; not informative; not straightforward to implement; or likely to return a high percentage of either bdq:COMPLIANT or bdq:NOT_COMPLIANT results (cf. bdq:Response.result). A Supplementary test may be implemented as CORE when a suitable use case exists. See Issue comments below. |
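For illustration only, the Expected Response in the table above could be sketched in Python roughly as follows. This is not a reference implementation: the `Response` class and the function names are hypothetical, and `is_empty` assumes that bdq:Empty covers a null value or a string containing only whitespace, with `None` standing in for the `[null]` notation used in the Examples row.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Response:
    """Hypothetical container mirroring Response.status/result/comment."""
    status: str
    result: str
    comment: str


def is_empty(value: Optional[str]) -> bool:
    # Assumption: bdq:Empty means null (None here) or whitespace only.
    return value is None or value.strip() == ""


def validation_modified_notempty(modified: Optional[str]) -> Response:
    # COMPLIANT if dcterms:modified is bdq:NotEmpty; otherwise NOT_COMPLIANT.
    if is_empty(modified):
        return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                        "dcterms:modified is bdq:Empty")
    return Response("RUN_HAS_RESULT", "COMPLIANT",
                    "dcterms:modified is bdq:NotEmpty")
```

Note that the test only checks for the presence of a value; whether that value is a well-formed date is outside its scope.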
This feels like it should be core, and we should take a position that dcterms:modified should always contain a value.
Interesting - are people using it? It should be automatically generated. Is this one that, if CORE, would have a near 100% failure rate?
@ArthurChapman I would expect this is a value that aggregators really want to have populated, so that they can tell when they need to update their aggregated records from changed data in the source without having to examine all the provided values against their stored values. If modified is newer, then update; if modified is not newer, then the aggregator may either trust that assertion or compare the records. For complex chains where data is being combined from more than one aggregation source, modified is an indication of which record to use when the same record is provided from more than one path...
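A rough sketch of that update decision, purely as an illustration: the function names and the returned action labels are made up for this example, and it assumes dcterms:modified values that Python's `datetime.fromisoformat` can parse. Anything missing, unparseable, or not comparable falls back to a full record comparison.

```python
from datetime import datetime
from typing import Optional


def parse_modified(value: Optional[str]) -> Optional[datetime]:
    # Return None for empty or unparseable values rather than guessing.
    if value is None or not value.strip():
        return None
    try:
        return datetime.fromisoformat(value.strip())
    except ValueError:
        return None


def refresh_action(stored: Optional[str], incoming: Optional[str]) -> str:
    """Decide what to do with an incoming record (hypothetical labels)."""
    old, new = parse_modified(stored), parse_modified(incoming)
    if old is None or new is None:
        return "compare"  # no usable timestamp: compare the full records
    try:
        newer = new > old
    except TypeError:
        # One value has a timezone offset and the other does not.
        return "compare"
    # Not newer: the aggregator may trust the assertion or compare anyway.
    return "update" if newer else "trust_or_compare"
```

For example, `refresh_action("2022-01-02", "2023-05-01")` returns `"update"`, while a record with no stored or incoming value returns `"compare"`.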
I've asked @ArthurChapman to generate some definitions of "Supplementary" and "Do not implement" because we need to be clear on the differences. For example, on this test, @chicoreus states that it is 'aspirational' because the benefits of entries can be appreciated, even though it will be rarely populated. But, @chicoreus says that #233 should be 'do not implement' because the field will be largely unpopulated, and one presumes, 'not aspirational', or 'complex/impossible to implement'.
Hence, we need a clear statement (Vocabulary at least) on what "Supplementary" and "Do not implement" mean, and I agree with Arthur that reasoning should be added to the Notes to make it clear why we tag the test as such.
> Interesting - are people using it? It should be automatically generated. Is this one that, if CORE, would have a near 100% failure rate?
545407660 out of 2232326955 Occurrence records (24.4%) from data publishers aggregated in GBIF (2023-08-01) have a populated dcterms:modified field. 10 data publishers account for 14.6% of that 24.4%.
I don't think it is realistic for most data providers to provide a useful dcterms:modified value and I think aggregators are fine with this. It is easier to run all data in a dataset through a pipeline if that dataset gets a new version. In fact, it is necessary, since all of the taxonomy rectification that happens is a source of modification for the aggregated record, and is uncoupled from the dataset. It pretty much has to be done.
So, I think it is a myth that the aggregators will benefit from this field. Whatever SHOULD be the case, it is not realistic.
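As a quick check of the GBIF figures quoted above, the populated share can be recomputed from the two record counts. The variable names are only for this example.

```python
# Record counts quoted in the comment above (GBIF, 2023-08-01).
populated = 545_407_660   # records with a populated dcterms:modified
total = 2_232_326_955     # all aggregated Occurrence records

share = populated / total
not_compliant_share = 1 - share  # records that would fail this test

print(f"populated: {share:.1%}, empty: {not_compliant_share:.1%}")
```

On those counts roughly three quarters of records would return NOT_COMPLIANT, which is high but well short of a near 100% failure rate.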
In light of @tucotuco's comment, I suggest that we do not include this test. Either remove the "Supplementary" tag and close the issue, or tag it as "DO NOT IMPLEMENT".
I would leave it as Supplementary. There is nothing particularly difficult or controversial about its implementation, which is what I think the DO NOT IMPLEMENT label is meant to signify. It's just that it isn't a particularly useful test on a global scale. This could be different for a specific use case.
Changed title/label to be consistent with other tests of dcterms:modified #272, #273, #274