Closed Tasilee closed 2 weeks ago
I wouldn't have "TG3" in the definition. Something like:
Tests of biodiversity data (represented by Darwin Core term values) that have been informed by use cases, that are widely applicable, informative in terms of fitness-for-use (or lack of it), and are readily implementable without ambiguity. Tests that are not considered CORE are one of: bdqtag:Supplementary, bdqtag:NO NOT IMPLEMENT, or bdqtag:Immature/Incomplete.
@ArthurChapman That's clear. One typo, s/NO NOT IMPLEMENT/DO NOT IMPLEMENT. We should also include the word mature, as immature tests can meet the other criteria of this sense of CORE.
We will now need to be explicit about UseCases, sensu the framework, and assign each CORE test to at least one UseCase.
We've got one clear UseCase: "Research analysis of biodiversity occurrence data: which organisms have been reported to occur where, at what dates". We need to be similarly explicit about a UseCase, probably covering checklist data, that incorporates the latest set of CORE tests for pathway, establishmentMeans, and degreeOfEstablishment, and frame a use case that incorporates type status validation, probably a taxonomic data use case, which would then incorporate validation of recordedBy (which I think we've currently got as Supplementary).
But, the text for the CORE project here states that it is: "Links to Confirmed Tests and Assertions arising out of Task Group 2." If we expand CORE beyond this, we still have to define the new use cases in detail, likely a several-year task. The best route is to keep CORE to the research uses of "what organisms occurred where, when" use case that came out of TG2, and specify an additional category of tests that aren't CORE. But if they are being put forward as part of the standard, they will need to have use cases to hang them off of.
I would like to propose an alternative approach as none of us can afford years more of working on this before proposing a standard. I want to repeat my plea for simplicity and a decoupling of tests from use cases.
I see the set of tests as parallel to the Darwin Core bag of terms, and the use cases as parallel to the distinct Darwin Core Archive cores in terms of combining Darwin Core terms for a particular purpose. I think the tests should not depend on use cases. It should be the other way around. I think a use case should be a level of construct (a profile, we called it before) that brings together a set of tests on a declared set of Darwin Core terms and that can declare data quality measures based on their values.
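The profile idea above could be sketched as a simple data structure: a use case declares its Darwin Core terms, the tests it draws from the shared pool, and the measures over their results. All names below (the class name, term strings, and test labels) are illustrative assumptions, not proposed standard terms.

```python
# Illustrative sketch only: a "profile" bundling Darwin Core terms,
# tests, and measures for one use case. Not a proposed standard structure.
from dataclasses import dataclass, field

@dataclass
class UseCaseProfile:
    label: str
    darwin_core_terms: set[str]      # terms the use case declares
    tests: set[str]                  # tests selected from the grab bag
    measures: set[str] = field(default_factory=set)  # measures over results

occurrence = UseCaseProfile(
    label="Research uses of what organisms occurred where, when",
    darwin_core_terms={"dwc:eventDate", "dwc:decimalLatitude", "dwc:country"},
    tests={"VALIDATION_EVENTDATE_STANDARD", "VALIDATION_COUNTRY_FOUND"},
    measures={"MEASURE_EVENTDATE_DURATIONINSECONDS"},
)
print(occurrence.label)
```

On this model, the grab bag of tests stays independent; only the profile couples tests to a purpose.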
With this approach we could use the one occurrence use case based on the TG3 work as a model to show how that is done, and let future work define new use cases as demand arises. These could be the stuff of task groups, and would be more tractable the less monolithic we make the standard. This is how I have thought about the BDQ work since the beginning, and ever more so now that the tribulations of coupled tests and use cases are creating a seemingly insurmountable obstacle.
This would leave us free of the otherwise somewhat arbitrary and controversial distinctions of CORE, SUPPLEMENTARY and DO NOT IMPLEMENT. The ones we finalize for the use case would become part of the standard set of tests. The rest just remain documented in GitHub (with their labels, also documented), but nowhere in the standard documentation. Having all of the rest in the standard just seems like noise to me. Put in the standard that which is mature and useful for a given purpose and leave the rest as a solid basis for future work if demand arises.
I 100% agree with @tucotuco. TG3 was a proof of concept and was never meant to be a comprehensive set of tests. Like @tucotuco: "This is how I have thought about the BDQ work since the beginning, and ever more so now that the tribulations of coupled tests and use cases are creating a seemingly insurmountable obstacle." I also have, until recently, seen the test types as "somewhat arbitrary and controversial distinctions of CORE, SUPPLEMENTARY and DO NOT IMPLEMENT. The ones we finalize for the use case would become part of the standard set of tests." Originally I treated these as basically: 1) a good test that we can implement practically and that will be useful (CORE); 2) a test that, for the reasons stated in the definition, is not ready to be a CORE test and that we only include in the GitHub so as not to waste the work we've already done - not for the Standard, but it may be mentioned (as a group, not individually) in the document for documentation purposes (SUPPLEMENTARY); 3) tests that are close to being CORE but need more work because something is missing - e.g. a suitable Vocabulary (Immature/Incomplete) - note a couple may become CORE before we release the Standard if suitable Vocabularies become available; and 4) tests that for some reason we believe should not be implemented, as doing so could lead to ambiguous or misleading results (DO NOT IMPLEMENT).
I know that I, and others, are at the limit of our capacity to continue with this work and want to see it finished - so let's keep it simple. I support the suggestions of @tucotuco and would vote to continue in that direction - not getting bogged down on things that will not go in the Standard, including both non-CORE tests and detailed Use Cases for each test. We all know Use Cases exist for the tests, but to fully document them all now would take at least another two years of work (103 tests @ conservatively 2 days' work per test = 206 days of work!).
The problem is that we cannot describe tests within the framework without attaching them to UseCases. Fitness for purpose is fundamental to the framework, and all tests within the framework descend from a UseCase. As long as CORE matched up with the broad use case that came out of TG3, research uses of data describing what organisms occurred where, when, we could ignore this, as all of the tests we were working on hung off of that use case. The moment we expand to consider tests outside of that scope, we are forced to define additional use cases. For supplementary tests, we can probably get away with something very skeletal that later users of those tests would replace with more clearly specified use cases. But we can't get away with this for the set of tests now in CORE.
To use the framework, we cannot decouple the tests from use cases. We must provide at least one very clear example of how a set of tests links to a use case. This will be central to the normative RDF representation of the tests.
We can think of tests as a grab bag, and the framework enables people to assemble tests from that grab bag to fit a use case. But central to the standard must be a demonstration of how to do that. We had that with CORE when it precisely overlapped the broad use case. We don't right now.
@tucotuco is probably pointing us in the right direction: a grab bag of tests, parallel to Darwin Core terms, that can be assembled as needed by users for their use cases. But for this to work, and for us to be able to use the framework, we have to clearly specify at least one use case and link a set of tests to that use case to show how this is done for both quality assurance and quality control.
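A non-normative sketch of what such a link might look like when serialized as RDF (Turtle). The property name bdqffdq:hasTest, the subject IRI, and the namespace URLs below are illustrative assumptions, not actual ffdq ontology terms.

```python
# Non-normative sketch: serializing one UseCase linked to a set of tests
# as Turtle. bdqffdq:hasTest, the IRIs, and prefixes are assumptions for
# illustration only, not actual ffdq terms.
def use_case_to_turtle(label: str, tests: list[str]) -> str:
    lines = [
        "@prefix bdq: <https://rs.tdwg.org/bdq/terms/> .",
        "@prefix bdqffdq: <https://rs.tdwg.org/bdqffdq/terms/> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        "bdq:UseCaseExample a bdqffdq:UseCase ;",  # hypothetical subject IRI
        f'    rdfs:label "{label}" ;',
    ]
    # one hasTest triple per test; the final one closes the statement
    lines += [f"    bdqffdq:hasTest {t} ;" for t in tests[:-1]]
    lines.append(f"    bdqffdq:hasTest {tests[-1]} .")
    return "\n".join(lines)

print(use_case_to_turtle(
    "Research uses of what organisms occurred where, when",
    ["bdq:VALIDATION_EVENTDATE_STANDARD",
     "bdq:VALIDATION_COUNTRYCODE_STANDARD"],
))
```

However the actual ontology names it, the shape is the point: the use case is the subject, and the tests hang off it, not the other way around.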
It's interesting to see everyone's perspectives on this; I appreciate this discussion, thank you! I don't know enough about what TG3 did to comment on the use cases, but I would like to share my perspective from when I was mapping the checks from OBIS to the tests here:
Concept 2: CORE: The set of mature tests that TG2 is putting forward as part of the standard. This is the bit meant by "Darwin Core terms that are widely applicable, informative, and straightforward to implement".
The use case that we did in the OBIS data quality project team:
Another thing that I think MAY be helpful is to clarify what CORE is NOT. I used to think that CORE tests were the minimal set of tests needed to evaluate the fitness for use of a record regardless of the use case (basically the minimal set of tests that overlaps any biodiversity use case), but I believe that is not the case (please correct me if I am wrong). For example, the newly added tests for pathway #277, #278
I don't know if these are helpful; please ignore them if not. Thank you all SO MUCH for your hard work. I know time does not come cheap - I am so thankful to have the opportunity to work with you all!
Thanks @ymgan - great points and very helpful. One thing that springs to mind from your comments is that we can't document all use cases - if we followed the suggestions of @chicoreus, we would be making a random selection of a use case that certainly would not cover all cases. We currently have Examples that imply a use case, and we cite where each test originated (ALA, VertNet, etc.).
@chicoreus - as said before, TG3 was never meant to be comprehensive, but an exemplar or proof of concept. I attended all the early meetings of TG3 in setting it up, and most of the meetings and discussions. It was a proof of concept, looking at how Use Cases could be developed, and from that came the use of User Stories. Part way through, it was decided to link to the Framework, and several were tested in conjunction with @allankv. TG3 was not comprehensive and was never intended to be, and the majority of the TG2 tests were never covered by TG3. TG2 from the start was looking at a good set of tests, based on DwC, that would be "Fundamental tests of biodiversity data represented in Darwin Core terms that are widely applicable, informative, and straightforward to implement." We looked at what had been done by ALA, GBIF, iDigBio, CRIA, BISON, VertNet and others. There was never an idea of linking it directly to the Use Cases that came out of TG3. We had most of the TG2 tests prepared long before TG3 started to get any results.
For now we should just accept CORE as: "The set of mature tests that TG2 is putting forward as part of the standard. This is the bit meant by 'Darwin Core terms that are widely applicable, informative, and straightforward to implement'."
Perhaps, in the Document, we can have a section on adding future tests, with a workflow: document a Use Case, determine whether it is "widely applicable, informative, and straightforward to implement", then follow the existing template, develop tests for implementation, then test the implementation, etc.
Just back, briefly. I fully agree with @tucotuco and @ArthurChapman regarding the circumscription of TG2 by TG3: our tests are not bound by TG3 use cases. Our definition of CORE has basically been, as all have stated, "Tests that are widely applicable, informative, and straightforward to implement", with one exception: tests that we believe are 'aspirational' in encouraging a better best current practice (e.g., annotations).
I (strongly) believe that it is also informative that we define what is not CORE (out of scope of the standard) as it helps to clarify what is CORE, and document the environment to inform future uses. Thanks @ymgan for your comments. Our 'Supplementary', 'Immature/Incomplete' and 'Do not implement' are useful and are now adequately documented.
Like Arthur (as he well knows), I am also close to burnout on this work. We need to 'cut to the chase': Fill in gaps within the current CORE tests (e.g., test data - which I will do, and implementations) and get the standard document prepared.
Altered definitions of bdqtag: terms CORE, Supplementary, Immature/Incomplete, and DO NOT IMPLEMENT following recent discussions via email.
Added 5 new bdqffdq:UseCase terms for
Most of the bdq:Response contexts aren't correct. Here is a set of corrections to be applied:
namespace:Term | Context | Context Should Be |
---|---|---|
bdq:ASSUMEDDEFAULT | bdq:Response | bdqTestField:Term-Actions |
bdq:CONVERTED | bdq:Response | bdqTestField:Term-Actions |
bdq:ExpectedResponse | bdq:Response | bdq:Specification |
bdq:FOUND | bdq:Response | bdqTestField:Term-Actions |
bdq:OUTOFRANGE | bdq:Response | bdqTestField:Term-Actions |
bdq:PRECISIONINSECONDS | bdq:Response | bdqTestField:Term-Actions |
bdq:PREREQUISITESNOTMET | bdq:Response | bdqTestField:Term-Actions |
bdq:PROPOSED | bdq:Response | bdqTestField:Term-Actions |
bdq:Response | bdq:Response | bdq:Response |
bdq:Response.comment | bdq:Response | bdq:Response |
bdq:Response.qualifier | bdq:Response | bdq:Response |
bdq:Response.result | bdq:Response | bdq:Response |
bdq:Response.status | bdq:Response | bdq:Response |
bdq:STANDARD | bdq:Response | bdqTestField:Term-Actions |
bdq:STANDARDIZED | bdq:Response | bdqTestField:Term-Actions |
bdq:TERRESTRIALMARINE | bdq:Response | bdqTestField:Term-Actions |
bdq:TRANSPOSED | bdq:Response | bdqTestField:Term-Actions |
bdq:CONSISTENT | bdq:Response.result | bdqTestField:Term-Actions |
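For what it's worth, corrections like these can be applied in bulk as a term-to-context mapping over the vocabulary rows. A minimal sketch (only three mapping entries shown; row data is illustrative):

```python
# Sketch: applying the Context corrections above as a bulk update over
# (term, context) vocabulary rows. Mapping abbreviated for illustration.
CONTEXT_CORRECTIONS = {
    "bdq:ASSUMEDDEFAULT": "bdqTestField:Term-Actions",
    "bdq:ExpectedResponse": "bdq:Specification",
    "bdq:CONSISTENT": "bdqTestField:Term-Actions",
}

def apply_corrections(rows):
    """Correct the Context column where a correction exists; pass through otherwise."""
    return [(term, CONTEXT_CORRECTIONS.get(term, ctx)) for term, ctx in rows]

rows = [
    ("bdq:ASSUMEDDEFAULT", "bdq:Response"),
    ("bdq:Response.result", "bdq:Response"),  # already correct, unchanged
]
print(apply_corrections(rows))
```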
I've made an export of the vocabulary markdown table into https://github.com/tdwg/bdq/blob/master/tg2/vocabularies/combined_vocabulary.csv to give us something more easily sorted to look for inconsistencies and problems, and to start setting up to add vocabulary terms into various markdown documents.
As a demonstration of linking in vocabulary terms, I've added the definitions of the use cases to the index by use case section of: https://github.com/tdwg/bdq/blob/master/tg2/core/generation/docs/core_tests.md
For the time being, the markdown table in this issue remains the authoritative copy for editing, and we expect to overwrite the csv export.
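A minimal sketch of the kind of markdown-table-to-CSV export described above, assuming simple GitHub-flavoured tables with no escaped pipes inside cells (the real export script may differ):

```python
# Sketch: export a simple GFM markdown table to CSV, in the spirit of
# the combined_vocabulary.csv export. Assumes no escaped pipes in cells.
import csv
import io

def markdown_table_to_csv(md: str) -> str:
    out = io.StringIO()
    writer = csv.writer(out)
    for line in md.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # skip the |---|---| delimiter row
        if cells and all(c and set(c) <= {"-", ":"} for c in cells):
            continue
        writer.writerow(cells)
    return out.getvalue()

md = """
| Term | Context |
|---|---|
| bdq:STANDARD | bdqTestField:Term-Actions |
"""
print(markdown_table_to_csv(md))
```

The result sorts easily in a spreadsheet, which is the point of the CSV copy; edits still belong in the markdown original.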
Following advice from @chicoreus, "Context" changed to bdqTestField:Term-Actions for the following terms
bdq:ASSUMEDDEFAULT bdq:CONVERTED bdq:FOUND bdq:OUTOFRANGE bdq:PRECISIONINSECONDS bdq:PREREQUISITESNOTMET bdq:PROPOSED bdq:STANDARD bdq:STANDARDIZED bdq:TERRESTRIALMARINE bdq:TRANSPOSED bdq:CONSISTENT
and Context for bdq:ExpectedResponse changed to bdq:Specification
Added new term
| bdq:AllValidationTestsRunOnSingleRecord | AllValidationTestsRunOnSingleRecord | A list of Core Validation Tests that have been run on a Single Record. | bdqffdq:InformationElements | Used in Measure of Single Record Tests |
Added new term
| bdq:AllAmendmentTestsRunOnSingleRecord | AllAmendmentTestsRunOnSingleRecord | A list of Amendments that have been run on a Single Record. | bdqffdq:InformationElements | Used in Measure of Single Record Tests |
Added new term
| bdq:assumptionOnUnknownHabitat | assumptionOnUnknownHabitat | The habitat (Marine/NonMarine) to assume, or NoAssumption, when a bdq:taxonomyIsMarine source authority is unable to assert the marine or non-marine status of a taxon. | bdq:Parameter | See VALIDATION_COORDINATES_TERRESTRIALMARINE (b9c184ce-a859-410c-9d12-71a338200380). |
Changed bdq:assumptionOnUnknownHabitat to bdq:assumptionOnUnknownBiome
I saw
Is NOT_REPORTED being used by any MEASURE? If so, it is missing from the vocab
Yes @ymgan Currently only in one test #31. Also one other test that is DO NOT IMPLEMENT (#35)
Thanks @ArthurChapman ! Then I guess we need a bdq:NOT_REPORTED, it is not in the table above
We are just taking that term out of the test (#31), because it does not make sense. So that can be deleted from the document.
Turns out we don't. It is a path that can't be reached in that test, and it isn't a framework response.status value.
The Vocabulary terms in this file have been split into other files - a bdqdim vocabulary, a bdqffdq vocabulary, a bdq:directory, and a glossary and these files are being generated as csv files and markdown for the final BDQ Core Standard. They are currently generated and are in the _review folder. As such this file is no longer maintained.
Terms in the bdqffdq namespace are from the Fitness for Use Framework (Veiga et al. 2017). Use the reference to the Framework Definitions for more details and examples. The use of a vocabulary term in a test specification without a namespace prefix (sometimes represented in all UPPER CASE) implies that the bdq: or bdqffdq: namespace is applicable. Note that wherever "DQ" is used in a definition it implies "Data Quality", and wherever "FFU Framework" is used it refers to the "Fitness for Use Framework" (Veiga et al. 2017).
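The prefix convention described above could be sketched as a small resolver. The rule for choosing between bdq: and bdqffdq: is my own assumption here, and the set of framework class names is illustrative, not exhaustive:

```python
# Sketch (assumed heuristic): qualify a bare vocabulary term with its
# implied namespace. The FFDQ class list is illustrative, not exhaustive.
FFDQ_TERMS = {"Validation", "Amendment", "Measure", "Issue", "UseCase"}

def qualify(term: str) -> str:
    if ":" in term:            # already prefixed, leave as-is
        return term
    if term in FFDQ_TERMS:     # framework class -> bdqffdq:
        return "bdqffdq:" + term
    return "bdq:" + term       # default namespace for bare terms

print(qualify("COMPLETE"))      # -> bdq:COMPLETE
print(qualify("UseCase"))       # -> bdqffdq:UseCase
print(qualify("bdq:STANDARD"))  # -> bdq:STANDARD
```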
Note: There are two tables in this issue, the first is for vocabulary for the standard, the second is for additional terms for supplement files that will go into tables in those documents rather than controlled vocabularies.
Do not edit, moved to csv files
Pending further splits, this vocabulary moved to https://github.com/tdwg/bdq/blob/master/tg2/vocabularies/combined_vocabulary.csv
Supplement: GitHub Label Terms. These are terms that are outside the Standard but that have been used as either GitHub Labels or TestFields in the BDQ GitHub.
Do not edit, moved to csv files
Pending further moves, this vocabulary moved to https://github.com/tdwg/bdq/blob/master/tg2/vocabularies/glossary_terms.csv