Open gaurav opened 6 years ago
- Mark phyloreference as complete [state: Unvalidated]
- Start reasoning to test resolution. State will either remain unvalidated or change to Validated.
Isn't there a difference between a phyloreference that hasn't been validated yet ("unvalidated"), and one that has been but failed to validate?
Or do you anticipate that Phyloreferences cannot be in "unvalidated" state because they haven't been validated yet?
That is what I anticipated, but I think you're right: it'd be useful to highlight phyloreferences that validated incorrectly (and which we might want to treat as a "to do" that we would like to fix eventually) and those that have never been validated at all -- especially initially, when validation might be very slow. I've updated the issue!
Remember that a phyloreference could remain in "Unvalidated" even after testing, for example if the authors did not annotate where they expected a clade definition to resolve on any phylogeny. That seems unlikely, but I'm sure it'll turn up.
Note that [state: Failed Validation] is necessary to the Clade Ontology test suite -- it allows us to mark phyloreferences that we don't expect to currently resolve, either because of technical limitations (such as phyloref/clade-ontology#27) or because of some ambiguity or error in the phyloreference statement itself. In the Clade Ontology, this is being tracked as phyloref/clade-ontology#31.
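To make the states discussed so far concrete, here is a minimal sketch of how one reasoning run could move a completed phyloreference between them. Everything in it is hypothetical: the `State` enum, the `after_reasoning` function, and the idea of comparing sets of node labels are assumptions for illustration, not anything that exists in the curation tool or the Clade Ontology test suite.

```python
from enum import Enum
from typing import Optional, Set


class State(Enum):
    INCOMPLETE = "Incomplete"
    UNVALIDATED = "Unvalidated"
    VALIDATED = "Validated"
    FAILED_VALIDATION = "Failed Validation"


def after_reasoning(expected: Optional[Set[str]], resolved: Set[str]) -> State:
    """Return the state of a completed phyloreference after one reasoning run.

    expected: node labels where the authors said the definition should
              resolve, or None/empty if they never annotated this.
    resolved: node labels the reasoner actually returned.
    Comparing the two sets for equality is an assumption of this sketch.
    """
    if not expected:
        # Nothing to compare against, so testing cannot change the state.
        return State.UNVALIDATED
    if resolved == expected:
        return State.VALIDATED
    return State.FAILED_VALIDATION
```

With this shape, a phyloreference with no expected resolution stays in "Unvalidated" however often it is tested, while one that resolves somewhere unexpected is flagged as "Failed Validation" and can be treated as a "to do".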
We could use the Publication Status Ontology to give each phyloreference a publication status.
Since this uses a time-indexed value with context, we could also document the full history of a particular phyloreference changing over time, and which agent was responsible for each change in status! One potential downside is that this may be confused with the publication status of a single PHYX file or of the entire Clade Ontology, but given that it is the phyloreference that the status will be attached to (pso:withStatus), I think it'll be okay.
We could instead use the Evaluation and Report Language to describe the result of testing these phyloreferences by various means (human curation, automated testing, etc.) as passing or failing. However, I think the Publication Status Ontology is just about perfect for our needs!
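For what it's worth, the "full history with responsible agent" idea could look roughly like the sketch below. This is a plain-Python stand-in rather than RDF, and it does not use the Publication Status Ontology's actual terms: the class names `StatusInTime` and `PhylorefHistory`, their fields, and the example agents are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class StatusInTime:
    status: str      # e.g. "Unvalidated", "Validated", "Failed Validation"
    agent: str       # who or what was responsible for this change
    start: datetime  # when this status began to apply


@dataclass
class PhylorefHistory:
    label: str
    statuses: List[StatusInTime] = field(default_factory=list)

    def set_status(self, status: str, agent: str,
                   when: Optional[datetime] = None) -> None:
        """Record a status change; defaults to the current UTC time."""
        when = when or datetime.now(timezone.utc)
        self.statuses.append(StatusInTime(status, agent, when))

    def current(self) -> Optional[str]:
        """Return the most recently started status, or None if none is recorded."""
        if not self.statuses:
            return None
        return max(self.statuses, key=lambda s: s.start).status


history = PhylorefHistory("Example clade")
history.set_status("Unvalidated", "human curator",
                   datetime(2018, 1, 1, tzinfo=timezone.utc))
history.set_status("Validated", "automated test run",
                   datetime(2018, 2, 1, tzinfo=timezone.utc))
```

Because every change is kept rather than overwritten, the same record answers both "what is the status now?" and "who changed it, and when?".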
Could be relevant for someone maintaining an ontology of clade definitions, but isn't unique to phylogenetic definitions. You could imagine having different source files in different directories representing different states. So, not a priority for Klados v1.0.
For future reference, we could use the curation status specification that is part of the IAO rather than using the Publication Status Ontology.
The phyloreference curation process should probably look like this:
This is the basis of colour-coding phyloreferences: incomplete (clear?) and validated (green) phyloreferences are clear-cut, while unvalidated phyloreferences can be further subdivided into "unvalidated but all specifiers match" (yellow) and "unvalidated but some specifiers don't match" (red).
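The colour scheme above is small enough to sketch as a single lookup. The function name `phyloref_colour` and its parameters are hypothetical, and since no colour is given above for "Failed Validation", the sketch simply leaves it uncoloured, which is an assumption rather than a decision.

```python
from typing import Optional


def phyloref_colour(state: str, all_specifiers_match: bool = True) -> Optional[str]:
    """Return a highlight colour for a phyloreference, or None for clear."""
    if state == "Validated":
        return "green"
    if state == "Unvalidated":
        # Unvalidated is subdivided by whether every specifier matched.
        return "yellow" if all_specifiers_match else "red"
    # "Incomplete" (clear?) and any state not covered above: no highlight.
    return None
```

For example, `phyloref_colour("Unvalidated", all_specifiers_match=False)` gives `"red"`, and `phyloref_colour("Incomplete")` gives `None`.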
Furthermore: