phenopackets / phenopacket-tools

An app and library for building, conversion, and validation of GA4GH Phenopackets.
http://phenopackets.org/phenopacket-tools/stable/
GNU General Public License v3.0

Ontology validator #73

Closed (pnrobinson closed 1 year ago)

pnrobinson commented 2 years ago

Somehow, some classes that performed ontology-based validation got refactored. These classes would do things that are not easy to do with JSON Schema, such as

ielis commented 1 year ago

The HpoPhenotypeValidators.Primary.phenopacketHpoPhenotypeValidator(hpo) and HpoPhenotypeValidators.Ancestry.phenopacketHpoAncestryValidator(hpo) do almost all of that.

HpoPhenotypeValidators.Primary.phenopacketHpoPhenotypeValidator(hpo)

HpoPhenotypeValidators.Ancestry.phenopacketHpoAncestryValidator(hpo)

On top of the above checks, the Ancestry validator checks if a negated term is used together with its negated parent, and suggests using the least specific term.

With regard to the ancestry checks, a term and its parent may be used together only if the parent term is present and the child term is negated (e.g. Abnormality of finger but NOT Arachnodactyly).
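The ancestry rule described here can be illustrated with a minimal, self-contained sketch. It does not use the phenopacket-tools or phenol APIs: the class `AncestryRuleSketch`, the `PARENT` map, and the method names are all hypothetical, and the two-level hierarchy is a toy stand-in rather than the real HPO structure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class AncestryRuleSketch {
    // Toy child -> parent map standing in for the HPO (assumption: not the real hierarchy).
    static final Map<String, String> PARENT = Map.of(
            "Arachnodactyly", "Abnormality of finger",
            "Abnormality of finger", "Phenotypic abnormality");

    /** Returns true if 'ancestor' lies on the parent chain of 'term'. */
    static boolean isAncestor(String ancestor, String term) {
        for (String p = PARENT.get(term); p != null; p = PARENT.get(p)) {
            if (p.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    /**
     * 'features' maps a term label to its excluded (negated) flag.
     * A term may appear together with one of its ancestors only if the
     * ancestor is present (not excluded) and the term itself is excluded.
     */
    static List<String> violations(Map<String, Boolean> features) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Boolean> child : features.entrySet()) {
            for (Map.Entry<String, Boolean> anc : features.entrySet()) {
                if (!isAncestor(anc.getKey(), child.getKey())) {
                    continue;
                }
                boolean allowed = !anc.getValue() && child.getValue();
                if (!allowed) {
                    out.add(child.getKey() + " is used together with its ancestor " + anc.getKey());
                }
            }
        }
        return out;
    }
}
```

Under these assumptions, "Abnormality of finger" (present) with "Arachnodactyly" (excluded) yields no violations, while any other combination of the two terms is reported.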


The last item of the list above is not yet implemented:

Perhaps in the next minor release?

pnrobinson commented 1 year ago

@ielis This seems pretty easy

PhenopacketHpoCoverageValidator(Ontology hpo, Set<TermId> requiredTopLevelTerms) {
}

validate() {
    // create Map<TermId, Boolean> covered, initialized to false for each required top-level term
    for (PhenotypicFeature feature : component.getPhenotypicFeaturesList()) {
        TermId tid = feature.getTermId();
        for (TermId topLevelTid : covered.keySet()) {
            if (OntologyAlgorithm.isAncestor(topLevelTid, tid)) {
                covered.put(topLevelTid, true);
            }
        }
    }
    // check if any value in covered is false and, if so, emit a warning
}

Should we just add a class like this as another example?
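For illustration, the coverage idea above can be written as a small self-contained sketch. It avoids the ontology library entirely: the class `CoverageSketch`, the `PARENT` map, and both method names are hypothetical, the hierarchy is a simplified toy rather than the real HPO, and the excluded status of a feature is ignored, as in the outline above.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CoverageSketch {
    // Toy child -> parent map (assumption: simplified, not the real HPO hierarchy).
    static final Map<String, String> PARENT = Map.of(
            "Arachnodactyly", "Abnormality of the skeletal system",
            "Mitral valve prolapse", "Abnormality of the cardiovascular system");

    /** Returns true if 'term' equals 'top' or has 'top' on its parent chain. */
    static boolean isSelfOrDescendant(String term, String top) {
        for (String t = term; t != null; t = PARENT.get(t)) {
            if (t.equals(top)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the required top-level terms that no feature falls under. */
    static Set<String> uncovered(Collection<String> features, Set<String> requiredTopLevel) {
        Set<String> missing = new TreeSet<>(requiredTopLevel);
        for (String feature : features) {
            missing.removeIf(top -> isSelfOrDescendant(feature, top));
        }
        return missing;
    }
}
```

A validator built this way would emit one warning per entry in the returned set; an empty set means every required top-level term is covered by at least one feature.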

ielis commented 1 year ago

@pnrobinson I wrote the validator; it is at #117. Can you please check if this is what you had in mind?

ielis commented 1 year ago

Done