Closed zxenia closed 4 years ago
@zxenia
too many commits to review each individual one.
this is a possibly a bug: https://github.com/IHEC/ihec-ecosystems/blob/validate_ontology/IHEC_Data_Hub/ontology_lookup.py#L54-L61
i don't see what stops current_ontology == rule_ontology
from being None == None
? this will evaluate to true and not do any validation.
https://github.com/IHEC/ihec-ecosystems/blob/validate_ontology/IHEC_Data_Hub/ontology_lookup.py#L41
can you change it so it does not return a partially filled hash?
can you separate the logging from functionality for each function:
def log(self, x):
logging.getLogger()....
then self.log(msg)
, so the api user can control it.
i would have written this as a service instead of per term, but i can wrap it as one, so your call.
Hi @sitag ,
rule_ontology
can't be None
.
The method relies on the current metadata spec:
it takes ontology_type
which is a key in schema (e.g. 'sample_ontology_curie'
) , then the values
of 'ontology_type'
are set in ontology_rules
- those are taken from the metadata spec.
can you change it so it does not return a partially filled hash?
Could you give me an example here, I can't reproduce it.
Currently, it is expected that the curie has been validated by json schema - we have the regex
pattern check for curie.
So it takes a valid curie in the form of e.g. uberon:0013540
, splits it and returns dict
{'ontology_name': 'uberon', 'curie': '0013540'}
@zxenia But you never check that parameter 'ontology_type' to the function is actually in the hash; if I use this as a library and pass something wrong in there then that error should not quietly pass. It would make it very hard to debug. You need to raise an error if it's not in the hash, instead of quietly using None. In general, KeyError
s are meaningful, and using .get
hides them.
Hi @sitag, I added a logger method for OntologyLookup, you can mute the logs and pass your error messages.
All logs from ontology_lookup and validateOntologies() can be muted by setting:
logging.getLogger('ontology_lookup').disabled = True
logging.getLogger('validateOntologies').disabled = True
or setting the logs level
logging.getLogger('ontology_lookup').setLevel(logging.ERROR)
@zxenia looks okay to me
This PR addresses the issue of ontology validation.
The goals are:
to validate that only six accepted ontologies used
to check that all used ontology terms are valid terms
Five out of six accepted ontologies are validated via Ontology Lookup API service
One - NCI Metathesaurus - doesn't provide an API service. Therefore there is no validation of NCI terms for now.
Ontology validation takes a CURIE as an input, despite the schema has not been updated yet (PR #37 is open). It doesn't validate CURIE format, this I think should be done on schema spec level (e.g. a regex pattern).
Note: Current branch is a feature branch from validator_reports