Closed by frehburg 1 year ago
P.S.: We ran validation using your phenopacket-tools; you can do the same in our project by calling validate on the command line. We are aware of the following errors and will tend to them.
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - BaseValidator required 'phenotypicFeatures[0].type.label' is missing but it is required
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - BaseValidator required 'interpretations[0].id' is missing but it is required
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - BaseValidator required 'interpretations[0].diagnosis.disease' is missing but it is required
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - MetaDataValidator Ontology Not In MetaData No ontology corresponding to ID 'NCBITaxon:9606' found in MetaData
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - MetaDataValidator Ontology Not In MetaData No ontology corresponding to ID 'GENO:0000135' found in MetaData
2023-09-22 15:01:31 | ERROR | ERKER2Phenopackets.src.utils.PhenopacketValidation:_validate_phenopacket:92 - MetaDataValidator Ontology Not In MetaData No ontology corresponding to ID 'ORPHA:71529' found in MetaData
Hi @frehburg, the phenopackets in ERKER2Phenopackets/data/out/phenopackets/2023-09-22-1343 indeed have the errors that you mention above.
Regarding the missing resource for certain ontologies (last 3 errors) - it should be relatively straightforward to address the error by inserting the resources into the resources list.
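A minimal sketch of how one might spot the missing resources before running the validator, assuming the phenopacket is available as a JSON-style dict (for example after converting the protobuf message with MessageToDict). The helper names used_prefixes and missing_prefixes are hypothetical, and treating every CURIE-like id as an ontology term is a heuristic, not how phenopacket-tools itself performs the check.

```python
import re

# Heuristic: an ontology term ID looks like PREFIX:LOCAL_ID, e.g. 'NCBITaxon:9606'.
CURIE = re.compile(r'^([A-Za-z][\w.]*):\S+$')


def used_prefixes(node, found=None):
    """Recursively collect the prefixes of all CURIE-like 'id' values."""
    found = set() if found is None else found
    if isinstance(node, dict):
        value = node.get('id')
        if isinstance(value, str):
            match = CURIE.match(value)
            if match:
                found.add(match.group(1))
        for key, child in node.items():
            # Skip metaData: its resource ids are declarations, not usages.
            if key != 'metaData':
                used_prefixes(child, found)
    elif isinstance(node, list):
        for child in node:
            used_prefixes(child, found)
    return found


def missing_prefixes(phenopacket):
    """Return prefixes that are used but have no entry in metaData.resources."""
    resources = phenopacket.get('metaData', {}).get('resources', [])
    declared = {resource.get('namespacePrefix') for resource in resources}
    return sorted(used_prefixes(phenopacket) - declared)


# Illustrative phenopacket fragment (made-up identifiers for subject and packet).
packet = {
    'id': 'example-1',
    'subject': {'id': 'patient-1', 'taxonomy': {'id': 'NCBITaxon:9606'}},
    'phenotypicFeatures': [{'type': {'id': 'HP:0001250', 'label': 'Seizure'}}],
    'metaData': {'resources': [{'id': 'hp', 'namespacePrefix': 'HP'}]},
}
print(missing_prefixes(packet))  # ['NCBITaxon']
```

Each reported prefix then needs a matching entry in the resources list, filled in with the version and URLs of the ontology release that was actually used.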
Regarding the missing phenotypic feature label - it is unclear to me where you're getting the term IDs from. From what I can follow in the code, the label is coming from row[label_col] and can be None. However, as you know, the label is a required field of OntologyClass, so you have to get it. One way to get the label is to query JAX's ontology API at ontology.jax.org. For instance, using the /api/hp/terms/{id} endpoint, you can get the following JSON:
curl -X 'GET' \
'https://ontology.jax.org/api/hp/terms/HP%3A0001250' \
-H 'accept: application/json'
returns
{
  "id": "HP:0001250",
  "name": "Seizure",
  "definition": "A seizure is an intermittent abnormality of nervous system physiology characterised by a transient occurrence of signs and/or symptoms due to abnormal excessive or synchronous neuronal activity in the brain.",
  "comment": "A type of electrographic seizure has been proposed in neonates which does not have a clinical correlate, it is electrographic only. The term epilepsy is not used to describe recurrent febrile seizures. Epilepsy presumably reflects an abnormally reduced seizure threshold.",
  "synonyms": [
    "Epileptic seizure",
    "Seizures",
    "Epilepsy"
  ],
  ... even more content in here ...
}
where name is the term's label.
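The same request can be made from Python. This is a rough sketch using only the standard library; the function names term_url, label_from_payload and fetch_label are hypothetical, and the only assumption about the response is that it carries the label in the name field, as in the JSON above.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = 'https://ontology.jax.org/api/hp/terms/'


def term_url(term_id):
    """Build the endpoint URL; the colon in the term ID is percent-encoded."""
    return BASE_URL + quote(term_id, safe='')


def label_from_payload(payload):
    """Extract the label from a decoded response; returns None if absent."""
    return payload.get('name')


def fetch_label(term_id, timeout=10.0):
    """Query the API for one term and return its label (needs network access)."""
    request = Request(term_url(term_id), headers={'accept': 'application/json'})
    with urlopen(request, timeout=timeout) as response:
        return label_from_payload(json.load(response))


# Offline demonstration using the fields shown in the response above.
print(term_url('HP:0001250'))
sample = {'id': 'HP:0001250', 'name': 'Seizure'}
print(label_from_payload(sample))  # Seizure
```

Calling fetch_label('HP:0001250') performs the live request; for many terms it is probably worth caching the results rather than querying the API once per row.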
Alternatively, you can use hpo-toolkit to look up labels locally with the get_term method. The ontology is loaded once, so no API call is needed per term:
import hpotk

# Choose an HPO version and stick to it throughout the analysis
hpo_url = 'https://github.com/obophenotype/human-phenotype-ontology/releases/download/v2023-09-01/hp.json'
hpo = hpotk.load_minimal_ontology(hpo_url)
print(f'Loaded HPO v{hpo.version}')

# `get_term` returns `None` if the ontology has no term with the given ID
term = hpo.get_term('HP:0001250')
if term is not None:
    # There is indeed a term for `HP:0001250`, so we can access its `name` property.
    print(term.name)  # prints 'Seizure'
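Since row[label_col] can be None, one option is to fall back to an ontology lookup only when the label is missing. The sketch below is an assumption about how that could be wired up, not the project's code: resolve_label is a hypothetical helper, and the lookup argument is any callable that maps a term ID to a label or None.

```python
def resolve_label(term_id, label, lookup):
    """Return the given label, or look it up by term ID when it is missing."""
    if label is not None and str(label).strip():
        return str(label).strip()
    found = lookup(term_id)
    if found is None:
        # Fail loudly instead of writing an OntologyClass without a label.
        raise ValueError(f'No label found for {term_id}')
    return found


# With hpo-toolkit, the lookup could wrap hpo.get_term (hpo loaded as shown above):
#   def hpo_lookup(term_id):
#       term = hpo.get_term(term_id)
#       return None if term is None else term.name

# Stand-in lookup table for demonstration purposes only.
known = {'HP:0001250': 'Seizure'}
print(resolve_label('HP:0001250', None, known.get))        # Seizure
print(resolve_label('HP:0001250', 'Seizure ', known.get))  # Seizure
```

Raising on an unknown ID surfaces the problem during mapping rather than later in validation.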
Regarding the other issues, I think you should be able to put some meaningful values there. However, please let me know if you run into any trouble.
Dear @ielis, thank you for your detailed reply. I have been at a conference all week, which is why things have been moving slowly. I will look into your points on Monday. They sound fixable.
Thank you!
Filip
Dear @ielis,
Could you please check the new structure of our phenopackets? We now put the disease as an OntologyClass into Diagnosis instead of using Phenopacket > Disease.
You can find the code here:
ERKER2Phenopackets/src/MC4R/MapMC4R.py
If there is something else we should change, please leave us a comment here.
Cheers,
Filip and Adam