Closed pnrobinson closed 5 years ago
This is working now. We need to extend the tests to cover more MoIs. Also, we cannot currently map semidominant inheritance.
Do you have a commit for this? I was looking at this file as part of the Exomiser ingest and it threw exceptions and all sorts when trying to parse using Jackson due to empty fields not being terminated in the expected manner.
Funny you should ask: https://github.com/monarch-initiative/phenol/pull/214 I wrote a hand-crafted SAX parser for this file which seems to work. There are a few quirks that require some filtering... @iimpulse I will add the inheritance fields to phenotype.hpoa by extending hpoannotqc (the app that makes this file), so that the HPO website will display the Orphanet inheritance annotations the next time we update phenotype.hpoa (I will need another few days to update that code, so we could do this at the next HPO web release).
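For readers who have not used SAX, here is a minimal sketch of the general approach mentioned above. It is not the parser from the linked pull request: the class name, the element names (`Disorder`, `OrphaNumber`, `TypeOfInheritance`, `Name`) and the sample XML are assumptions made for illustration, and the real Orphanet file may be laid out differently. The "filtering" shown here is just skipping empty names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class OrphaInheritanceSaxSketch {

    // Made-up sample; element names and ORPHA numbers are assumptions, not real data.
    static final String SAMPLE =
        "<JDBOR><DisorderList>"
        + "<Disorder><OrphaNumber>1</OrphaNumber><Name lang=\"en\">Disease A</Name>"
        + "<TypeOfInheritanceList>"
        + "<TypeOfInheritance><Name lang=\"en\">Autosomal dominant</Name></TypeOfInheritance>"
        + "<TypeOfInheritance><Name lang=\"en\"></Name></TypeOfInheritance>"
        + "</TypeOfInheritanceList></Disorder>"
        + "<Disorder><OrphaNumber>2</OrphaNumber><Name lang=\"en\">Disease B</Name>"
        + "<TypeOfInheritanceList>"
        + "<TypeOfInheritance><Name lang=\"en\">Autosomal recessive</Name></TypeOfInheritance>"
        + "</TypeOfInheritanceList></Disorder>"
        + "</DisorderList></JDBOR>";

    /** Collects inheritance labels per disorder while the parser streams through the file. */
    static class Handler extends DefaultHandler {
        final Map<String, List<String>> inheritanceByDisorder = new LinkedHashMap<>();
        private final StringBuilder text = new StringBuilder();
        private String currentId;       // e.g. "ORPHA:1", set by the first OrphaNumber in a Disorder
        private boolean inInheritance;  // true while inside a TypeOfInheritance element

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            if (qName.equals("Disorder")) {
                currentId = null;
            } else if (qName.equals("TypeOfInheritance")) {
                inInheritance = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            if (qName.equals("OrphaNumber") && currentId == null) {
                currentId = "ORPHA:" + value;
                inheritanceByDisorder.put(currentId, new ArrayList<>());
            } else if (qName.equals("Name") && inInheritance
                    && currentId != null && !value.isEmpty()) {
                // Skip empty names instead of failing on them.
                inheritanceByDisorder.get(currentId).add(value);
            } else if (qName.equals("TypeOfInheritance")) {
                inInheritance = false;
            }
            text.setLength(0);
        }
    }

    static Map<String, List<String>> parse(String xml) throws Exception {
        Handler handler = new Handler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.inheritanceByDisorder;
    }

    public static void main(String[] args) throws Exception {
        parse(SAMPLE).forEach((id, modes) -> System.out.println(id + " -> " + modes));
    }
}
```

Because the handler only reacts to the elements it knows about, empty or unexpected fields are simply ignored, which is the main reason to prefer this over a strict data-binding parser for a quirky file.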
Ahh - I ended up coming to the same conclusion and started with a SAX parser solution too. Never managed to finish though, so this could be a handy thing to use and save code duplication.
Ah no! Even better! If you're adding this to the phenotype.hpoa, will this end up in the phenotype_annotation.tab file from here: https://hpo.jax.org/app/download/annotation? If so I'll just wait and the data will magically appear, which will make me very happy.
Expect to see the orphanet inheritance data in phenotype_annotation.tab within the next 1-2 weeks -- I need to update hpoannotqc to use this and make sure that we only export annotations where we have phenotype data.
Fantastic! I owe you that beer you owe me.
Actually Peter, do you have any information on how we can address the Orphanet inheritance annotations for specific disease-gene associations?
I have refactored and simplified the interface. A first test now shows:
robinp@ldg-jgm004:~/IdeaProjects/hpoannotqc$ grep ^ORPHA phenotype.hpoa | grep -c 'HP:0000006'
989 # autosomal dominant
$ grep ^ORPHA phenotype.hpoa | grep -c 'HP:0000007'
1265 # autosomal recessive
$ grep ^ORPHA phenotype.hpoa | grep -c 'HP:0001423'
62 # X dominant
$ grep ^ORPHA phenotype.hpoa | grep -c 'HP:0001419'
240 # X recessive
$ grep -c ^ORPHA phenotype.hpoa
54568
$ grep -c ^DECIPHER phenotype.hpoa
297
$ grep -c ^OMIM phenotype.hpoa
103616
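The same tallies could also be produced from Java, for instance inside a unit test. The following is only a sketch of the grep pipeline above, not code from hpoannotqc: the class name and the sample rows are hypothetical, and the rows do not reproduce the real phenotype.hpoa column layout.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class HpoaCountSketch {

    /** Equivalent of: grep ^PREFIX file | grep -c TERM */
    static long count(List<String> lines, String prefix, String term) {
        return lines.stream()
            .filter(line -> line.startsWith(prefix))
            .filter(line -> line.contains(term))
            .count();
    }

    /** Equivalent of: grep -c ^PREFIX file */
    static long count(List<String> lines, String prefix) {
        return lines.stream().filter(line -> line.startsWith(prefix)).count();
    }

    // Made-up rows for illustration only; not real annotations or the real column order.
    static final List<String> SAMPLE = Arrays.asList(
        "ORPHA:1\tDisease A\tHP:0000006",
        "ORPHA:2\tDisease B\tHP:0000007",
        "ORPHA:2\tDisease B\tHP:0001250",
        "OMIM:100000\tDisease C\tHP:0000006");

    public static void main(String[] args) throws Exception {
        // Pass the path to phenotype.hpoa as the first argument, or fall back to the sample rows.
        List<String> lines = args.length > 0
            ? Files.readAllLines(Paths.get(args[0]))
            : SAMPLE;
        System.out.println("ORPHA autosomal dominant:  " + count(lines, "ORPHA", "HP:0000006"));
        System.out.println("ORPHA autosomal recessive: " + count(lines, "ORPHA", "HP:0000007"));
        System.out.println("ORPHA lines in total:      " + count(lines, "ORPHA"));
    }
}
```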
The analysis was done via hpoannotqc, but the code would be something like this:
PhenotypeDotHpoaFileWriter writer = PhenotypeDotHpoaFileWriter.factory(Ontology ont,
        String smallFileDirectoryPath,
        String orphaPhenotypeXMLpath,
        String orphaInheritanceXMLpath,
        String outpath) ....
writer.write();
@julesjacobsen @iimpulse after the new release of phenol I will upload an updated version of phenotype.hpoa to the Jenkins server that will have the Orphanet inheritance annots.
This is now integrated and working
cf: http://www.orphadata.org/data/xml/en_product9_ages.xml