matentzn closed this 2 years ago
OMIM IDs and their classification are relevant in a handful of dipper ingests other than the OMIM ingest itself.
For MONDO to cover these cases we will need to replicate the function of:
https://github.com/monarch-initiative/dipper/blob/master/dipper/sources/OMIMSource.py#L10
which was my attempt to consolidate even more of that unsatisfying smear you are noting.
I think we could have a saner separation: Mondo makes the subClassOf, equivalence, synonym, and label axioms for diseases; dipper gets gene-to-disease and any other relevant A-box stuff (OMIM variant to ClinVar variant, publications).
Great, I agree with you both. To move this forward (also as a blueprint for monochrom, for which we should do the same): is the correct way to produce a standalone Python ingest script, based on the dipper script, that I just move to the mondo/monochrom repo and that simply builds an OWL ontology from source? Should I continue to use dipper, or would it make sense to build something entirely standalone based on the dipper scripts?
https://ci.monarchinitiative.org/job/build-omim/ <- old dipper build
Running those: https://dipper.readthedocs.io/en/latest/dipper.sources.OMIM.html https://github.com/monarch-initiative/dipper/blob/master/dipper/sources/OMIM.py https://github.com/monarch-initiative/dipper/blob/master/dipper/sources/OMIMSource.py
mkdir -p mirror tmp
sh run.sh make build-omim
mondo process pulls ttl from ci.monarchinit.org
Based on this: https://docs.google.com/document/d/1pKyAZsT1ZlZxxgkNBRucvQPwsxzDF0nhGC2VaiPLj_U/edit#
OMIM Import:
- OMIM class IDs
- OMIM Phenotypic Series SubclassOf where it exists
- SubclassOf Mendelian disease when another subclassOf does not exist
- Synonyms, labels + tidying
- Xrefs
- Obsoletion - need rules where see string MOVED TO, this class should be imported as obsolete [term name] with replaced by annotation
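The import rules listed above could be sketched roughly as follows. This is only an illustration, not the actual Mondo or dipper implementation: the function name, the record fields, the `OMIMPS:` prefix, the placeholder parent ID for "Mendelian disease", and the assumption that a moved entry's title looks like `MOVED TO <MIM number>` are all hypothetical.

```python
import re
from typing import Optional

# Placeholder only: the real class ID for "Mendelian disease" is not given here.
MENDELIAN_DISEASE = "MONDO:PLACEHOLDER"

# Assumption: moved entries carry a title of the form "MOVED TO <MIM number>".
MOVED_TO = re.compile(r"MOVED TO (\d+)")


def import_omim_class(
    mim_id: str, title: str, phenotypic_series: Optional[str] = None
) -> dict:
    """Turn one OMIM entry into a simple dict describing the class to import."""
    curie = f"OMIM:{mim_id}"

    moved = MOVED_TO.search(title)
    if moved:
        # Obsoletion rule: import as "obsolete [term name]" with a replaced-by annotation.
        return {
            "id": curie,
            "label": f"obsolete {title}",
            "deprecated": True,
            "replaced_by": f"OMIM:{moved.group(1)}",
            "subclass_of": None,
        }

    # SubclassOf the phenotypic series where it exists,
    # otherwise fall back to Mendelian disease.
    if phenotypic_series:
        parent = f"OMIMPS:{phenotypic_series}"
    else:
        parent = MENDELIAN_DISEASE

    return {
        "id": curie,
        "label": title,
        "deprecated": False,
        "replaced_by": None,
        "subclass_of": parent,
    }
```

Synonyms, label tidying, and xrefs are left out of the sketch; they would be extra keys on the same dict before it is serialized to OWL.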
@matentzn Not sure if/where to start here. Do you think this task is already largely completed? I know there are further issues related to the OMIM ingest, but I'm wondering if there's anything left in this issue that still needs to be completed. If there's anything left that isn't covered in another issue, could you enumerate what's left to be done here? Otherwise, maybe this issue is ready to be closed in favor of other OMIM related issues.
This is completed thanks to you :)
We now have various pipelines that produce an omim.owl, and I was wondering how to consolidate them.
1) dipper, which pulls from https://data.omim.org/downloads/
2) disease2owl, which pulls from the alpha (aka the unvetted bleeding edge stage of the dipper data pipeline acc. to @TomConlin): https://archive.monarchinitiative.org/alpha/rdf/omim.ttl
3) mondo, which pulls from http://data.monarchinitiative.org/ttl/omim.ttl, which I think is the latest official data release location

Now all of this is a bit unsatisfactory. I believe the best way to do is
@cmungall @kshefchek @TomConlin please let me know if this makes sense.