Open matentzn opened 2 years ago
Another alternative: https://github.com/pyobo/pyobo/blob/main/src/pyobo/sources/umls/umls.py
@hrshdhgd can you try this between breaths at some point? This is not super urgent, but I keep putting it off and we will need it quite suddenly in late September, when I would like to avoid big surprises.
For now, we need one thing more urgently than the other:
This should not be a major endeavour, but if we can make some progress towards it slowly slowly, that would be great.
I have Perl to turn the TSVs from the medgen/NCBI disease subset into serviceable OBO format. It's not pretty but AFAICR it works. It's in the old ingest repo:
https://github.com/cmungall/diseases2owl/tree/master/sources/medgen
This is essentially the disease subset of UMLS plus medgen pseudoCUIs
Yes, I agree with @cmungall. I've been using his work here: https://github.com/monarch-initiative/medgen
Some of it has been tweaked but largely it works great as is.
UMLS terms come from that in addition to Medgen terms. I'm not 100% certain, but I think it will include 100% of them.
@matentzn If you like, we can rename it the umls-medgen ingest.
I did not realise this at all. In any case, we need a separate UMLS ingest, because I also need to extract mappings to and from HPO and other ontologies that may not fall under "disease". But I can reduce the scope to extracting only the mappings if you agree, @cmungall, that we should be using the medgen ingest to align with UMLS rather than constructing a separate UMLS ingest.
@hrshdhgd this is only partially related to your efforts, because what we need right now is some of the UMLS mappings in SSSOM format for our mapping efforts.
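To make the SSSOM request concrete, here is a rough sketch of pulling mappings for one source vocabulary out of `MRCONSO.RRF` and writing them as SSSOM-style TSV. The column positions follow the standard pipe-delimited MRCONSO layout (CUI, SAB, CODE, STR). Everything else is an assumption for illustration: the function names, the `UMLS:` prefix, the choice of `skos:exactMatch` and `semapv:UnspecifiedMatching`, and the idea that the HPO rows carry `SAB` = `HPO` with the HP identifier in `CODE`. It is not what umls-ingest actually does.

```python
import csv
import io

# Column positions in the pipe-delimited MRCONSO.RRF layout.
CUI, SAB, CODE, STR = 0, 11, 13, 14

# A minimal subset of SSSOM columns (assumed sufficient for a first pass).
SSSOM_COLUMNS = [
    "subject_id",
    "predicate_id",
    "object_id",
    "mapping_justification",
    "object_label",
]


def mrconso_to_sssom_rows(lines, sab="HPO"):
    """Return one mapping row per distinct (CUI, CODE) pair for the given source."""
    seen = set()
    rows = []
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR or fields[SAB] != sab:
            continue  # skip malformed rows and rows from other sources
        key = (fields[CUI], fields[CODE])
        if key in seen:
            continue  # MRCONSO has one row per atom, so de-duplicate
        seen.add(key)
        rows.append(
            {
                "subject_id": f"UMLS:{fields[CUI]}",  # prefix is an assumption
                "predicate_id": "skos:exactMatch",  # assumed predicate
                "object_id": fields[CODE],
                "mapping_justification": "semapv:UnspecifiedMatching",
                "object_label": fields[STR],
            }
        )
    return rows


def write_sssom(rows):
    """Serialise the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=SSSOM_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Usage would be along the lines of `write_sssom(mrconso_to_sssom_rows(open("MRCONSO.RRF", encoding="utf-8")))`. A real SSSOM file would also need the metadata header (curie map, license, mapping set id), which is left out here.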
https://github.com/monarch-initiative/umls-ingest ... work in progress.
@hrshdhgd Since there's some overlap in what we're doing, I'm guessing we should browse through each other's code; though for the medgen ingest, most of it is Chris's.
Noting this: https://documentation.uts.nlm.nih.gov/automating-downloads.html, an automated way to obtain UMLS mappings through the API.
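A minimal sketch of how that could be scripted, assuming (as I read the linked page) that the download endpoint is `https://uts-ws.nlm.nih.gov/download` and takes the release file URL as `url` and a UTS key as `apiKey`. The function names are made up for illustration, and the endpoint and parameter names should be checked against the page before relying on this.

```python
import urllib.request
from urllib.parse import urlencode

# Assumed endpoint, taken from my reading of the UTS "automating downloads" page.
UTS_DOWNLOAD_ENDPOINT = "https://uts-ws.nlm.nih.gov/download"


def build_uts_download_url(file_url, api_key):
    """Build the download request URL for a release file (hypothetical helper)."""
    query = urlencode({"url": file_url, "apiKey": api_key})
    return f"{UTS_DOWNLOAD_ENDPOINT}?{query}"


def download_release(file_url, api_key, destination):
    """Fetch the file to a local path; needs a valid UTS API key and network access."""
    request_url = build_uts_download_url(file_url, api_key)
    urllib.request.urlretrieve(request_url, destination)
    return destination
```

The key would come from an environment variable or a CI secret rather than being committed to the repo.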