bio2bel / phewascatalog

MIT License

Normalize gene symbols and MeSH terms #1

Open cthoyt opened 5 years ago

cthoyt commented 5 years ago

It's nice to have these names, but we need to normalize them and get their relevant identifiers from HGNC and MeSH. Is that available somewhere else in the database?

mauriciopl commented 5 years ago

This information is not in the database. For the genes, there is an API that would probably find the IDs (https://www.genenames.org/help/rest/), but the conversion to MeSH might be complicated. I actually remembered that the phenotype descriptions don't come from MeSH but from ICD9, and the conversion to MeSH is not straightforward. There is a paper from 1993 (From ICD9-CM to MeSH using the UMLS: a how-to guide) that reports a success rate below 30% for this conversion. The current version on the UMLS website requires registration, and I haven't checked it yet. Maybe downloading the Disease Ontology data and the ICD9 database would make it possible to go from the description to the MeSH ID.
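
For the gene side, here is a rough sketch of what a lookup against the HGNC REST service could look like. It assumes the `fetch/symbol/<symbol>` endpoint on `rest.genenames.org` and a JSON response with a `response.docs` list whose entries carry `symbol` and `hgnc_id` fields, as described in the help page linked above. The function names (`build_request`, `extract_hgnc_id`, `lookup_hgnc_id`) are made up for illustration and are not part of this repository.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint, based on the HGNC REST help page; adjust if the service changes.
HGNC_FETCH_URL = "https://rest.genenames.org/fetch/symbol/{}"


def build_request(symbol):
    """Build a request for one gene symbol, asking for JSON instead of the default XML."""
    url = HGNC_FETCH_URL.format(urllib.parse.quote(symbol))
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def extract_hgnc_id(payload, symbol):
    """Return the HGNC ID for an exact symbol match in a parsed response, or None."""
    docs = payload.get("response", {}).get("docs", [])
    for doc in docs:
        if doc.get("symbol") == symbol:
            return doc.get("hgnc_id")
    return None


def lookup_hgnc_id(symbol):
    """Query the service for one symbol (needs network access)."""
    with urllib.request.urlopen(build_request(symbol), timeout=30) as resp:
        payload = json.load(resp)
    return extract_hgnc_id(payload, symbol)
```

Symbols that return `None` are probably outdated or alias symbols and would need a second pass (or manual curation). Since the catalog has many rows, caching the results per unique symbol would avoid hitting the service repeatedly.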