opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
Apache License 2.0

Update UK Biobank to EFO mappings #2116

Closed d0choa closed 2 years ago

d0choa commented 3 years ago

Since UKBB GWAS studies were last mapped to EFO, the ontology has changed significantly. The most notable change is the dynamic import of terms from other ontologies (MONDO, HP, etc.). Because of these changes, more terms are available, and some of them reflect the GWAS trait better than the original mapping. For example, the trait Bursitis (SAIGE_726_3) was mapped to frozen shoulder (EFO_1000941), although a bursitis term is now available in EFO (MONDO_0002471).

Below is a subset (top 20) of studies whose mappings were automatically flagged as potential improvements:

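|study_id    |traitName                                                |currentEfo     |currentName                                               |candidateId  |candidateName                             |
|------------|---------------------------------------------------------|---------------|----------------------------------------------------------|-------------|------------------------------------------|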
|SAIGE_112   |Candidiasis                                              |EFO_1001283    |Candidiasis, Invasive                                     |MONDO_0002026|candidiasis                               |
|SAIGE_716   |Other arthropathies                                      |EFO_1000999    |joint disease                                             |HP_0003040   |Arthropathy                               |
|SAIGE_726_3 |Bursitis                                                 |EFO_1000941    |frozen shoulder                                           |MONDO_0002471|bursitis                                  |
|SAIGE_216   |Benign neoplasm of skin                                  |EFO_0004198    |skin neoplasm                                             |MONDO_0021440|benign neoplasm of skin                   |
|SAIGE_252_1 |Hyperparathyroidism                                      |EFO_0009189    |Hyperthyroidism                                           |EFO_0008506  |hyperparathyroidism                       |
|SAIGE_215   |Other benign neoplasm of connective and other soft tissue|EFO_1000541    |Soft Tissue Neoplasm                                      |MONDO_0000654|benign connective and soft tissue neoplasm|
|SAIGE_427_11|Paroxysmal supraventricular tachycardia                  |EFO_0009493    |paroxysmal tachycardia                                    |HP_0004763   |Paroxysmal supraventricular tachycardia   |
|SAIGE_313   |Pervasive developmental disorders                        |EFO_0003759    |pervasive developmental disorder - not otherwise specified|MONDO_0000594|pervasive developmental disorder          |
|SAIGE_296_22|Major depressive disorder                                |EFO_0003761    |unipolar depression                                       |MONDO_0002009|major depressive disorder                 |
|SAIGE_172   |Skin cancer                                              |EFO_0009259    |skin carcinoma                                            |MONDO_0002898|skin cancer                               |
|SAIGE_415_21|Primary pulmonary hypertension                           |Orphanet_275766|Idiopathic pulmonary arterial hypertension                |MONDO_0001999|primary pulmonary hypertension            |
|SAIGE_705_8 |Hyperhidrosis                                            |EFO_1000712    |hypohidrosis                                              |HP_0000975   |Hyperhidrosis                             |
|SAIGE_523_32|Chronic periodontitis                                    |EFO_0000649    |periodontitis                                             |EFO_0006343  |chronic periodontitis                     |
|SAIGE_510   |Other diseases of lung                                   |EFO_0000684    |respiratory system disease                                |EFO_0003818  |lung disease                              |
|SAIGE_253_3 |Diabetes insipidus                                       |HP_0000863     |Central diabetes insipidus                                |MONDO_0004782|diabetes insipidus                        |
|SAIGE_208   |Benign neoplasm of colon                                 |EFO_0001378    |multiple myeloma                                          |MONDO_0002278|benign colon neoplasm                     |
|SAIGE_481   |Influenza                                                |EFO_0001669    |null                                                      |EFO_0007328  |influenza                                 |
|SAIGE_473_1 |Chronic laryngitis                                       |MONDO_0002647  |laryngitis                                                |MONDO_0001369|chronic laryngitis                        |
|SAIGE_528_5 |Diseases of lips                                         |EFO_1001047    |mouth disease                                             |MONDO_0004748|lip disease                               |
|SAIGE_592_2 |Urethritis and urethral syndrome                         |EFO_0003878    |urethritis                                                |MONDO_0001730|urethral syndrome                         |
only showing top 20 rows
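
The thread does not say how these candidates were generated. As a minimal sketch of one possible approach, assuming a simple case-insensitive exact match between the study trait name and an ontology term label, the check could look like the following. The function names and the in-memory label dictionary are hypothetical; the example rows are taken from the table above.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse whitespace so labels can be compared loosely."""
    return " ".join(label.lower().split())


def suggest_remappings(studies, ontology_labels):
    """Return studies whose trait name exactly matches a term other than the current one.

    studies: iterable of (study_id, trait_name, current_id) tuples
    ontology_labels: dict mapping term ID -> term label
    Both inputs are hypothetical structures used only for this illustration.
    """
    # Index terms by normalized label; the first term seen wins on duplicates.
    by_label = {}
    for term_id, label in ontology_labels.items():
        by_label.setdefault(normalize(label), term_id)

    suggestions = []
    for study_id, trait_name, current_id in studies:
        candidate = by_label.get(normalize(trait_name))
        # Only flag the study when a different, exactly matching term exists.
        if candidate is not None and candidate != current_id:
            suggestions.append((study_id, trait_name, current_id, candidate))
    return suggestions


# Rows and labels copied from the table above.
ontology_labels = {
    "EFO_1000941": "frozen shoulder",
    "MONDO_0002471": "bursitis",
    "EFO_1000712": "hypohidrosis",
    "HP_0000975": "Hyperhidrosis",
    "EFO_1000999": "joint disease",
    "HP_0003040": "Arthropathy",
}
studies = [
    ("SAIGE_726_3", "Bursitis", "EFO_1000941"),
    ("SAIGE_716", "Other arthropathies", "EFO_1000999"),
    ("SAIGE_705_8", "Hyperhidrosis", "EFO_1000712"),
]

suggestions = suggest_remappings(studies, ontology_labels)
for row in suggestions:
    print(row)
```

Under this assumption, Bursitis and Hyperhidrosis are flagged, while "Other arthropathies" is not, because its trait name does not exactly match the label "Arthropathy". A row like that would need fuzzier matching or manual review.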

d0choa commented 3 years ago

I made a pass over a subset of SAIGE and NEALE trait mappings that could be updated.

I propose including every row in the following spreadsheet, using either the current or the alternative EFO term as indicated by the boolean (checkbox) column.

d0choa commented 2 years ago

This work was completed in the 22.09 release.