load disease symptom/phenotype data from #101

MitoDB is a mitochondrial disease database described in this paper:

This site does not appear to have been updated since 2015. There is no structured download, but the site lists a contact email address that we could write to. Alternatively, the relevant data looks like it could be scraped without too much effort.

There are ~100 diseases with data (for example: ). Diseases are normalized to OMIM. There is no obvious normalization of phenotypes/symptoms.

Bot code is here:

Code to scrape MitoDB and match HPO phenotypes:

Looking at one randomly chosen disease ( ) with 26 phenotypes, 24 of them match. 18 match an HPO label exactly, 2 match a synonym exactly, and 1 more matches a "related synonym".
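The tiered matching described here (label first, then exact synonym, then related synonym) could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the term IDs and names in `TOY_TERMS` are made up, the helper names are hypothetical, and case-insensitive comparison is an assumption.

```python
from typing import Dict, Iterable, Optional, Tuple

# Made-up toy terms for illustration only; these are NOT real HPO IDs.
TOY_TERMS = [
    {"id": "EX:0000001", "label": "Seizure",
     "exact_synonym": ["Epileptic seizure"], "related_synonym": ["Fits"]},
    {"id": "EX:0000002", "label": "Ataxia",
     "exact_synonym": [], "related_synonym": ["Unsteady gait"]},
]

# Earlier entries win when the same string appears in more than one tier.
MATCH_PRIORITY = ("label", "exact_synonym", "related_synonym")


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (assumed matching convention)."""
    return " ".join(text.lower().split())


def build_index(terms: Iterable[dict]) -> Dict[str, Tuple[str, str]]:
    """Map each normalized name to (term id, kind of match)."""
    terms = list(terms)
    index: Dict[str, Tuple[str, str]] = {}
    for kind in MATCH_PRIORITY:
        for term in terms:
            names = [term["label"]] if kind == "label" else term.get(kind, [])
            for name in names:
                # setdefault keeps the higher-priority entry if one exists.
                index.setdefault(normalize(name), (term["id"], kind))
    return index


def match_symptom(symptom: str,
                  index: Dict[str, Tuple[str, str]]) -> Optional[Tuple[str, str]]:
    """Return (term id, match kind), or None if the symptom is missing."""
    return index.get(normalize(symptom))


index = build_index(TOY_TERMS)
for symptom in ("Seizure", "epileptic seizure", "unsteady gait",
                "lactate accumulation"):
    print(symptom, "->", match_symptom(symptom, index))
```

Recording the kind of match alongside the ID makes it easy to review the weaker "related synonym" hits separately before loading anything.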

Over everything:

186 out of 524 unique symptoms are missing
accounting for 437 out of 2155 links
most used missing phenotypes
[('hypotonia', 44), ('lactate accumulation', 37), ('hyperactive reflexes', 17), ('increased blood ck', 16), ('basal ganglia pathology', 15), ('increased blood transaminase', 14), ("babinski's sign", 13), ('psychiatric symptom', 11), ('hepatopathy', 10), ('skin pigmentation changes', 8)]
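A summary like this can be produced with `collections.Counter`. The sketch below is a guess at the shape of the computation, not the actual bot code: `summarize_missing` is a hypothetical helper, and it assumes each link is one (disease, symptom) record and that the set of matched symptoms is already known. The toy data is invented.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def summarize_missing(links: Iterable[Tuple[str, str]],
                      matched: Set[str],
                      top_n: int = 10) -> dict:
    """Count unmatched symptoms and the links that depend on them."""
    links = list(links)
    usage = Counter(symptom for _, symptom in links)
    missing = Counter({s: n for s, n in usage.items() if s not in matched})
    return {
        "missing_symptoms": len(missing),
        "total_symptoms": len(usage),
        "missing_links": sum(missing.values()),
        "total_links": len(links),
        "most_used": missing.most_common(top_n),
    }


# Toy data: (OMIM id, symptom) pairs; "seizure" is assumed to have matched.
toy_links: List[Tuple[str, str]] = [
    ("231680", "hypotonia"),
    ("220111", "hypotonia"),
    ("220111", "lactate accumulation"),
    ("220111", "seizure"),
    ("231680", "seizure"),
]
summary = summarize_missing(toy_links, matched={"seizure"})
print(f"{summary['missing_symptoms']} out of {summary['total_symptoms']} "
      "unique symptoms are missing")
print(f"accounting for {summary['missing_links']} out of "
      f"{summary['total_links']} links")
print("most used missing phenotypes")
print(summary["most_used"])
```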

Phenotypes missing, along with how they're used in MitoDB

Phenotype, PubMed IDs, disease (OMIM IDs)
3-hydroxyglutaric aciduria 21347589 231530
5-oxoprolinuria 11445798 266130
abnormal brain stem function 18571143 612233
acute chest syndrome 10861320 603903
air trapping 17099020 219700
alar grooves 23470839 302950
anterior stromal dystrophy 1446772 163950
aortic valvular insufficiency 22829427 219100
asphyxia 1327585 608782
babinski's sign 15300460;21595125;24814845;2137962;18571143;21... 277460;610246;610246;551500;612233;208920;2712...
basal ganglia pathology 22683713;8392291;21266382;18972346;19135620;20... 614739;220111;220111;168600;256000;551500;2787...
bleeding 17435591;9159540 277900;130000
bossed forehead 21266382;22829427;18470892 220111;219100;230740
brain stem pathology 8392291;21266382 220111;220111
brainstem haemangioblastoma 8929948 193300
bronchial hyperactivity 11045837 604901
bruising 16990350;9159540 163950;130000
bulbar dysfunction 17296839 105400
candidiasis 8185357 300400
cardiac conduction defect 17439982;10480210;10208957;18256394;2858640;10... 530000;312750;312750;176670;535000;115200;251880
cardiac fibrosis 10580070 115200
cardiac structural defects 18000968;15558842;11045837 219000;300000;604901
central nuclei in myotubes 17676042 255200
cerebellar haemangioblastoma 8929948 193300
cerebellar syndrome 20855850 250950
ceruloplasmin levels low 17435591 277900
cervical spine abnormality 23470839 302950
chilblain-like lesions 15883328 225750
chondrodysplasia punctata 23470839;7541833 302950;214100
circumoral cyanosis 18256394 176670
claudicatio 3652487;2012127 264800;264800
cleft lip or palate 10405446;23222957;18000968;8826887;15558842;15... 242840;242840;219000;105650;300000;154500
congenital blistering 12668616 173650
corpus callosum thining 21911699 610532
cortical dysgenesis 10405446 242840
cox negative fibres 23352259 615156
cox-negative muscle fibers 17620490 609286
csf elevated ifn-alpha 24300241 225750
csf elevated neopterin 24300241 225750
dark urine 7110809;22147108 305900;203500
death in the first months of life 17486094 612075
decreased brain naa to cr 14586596 277700
decreased gsh levels 11445798 266130
decreased recovery of rna synthesis after dna damage 27004399 216400
decreased serum bicarbonate 8120710 201450
decreased serum igf-1 16510863 502000
delayed closure of anterior fontanel 21932319 614008
diabetes mellitus type 1 18581092;20972738 520000;222300
diabetic retinopathy 18581092 520000
downward sloping palpebral apertures 1446772;15340364 163950;154500
dysplastic sphenoid wing 9128932 162200
ear canal atresia 15340364 154500
ear malformation 18000968;15558842;15340364 219000;300000;154500
ear pigmentation 22147108 203500
early death 17668387;20693550 245400;245400
emmetropia 16876521 101200
exacerbation during febrile episodes 18571143 612233
eye abnormalities 16316432 176270
eye pigmentation 22147108 203500
facial wrinkles 7484729 502000
fatty heart 2658591 231680
finger deformity 16961075 270550
fracture 6496237;22585395 502000;176270
gastric ulcers 20142534 203700
graying of hair 11054058;7484729;16673358 127550;502000;277700
haematuria 11167767;11167767 306700;277480
hearing loss due to cochlear malformation 23033317 613398
hepatocellular adenomas 22678084 232200
hepatopathy 20370797;23222957;18695062;16909392;15781812;1... 231680;242840;256810;256810;254780;203700;2037...
high serum igm 24136356 615513
hyperactive reflexes 15955954;21595125;24814845;8602753;24532200;18... 168600;610246;610246;256000;312080;612233;2722...
hypermobile joints 8553383;18470892;2195034;16801035;17932957 154700;230740;300624;130000;160150
hypersensitivity to dna cross-linking 17426088 227650
hypoplastic or absent clavicles 21932319 614008
hypospadia 20335238;15558842 614052;300000
hypotonia 17977044;2658591;21266382;18819985;23352259;20... 231680;231680;220111;609560;615156;614052;2560...
incontinence 12557294 611105
increased blood afp 19696032;1377828 606002;208900
increased blood bilirubin 17435591 277900
increased blood ck 20370797;18819985;8652023;23313956;19722047;77... 231680;609560;310200;615084;540000;201475;3002...
increased blood gammaglutamyltransferase 8273986;11045837 232200;604901
increased blood ldh 8120710 201450
increased blood transaminase 17977044;23362058;8273986;17435591;8120710;164... 231680;608594;232200;277900;201450;257200;2604...
increased fetal hemoglobin 8759887 260400
increased igm 26271390 251260
increased plasma hga 12501223 203500
increased urin collagen n-telopeptide 12501223 203500
increased urin hga 12501223 203500
increased urinary copper 17435591 277900
intertriginous freckling 9128932 162200
intracranial haemorrhage 21114482;11167767;11167767 306700;306700;277480
iridonesis 8553383 154700
iris transillumination 8553383 154700
ischaemic heart disease 21663647;6496237 143890;502000
joint bleeding 11167767;11886463;11167767 306700;306700;277480
joint effusion 12501223 203500
kidney calcifications 8273986 232200
kidney cysts 8929948;19825331;7541833 193300;173900;214100
kidney failure 19825331 173900
kidney stone 19825331;22147108 173900;203500
lactate accumulation 22683713;21266382;18819985;10852545;20335238;8... 614739;220111;609560;603041;614052;256000;2123...
laryngotracheoesophageal anomalies 15558842 300000
larynx atresia 18000968 219000
late visual disorientation 15534185 104300
leucoplakia 11054058;12668616 127550;173650
limb dyspraxia 15534185 104300
limb wasting 17296839 105400
long qt 18256394;16055927 176670;610198
low plasma carnitine 6646897;9003853 201450;609016
low serum igg 24136356 615513
low serum trypsinogen 10393609 260400
low-set umbilicus 18000968 219000
lumbar pain 22147108 203500
lung abscesses 17099020 219700
meningo-encephalocele 18000968 219000
methylsuccinic aciduria 18054510 201470
midfacial hypoplasia 21266382;23470839;15885794 220111;302950;123500
mosaic perfusion 17099020;18403663 219700;244400
mucous plugging 17099020;18403663 219700;244400
multiple sclerosis 7735876;11523562 535000;535000
muscle haematoma 11167767;11167767 306700;277480
narrowed thorax 10393609 260400
neuronal migration defect 7541833 214100
nk-cell reduction 22105576 300400
no head control 18571143 612233
non-progressive ventriculomegaly 16691624 101200
non-walking 14649546 312750
nose malformation 18000968 219000
oesophageal stricture 11054058 127550
orthodontic problem 9733026 100800
pacemakers implanted 10580070;10939567 115200;181350
pallesthesia 21595125;12557294;23065789 610246;611105;607259
pancreatic carcinoma 8929948 193300
peau d`orange lesions 2012127 264800
peribronchial thickening 17099020;8633137;18403663 219700;613490;244400
perioral melanocytic macules 20126809 175200
persistent foramen ovale 23222957 242840
phaeochromocytoma 8929948 193300
poor healing 9159540 130000
post-operative bleeding 11167767;11167767 306700;277480
post-partum bleeding 11167767 277480
prominent corneal nerves 1446772 163950
protruding auricles 18470892 230740
protruding lips 18470892 230740
pseudoanodontia 18470892 230740
psychiatric symptom 17620490;23352259;15955954;23313956;11329229;1... 609286;615156;168600;615084;520000;271245;5020...
pulmonary hypertension 14985486 603903
pyrosis 11247006 312750
radiosensitivity 1256588;10799436 208900;251260
ragged red fibres 23352259 615156
reflux 23470839;22585395 302950;176270
retinal angioma 8929948 193300
retinal cherry red spot 16434659 257200
retinal myelinated fibers 16961075 270550
septum pellucidum defect 16691624 101200
serum elevated ifn-alpha 24300241 225750
skin pigmentation changes 11054058;21932319;18256394;17426088;2917181;20... 127550;614008;176670;227650;227650;227650;2109...
skin redundancy 18470892 230740
skull ossification defect 18000968 219000
sleeping with eyes open 18256394 176670
spinal haemangioblastoma 8929948 193300
spontaneous bacterial peritonitis 11045837 604901
steatorrhoea 10679949;10946006;10393609;10829988 200100;200100;260400;219700
succinate accumulation 19465911 252011
t-cell reduction 22105576 300400
teeth pigmentation 22147108 203500
thin limbs 21932319;16673358 614008;277700
thorax deformity 21932319 614008
thumb abnormalities 8826887;24635570;17426088 105650;268400;227650
tonsillectomy 9733026 100800
tracheostoma 15340364 154500
tricuspid valvular insufficiency 22829427 219100
tubulopathy 17439982;7581370;19138848;17442627 530000;557000;612075;607426
ureter agenesis 18000968 219000
variceal bleeding 11045837 604901
ventricular arrythmia 16847078 302060
vermis hypoplasia 10405446 242840
visual acuity decrease 15953402;20301646;6496237;22748208;15795345 277460;258501;502000;611726;203800
wheelchair use 21855841 607694
white matter atrophy 21911699 610532
widened atrophic scars 9159540 130000
windswept deformity 8725795 177170
xanthomas 21663647 143890
xerophthalmus 18256394;21480477 176670;216400
yellowish papules 3652487 264800
zygomatic hypoplasia 15340364 154500
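For reference, a listing of this shape (one row per missing phenotype, with `;`-joined PubMed and OMIM IDs) could be assembled along these lines. This is a minimal sketch under assumptions: `group_missing` is a hypothetical helper, the input is assumed to be (symptom, PubMed ID, OMIM ID) records, and the "seizure" row with placeholder IDs is invented to show a matched symptom being skipped.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def group_missing(rows: Iterable[Tuple[str, str, str]],
                  missing: Set[str]) -> Dict[str, Tuple[str, str]]:
    """Collect PubMed and OMIM IDs per missing symptom, sorted by symptom."""
    pmids: Dict[str, List[str]] = defaultdict(list)
    omims: Dict[str, List[str]] = defaultdict(list)
    for symptom, pmid, omim in rows:
        if symptom in missing:
            # Keep duplicates and input order so both columns stay aligned.
            pmids[symptom].append(pmid)
            omims[symptom].append(omim)
    return {s: (";".join(pmids[s]), ";".join(omims[s])) for s in sorted(pmids)}


toy_rows = [
    ("bleeding", "17435591", "277900"),
    ("bleeding", "9159540", "130000"),
    ("asphyxia", "1327585", "608782"),
    ("seizure", "00000000", "000000"),  # placeholder IDs, matched symptom
]
table = group_missing(toy_rows, missing={"bleeding", "asphyxia"})
for symptom, (pmid_col, omim_col) in table.items():
    print(symptom, pmid_col, omim_col)
```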