YuLab-SMU / DOSE

Disease Ontology Semantic and Enrichment analysis
https://yulab-smu.top/biomedical-knowledge-mining-book/

information content calculation discordance #27

Open ieiwk opened 5 years ago

ieiwk commented 5 years ago

Prerequisites

Describe your issue

Ask in the right place

To use the latest DO database, I tried to update .DOSEEnv[['DOIC']]. As a first check, I tried to reproduce the existing ICs myself from the term relations in .DOSEEnv[['dotbl']], but my values differed from attr(.DOSEEnv[['DOIC']], 'IC'). After some digging, it seems that attr(.DOSEEnv[['DOIC']], 'IC') is computed by the function computeIC, which uses p = cnt / sum(docount). If I understand correctly, shouldn't sum(docount) be length(docount) when calculating IC? The same question applies to how cnt is calculated.
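
To make the question concrete, here is a minimal Python sketch of the two denominators on a made-up four-term ontology. It assumes that docount holds per-term annotation counts and that cnt is a term's own count plus the counts of its descendants; neither assumption is confirmed above, and this is not the actual computeIC code. The dictionary `offspring` and both function names are hypothetical.

```python
import math

# Hypothetical toy ontology: each term maps to all of its descendants.
offspring = {
    "root": ["A", "B", "C"],
    "A": ["C"],
    "B": [],
    "C": [],
}

# Hypothetical per-term annotation counts (a stand-in for docount).
docount = {"root": 1, "A": 2, "B": 4, "C": 1}


def ic_by_annotations(term):
    """IC with p = cnt / sum(docount), i.e. a share of all annotations."""
    # Assumed meaning of cnt: own annotations plus those of all descendants.
    cnt = docount[term] + sum(docount[d] for d in offspring[term])
    p = cnt / sum(docount.values())
    return -math.log(p)


def ic_by_term_count(term):
    """IC with a length(docount)-style denominator, i.e. a share of all terms."""
    # Here the numerator counts terms (self plus descendants), not annotations.
    n_terms = 1 + len(offspring[term])
    p = n_terms / len(docount)
    return -math.log(p)


for term in docount:
    print(term, round(ic_by_annotations(term), 4), round(ic_by_term_count(term), 4))
```

Both variants give the root an IC of zero, but they rank the other terms differently: in this toy data the heavily annotated leaf B gets a lower IC than A under the annotation-based version and a higher one under the term-count version. Under the stated assumptions, the choice of denominator decides whether IC reflects annotation frequency or ontology structure.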