neurogenomics / RareDiseasePrioritisation

Prioritise cell-type-specific gene targets from the Rare Disease Celltyping project.

Repeat symptom-level enrichment analyses using EWCE #25

Closed bschilder closed 7 months ago

bschilder commented 1 year ago

Previously, I did this using a series of many millions of simple gene enrichment tests (#18). However, @NathanSkene noted that EWCE's lower gene limit (a minimum of 4 genes in each gene list) is arbitrary and could be lifted. Given the large number of tests, I will need to reduce the number of bootstrap iterations from 100k to ~1k (I will do some time estimations to make sure this can be run in a reasonable amount of time).
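For a rough idea of what such a time estimation could look like, here is a minimal base R sketch. It is a toy bootstrap enrichment test, not EWCE's actual implementation: the specificity matrix is random, and the function name `toy_bootstrap_enrichment`, the matrix dimensions, and the total number of tests (`n_tests`) are all hypothetical placeholders. It times a single test with 1k bootstrap iterations on a single-gene list, then extrapolates linearly.

```r
# Toy bootstrap enrichment test in base R (NOT the EWCE implementation).
set.seed(42)

# Hypothetical specificity matrix: 2000 genes x 5 cell types, random values.
n_genes <- 2000
n_celltypes <- 5
spec <- matrix(
  runif(n_genes * n_celltypes),
  nrow = n_genes,
  dimnames = list(
    paste0("gene", seq_len(n_genes)),
    paste0("celltype", seq_len(n_celltypes))
  )
)

toy_bootstrap_enrichment <- function(spec, hits, reps = 1000) {
  # Observed mean specificity of the hit genes in each cell type.
  observed <- colMeans(spec[hits, , drop = FALSE])
  # Null distribution: mean specificity of random gene sets of the same size.
  # The result is a (cell types x reps) matrix.
  null <- replicate(reps, {
    random_hits <- sample(rownames(spec), length(hits))
    colMeans(spec[random_hits, , drop = FALSE])
  })
  # Empirical one-sided p-value per cell type (with +1 pseudocount).
  (rowSums(null >= observed) + 1) / (reps + 1)
}

# A single-gene list, i.e. below EWCE's default minimum of 4 genes.
hits <- sample(rownames(spec), 1)

# Time one test at 1k iterations, then extrapolate linearly.
timing <- system.time(
  p <- toy_bootstrap_enrichment(spec, hits, reps = 1000)
)
secs_per_test <- timing[["elapsed"]]
n_tests <- 1e6  # hypothetical total number of tests
est_hours <- secs_per_test * n_tests / 3600
```

Because the cost of a bootstrap test scales roughly linearly with the number of iterations, going from 100k to ~1k iterations should cut the per-test time by about two orders of magnitude. The trade-off is that the smallest attainable empirical p-value rises to about 1/1001.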

bschilder commented 7 months ago

Because the vast majority of diseases only have 1 gene annotated to them in the HPO (7671/8631 diseases = 88.9%), neither EWCE nor phenomix would be very useful for finding enrichment. Instead, I've taken the following strategy:

(please forgive the screenshot, copy-and-paste doesn't bring over the latex equations)

Image

This symptom-cell type linking procedure is now automated as follows:

# Load the example results provided by MSTExplorer
results <- MSTExplorer::load_example_results()
# Add the symptom-level results to them
results <- MSTExplorer::add_symptom_results(results = results)

I've summarised this in a figure, which is currently designated as Supplementary Fig S1:

Image