neurogenomics / RareDiseasePrioritisation

Prioritise cell-type-specific gene targets from the Rare Disease Celltyping project.

Filter by frequency #34

Closed by bschilder 10 months ago

bschilder commented 11 months ago

Disease frequency is currently a key factor in the viability of a therapeutic candidate (at least until N=1 legislation is passed).

Frequency types

From the annotations provided by HPO, we currently have:

phenotype-disease frequency

gene-phenotype frequency

phenotype frequency

We do not currently have data on absolute phenotype frequency in the general population. We will need to gather this.

disease frequency

We do not currently have data on absolute disease frequency in the general population. We will need to gather this.

Potential resources

bschilder commented 11 months ago

Assessing Orphanet data

A complete R Markdown report assessing the Orphanet prevalence data is available here: https://neurogenomics.github.io/RareDiseasePrioritisation/reports/orphanet_prevalence

Takeaways:

NathanSkene commented 11 months ago

There are various ways of getting at prevalence. One might be to look at the data behind pLOF intolerance. But it is not at all clear whether that would correspond with phenotype prevalence (e.g. the phenotype might be caused by CNVs, repeat expansions, or missense mutations).

I think it’s probably too big an ask for this paper. It would need a well-thought-out project proposal with ideas on how to validate it. Do you agree?

bschilder commented 11 months ago

> There are various ways of getting at prevalence. One might be to look at the data behind pLOF intolerance. But it is not at all clear whether that would correspond with phenotype prevalence (e.g. the phenotype might be caused by CNVs, repeat expansions, or missense mutations). I think it’s probably too big an ask for this paper. It would need a well-thought-out project proposal with ideas on how to validate it. Do you agree?

Mutation frequency and phenotype frequency (or disease frequency) are two related but very different concepts. Equating the two only works when the phenotype/disease is truly monogenic (a single causal gene across all individuals) and 100% penetrant (the mutation always causes the phenotype/disease, regardless of genetic background or environmental exposures). That isn't the case for any of the phenotypes for which we have celltype enrichment results, as we only included phenotypes with >=4 genes. Even if we did expand to phenotypes with n=1 gene, that wouldn't rule out additional causal genes that haven't yet been discovered, or incomplete penetrance (we have very limited data on this).
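To make the distinction concrete, here is a toy calculation (a rough sketch, not something from our pipeline). It assumes each causal gene acts independently and that carrier frequency times penetrance gives the per-gene chance of showing the phenotype. The function name and all numbers are hypothetical, chosen only for illustration.

```python
def phenotype_prevalence(carrier_freqs, penetrances):
    """Toy estimate of phenotype prevalence from per-gene carrier
    frequencies and penetrances, assuming genes act independently.
    Ignores undiscovered genes, environment, and non-genetic causes."""
    p_unaffected = 1.0
    for freq, pen in zip(carrier_freqs, penetrances):
        # Probability of NOT being affected via this gene.
        p_unaffected *= 1.0 - freq * pen
    return 1.0 - p_unaffected


# Hypothetical numbers, for illustration only.
# Monogenic and fully penetrant: prevalence equals mutation frequency.
monogenic = phenotype_prevalence([1e-4], [1.0])

# Four causal genes, each 50% penetrant: the frequency of any single
# gene's mutations no longer matches the phenotype prevalence.
four_genes = phenotype_prevalence([1e-4] * 4, [0.5] * 4)

print(monogenic, four_genes)
```

Under these made-up inputs, the four-gene phenotype comes out roughly twice as prevalent as any one gene's mutation frequency would suggest, and that is before accounting for genes we haven't discovered.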

Currently, I propose we use frequency data from Orphanet as a guide when selecting top therapeutic candidates, rather than a hard requirement (i.e. removing all disease/phenotype candidates for which we don't have prevalence data).
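A minimal sketch of the difference between the two approaches is below. The field names (`score`, `prevalence`) and the example records are hypothetical, not our actual data structures. It also assumes higher prevalence is preferable when breaking ties between equally scored candidates.

```python
# Hypothetical candidates; prevalence is None where no data is available.
candidates = [
    {"disease": "disease_A", "score": 0.9, "prevalence": None},
    {"disease": "disease_B", "score": 0.8, "prevalence": 1e-5},
    {"disease": "disease_C", "score": 0.9, "prevalence": 1e-4},
]


def rank_soft(cands):
    """Use prevalence as a guide: sort by score first, then prefer
    candidates with known (and higher) prevalence. Nothing is dropped."""
    return sorted(
        cands,
        key=lambda c: (
            -c["score"],                # higher score first
            c["prevalence"] is None,    # known prevalence before missing
            -(c["prevalence"] or 0.0),  # higher prevalence first
        ),
    )


def filter_hard(cands):
    """Hard requirement: drop every candidate without prevalence data."""
    return [c for c in cands if c["prevalence"] is not None]


soft = [c["disease"] for c in rank_soft(candidates)]
hard = [c["disease"] for c in filter_hard(candidates)]
print(soft)
print(hard)
```

With the soft ranking, disease_A stays in the list despite having no prevalence data; the hard filter would remove it outright.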