mgalardini / pyseer

SEER, reimplemented in python 🐍🔮
http://pyseer.readthedocs.io
Apache License 2.0

burden testing, gene length #250

Closed dustinlong closed 8 months ago

dustinlong commented 8 months ago

Is there any adjustment for gene length (kernel size) in the burden testing function? For example, would the following cases result in equivalent outputs?

- 2 non-synonymous mutations in cases and 0 non-synonymous mutations in controls, in a gene 10000 bases long
- 2 non-synonymous mutations in the same cases and 0 non-synonymous mutations in the same controls, in a gene 100 bases long

Thank you!

Dustin

johnlees commented 8 months ago

Good question. No, there is no adjustment for gene length, so these two cases would give identical results. We don't count the number of mutations: any observation of even a single mutation in the gene gives that sample a count of '1'. The burden test is therefore designed more for rare loss-of-function variants that are strongly expected to have the same effect on the gene (frameshift, early stop, etc.). For more sophisticated analyses with non-synonymous variants, something like SKAT might be better.
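To make the collapsing behaviour described above concrete, here is a minimal sketch in plain Python. It is not pyseer's actual code: the function name `collapse_burden` and the toy presence/absence data are hypothetical, and it only assumes that each variant in a gene is represented as one 0/1 value per sample.

```python
def collapse_burden(variant_rows):
    """Collapse per-variant presence/absence rows into one 0/1 value per sample.

    variant_rows: list of lists, one inner list per variant in the gene,
    each holding a 0/1 value per sample. (Hypothetical input layout.)
    """
    if not variant_rows:
        return []
    n_samples = len(variant_rows[0])
    # A sample scores 1 if it carries at least one variant in the gene,
    # regardless of how many variants it carries or how long the gene is.
    return [int(any(row[i] for row in variant_rows)) for i in range(n_samples)]


# Sample order: case_1, case_2, control_1, control_2 (toy data).
# Two variants seen in the cases, none in the controls.
long_gene_variants = [[1, 0, 0, 0], [0, 1, 0, 0]]   # gene 10000 bases long
short_gene_variants = [[1, 0, 0, 0], [0, 1, 0, 0]]  # gene 100 bases long

long_gene_burden = collapse_burden(long_gene_variants)
short_gene_burden = collapse_burden(short_gene_variants)

# Gene length never enters the calculation, so the two vectors match.
print(long_gene_burden == short_gene_burden)

# Two mutations in the same sample still count once.
double_hit_burden = collapse_burden([[1, 0, 0, 0], [1, 0, 0, 0]])
print(double_hit_burden)
```

Because the collapsed vector is the only thing passed on to the association test in this sketch, a 100-base gene and a 10000-base gene with the same carriers are indistinguishable.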

dustinlong commented 8 months ago

Thank you for the quick response--very helpful to know!

Dustin