arq5x / gemini

a lightweight db framework for exploring genetic variation.
http://gemini.readthedocs.org
MIT License

gene list from `gemini burden` smaller than gene list from `gemini burden --calpha` #929

Closed mjsduncan closed 5 years ago

mjsduncan commented 5 years ago

When I ran `gemini burden --nonsynonymous` and `gemini burden --nonsynonymous --calpha` on a large db, the number of genes (lines of output) was the same. Without the `--nonsynonymous` parameter, however, the gene list for the per-sample counts was 1/10 the size of the gene list for the C-alpha test. Is this a bug, or am I missing something?

This is with gemini 0.30.1 installed via bcbio.

Thanks for your work!

mjsduncan commented 5 years ago

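Answering my own question after reading the docs: with the `--nonsynonymous` flag, `gemini burden` and `gemini burden --calpha` apply the same variant filter (`codon_change != 'None'`), so the gene lists match. Without the flag, the two modes use different filters: `is_coding = 1 and (impact_severity = 'HIGH' or polyphen_pred = 'probably_damaging')` for the per-sample counts versus `impact_severity != 'LOW'` for the C-alpha test. The first filter is considerably stricter, which would explain the shorter gene list.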
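
For anyone who wants to see the effect concretely, here is a minimal sketch that applies the three filters quoted above to a toy SQLite table. It is not GEMINI's own code: the table holds only the five columns named in the filters, the rows and gene names (`GENE_A` and so on) are invented, and the helper names `build_toy_db` and `genes_passing` are hypothetical. The assignment of each filter to a mode follows the order given in the comment above.

```python
import sqlite3

# WHERE clauses copied from the comment above. Which mode uses which
# clause is taken from the order in that comment, not from GEMINI's source.
FILTERS = {
    "burden": "is_coding = 1 AND (impact_severity = 'HIGH' "
              "OR polyphen_pred = 'probably_damaging')",
    "calpha": "impact_severity != 'LOW'",
    "nonsynonymous": "codon_change != 'None'",
}


def build_toy_db():
    """Create an in-memory table with invented rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants (gene TEXT, is_coding INTEGER, "
        "impact_severity TEXT, polyphen_pred TEXT, codon_change TEXT)"
    )
    conn.executemany(
        "INSERT INTO variants VALUES (?, ?, ?, ?, ?)",
        [
            ("GENE_A", 1, "HIGH", "None", "Cga/Tga"),
            ("GENE_B", 1, "MED", "benign", "aCg/aTg"),
            ("GENE_C", 0, "MED", "None", "None"),
            ("GENE_D", 1, "LOW", "None", "gcC/gcT"),
            ("GENE_E", 1, "MED", "probably_damaging", "Gtg/Atg"),
        ],
    )
    return conn


def genes_passing(conn, where):
    """Return the sorted list of distinct genes with a variant passing the filter."""
    query = f"SELECT DISTINCT gene FROM variants WHERE {where} ORDER BY gene"
    return [row[0] for row in conn.execute(query)]


conn = build_toy_db()
gene_lists = {name: genes_passing(conn, where) for name, where in FILTERS.items()}
for name, genes in gene_lists.items():
    print(name, len(genes), genes)
```

On this toy data the stricter per-sample filter keeps two genes while the C-alpha filter keeps four, which mirrors the size difference described in the question. Running the same `SELECT DISTINCT gene ...` queries against a real database should show where the two lists diverge, assuming its `variants` table has these columns.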