statgen / EPACTS


VT vnts variant information does not match VT epacts summary #39

Open erampersaud opened 2 years ago

erampersaud commented 2 years ago

We noticed that for any gene-based test result from EPACTS VT, the number of individuals carrying variants (BURDEN_CNT) does not match the .vnts files.

1) Is this a bug?

2) Is there a way to extract the individuals and their genotypes that contribute to the gene-based test results (taking phenotype and covariates into account)?

Example: VT gene burden output. Number of individuals carrying a variant in gene GRB14 = 8

CHROM BEG END MARKER_ID TOT_MARKERS PASS_MARKERS BURDEN_CNT FRAC_WITH_RARE STAT PVALUE R2 DIRECTION OPT_THRES_RAC OPT_FRAC_WITH_RARE

2 164493140 164621279 2:164493140-164621279_GRB14 17 6 8 0.02632 8.3771 0.0101 0.02756 - 1 0.01645

Grepped for GRB14 across all *.epacts.vnts files = 5; 3 are missing.

ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 2:164493140_G/A_rs144301087 0.00164 1 1 Sample055253_G1 0/1

ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 2:164508510_C/T 0.00164 1 1 Sample050255_G1 0/1

ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 2:164527109_C/T 0.00169 1 1 Sample050251_G1 0/1

ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 2:164547710_A/G_rs151334452 0.00164 1 1 Sample064223_G1 0/1

ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 2:164547749_C/T_rs201183506 0.00164 1 1 Sample060583_G1 0/1
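For reference, here is a minimal Python sketch of how the carrier count could be tallied from grep output like the records above, so it can be compared against BURDEN_CNT. The column positions (sample ID in the sixth whitespace-separated field, genotype in the seventh) are an assumption inferred only from these example records, not from EPACTS documentation, and the function name `carriers_by_group` is hypothetical. The sketch does not account for phenotype or covariate filtering.

```python
from collections import defaultdict

# Assumed 0-based column positions, inferred from the example records only.
MARKER_COL = 1
SAMPLE_COL = 5
GENOTYPE_COL = 6


def carriers_by_group(lines):
    """Map each group ID to {sample ID: [(marker, genotype), ...]}."""
    carriers = defaultdict(dict)
    for line in lines:
        fields = line.split()
        if len(fields) <= GENOTYPE_COL:
            continue  # skip blank or truncated lines
        # Drop the "filename:" prefix that grep adds when searching several files.
        group = fields[0].split(".vnts:", 1)[-1]
        sample = fields[SAMPLE_COL]
        call = (fields[MARKER_COL], fields[GENOTYPE_COL])
        carriers[group].setdefault(sample, []).append(call)
    return carriers


# Two of the records from the grep output, used as example input.
records = [
    "ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 "
    "2:164493140_G/A_rs144301087 0.00164 1 1 Sample055253_G1 0/1",
    "ALSF_slope_VT.12.epacts.vnts:2:164493140-164621279_GRB14 "
    "2:164508510_C/T 0.00164 1 1 Sample050255_G1 0/1",
]

result = carriers_by_group(records)
for group, samples in result.items():
    # The number of distinct samples is the figure to compare with BURDEN_CNT.
    print(group, len(samples), sorted(samples))
```

Counting distinct sample IDs rather than lines matters when one individual carries more than one variant in the same gene, since that individual would appear on several lines.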