zhanxw / rvtests

Rare variant test software for next generation sequencing data
What do the duplicated lines in the burden test output with covariates mean? #134

Open marlenebrs opened 2 years ago

marlenebrs commented 2 years ago

I ran a CMC Wald burden test on my phenotype "EP", adjusted for six covariates: sex, AGE, PC1, PC2, PC3 and BMI. For each gene, my output shows seven lines that repeat the gene, range and count columns but differ in Beta, SE and Pvalue. What does each line mean? Does each line correspond to a fit on all covariates?

Here is part of my output:

Gene       RANGE            N_INFORMATIVE  NumVar  NumPolyVar  NonRefSite  Beta         SE         Pvalue
FAM87B     1:817370-819834  69             41      41          41          -0.502387    0.568318   0.376702
FAM87B     1:817370-819834  69             41      41          41          1.61057      0.69236    0.0200079
FAM87B     1:817370-819834  69             41      41          41          -0.00472976  0.0244506  0.846614
FAM87B     1:817370-819834  69             41      41          41          118.706      187.67     0.527045
FAM87B     1:817370-819834  69             41      41          41          -189.578     446.778    0.671331
FAM87B     1:817370-819834  69             41      41          41          -96.8701     127.91     0.448851
FAM87B     1:817370-819834  69             41      41          41          -0.023334    0.0482548  0.628699
LINC00115  1:826205-827522  69             29      29          29          -0.44987     0.587963   0.444192
LINC00115  1:826205-827522  69             29      29          29          1.58764      0.690461   0.0214826
LINC00115  1:826205-827522  69             29      29          29          -0.00530215  0.0245051  0.8287
LINC00115  1:826205-827522  69             29      29          29          104.702      183.388    0.568046
LINC00115  1:826205-827522  69             29      29          29          -155.349     439.247    0.723585
LINC00115  1:826205-827522  69             29      29          29          -93.4759     126.892    0.461333
LINC00115  1:826205-827522  69             29      29          29          -0.0230046   0.0480832  0.632341

Here is my command line:

rvtest --inVcf input.vcf.gz --pheno pheno.ped --pheno-name EP --covar covars.ped --covar-name sex,AGE,PC1,PC2,PC3,BMI --out output --geneFile refFlat_hg38.txt.gz --burden cmcWald
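
To make the pattern concrete, here is a minimal Python sketch that counts the rows per gene and compares that count with the number of covariates plus one. It assumes the output is whitespace-delimited with the header shown above. The idea that seven rows might be one burden term plus one row per covariate is only a hypothesis for illustration, not something confirmed by rvtests; the names `COVARIATES`, `SAMPLE` and `rows_per_gene` are made up for this example.

```python
from collections import Counter

# Covariate names copied from the --covar-name option in the command above.
COVARIATES = "sex,AGE,PC1,PC2,PC3,BMI".split(",")

# The seven FAM87B rows copied from the output above (whitespace-delimited).
SAMPLE = """\
Gene RANGE N_INFORMATIVE NumVar NumPolyVar NonRefSite Beta SE Pvalue
FAM87B 1:817370-819834 69 41 41 41 -0.502387 0.568318 0.376702
FAM87B 1:817370-819834 69 41 41 41 1.61057 0.69236 0.0200079
FAM87B 1:817370-819834 69 41 41 41 -0.00472976 0.0244506 0.846614
FAM87B 1:817370-819834 69 41 41 41 118.706 187.67 0.527045
FAM87B 1:817370-819834 69 41 41 41 -189.578 446.778 0.671331
FAM87B 1:817370-819834 69 41 41 41 -96.8701 127.91 0.448851
FAM87B 1:817370-819834 69 41 41 41 -0.023334 0.0482548 0.628699
"""


def rows_per_gene(text):
    """Count how many data rows each gene has in whitespace-delimited output."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    gene_col = header.index("Gene")
    return Counter(row[gene_col] for row in rows)


counts = rows_per_gene(SAMPLE)

# Hypothesis only (an assumption, not documented behaviour):
# one row for the burden term plus one row per covariate.
expected = len(COVARIATES) + 1

for gene, n in counts.items():
    verdict = "matches" if n == expected else "does not match"
    print(f"{gene}: {n} rows, {verdict} the hypothesis of {expected}")
```

For the rows above, the count per gene equals the number of covariates plus one, which is consistent with that reading but does not say which row belongs to which term.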