weizhouUMICH / SAIGE

Coding sequencing batch as a covariate #388

Closed (tylim closed this issue 2 years ago)

tylim commented 2 years ago

Hi Dr. Zhou,

Thank you for developing SAIGE.

I'm new to running SAIGE-gene and would like to add sequencing batch as a covariate. Can the batches be coded as a single categorical variable (either as labels such as "batch1" or as numeric factor levels), or do they need to be in the form of a design matrix, with each batch in its own column coded 0 or 1 in the covariate file, as in the example below?

FID  IID  sex  pheno  batch1  batch2  batch3  PC1           PC2
1a1  1a1  2    1      1       0       0       -0.03750248   -0.01803618
1a2  1a2  2    1      0       1       0       -0.001987663  0.006452682
1a3  1a3  1    0      0       0       1       0.0274171     0.005185112
1a4  1a4  2    0      1       0       0       -0.02953894   -0.008644975

Cheers, Tze Yin

weizhouUMICH commented 2 years ago

Hi Tze,

The batches need to be in design-matrix form, as in your example.

Thanks, Wei
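
For anyone starting from a single categorical batch column, here is a minimal sketch of expanding it into the 0/1 layout shown in the question. It is plain Python and not part of SAIGE; the helper name `one_hot_batches`, the input column name `batch`, and the sample rows are assumptions made for illustration.

```python
def one_hot_batches(rows, column="batch"):
    """Replace a categorical column with one 0/1 column per observed level.

    `rows` is a list of dicts (one per sample). Hypothetical helper,
    not a SAIGE function.
    """
    # Sort the levels so the output column order is reproducible.
    levels = sorted({row[column] for row in rows})
    expanded = []
    for row in rows:
        # Copy every field except the categorical one.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per batch level.
        for level in levels:
            new_row[level] = 1 if row[column] == level else 0
        expanded.append(new_row)
    return expanded


# Illustrative input: the samples from the question, with batch as a label.
rows = [
    {"FID": "1a1", "IID": "1a1", "sex": 2, "pheno": 1, "batch": "batch1"},
    {"FID": "1a2", "IID": "1a2", "sex": 2, "pheno": 1, "batch": "batch2"},
    {"FID": "1a3", "IID": "1a3", "sex": 1, "pheno": 0, "batch": "batch3"},
    {"FID": "1a4", "IID": "1a4", "sex": 2, "pheno": 0, "batch": "batch1"},
]

design = one_hot_batches(rows)

# Print as a tab-separated table with a header row.
header = list(design[0].keys())
print("\t".join(header))
for row in design:
    print("\t".join(str(row[name]) for name in header))
```

One general caveat: a full set of indicator columns sums to 1 for every sample, so it is collinear with an intercept term. In regression models that fit an intercept, it is common to leave out one batch as the reference level. Whether that is required here is not stated in this thread.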