hanchenphd / GMMAT


GMMAT (Generalized linear Mixed Model Association Tests)

Description

GMMAT is an R package for performing association tests using generalized linear mixed models (GLMMs, see Breslow and Clayton (1993)) in genome-wide association studies (GWAS) and sequencing association studies. First, GMMAT fits a GLMM with covariate adjustment and random effects to account for population structure and familial or cryptic relatedness. For GWAS, GMMAT performs score tests for each genetic variant as proposed in Chen et al. (2016). For candidate gene studies, GMMAT can also perform Wald tests to get the effect size estimate for each genetic variant. For rare variant analysis from sequencing association studies, GMMAT performs the variant Set Mixed Model Association Tests (SMMAT) as proposed in Chen et al. (2019), based on user-defined variant sets. SMMAT includes the burden test, the sequence kernel association test (SKAT), SKAT-O, and an efficient hybrid test of the burden test and SKAT. See the package vignette for details.
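The two-step workflow described above (fit the null GLMM once, then run score tests per variant) can be sketched in R as follows. This is a minimal sketch, not a definitive recipe: it assumes GMMAT is installed and relies on the example dataset and PLINK files bundled with the package (the `example` data object, `geno.bed`, and the column names `disease`, `age`, `sex`, `id` are assumptions taken from the package's example data; check `?glmmkin` and `?glmm.score` for your installed version). The wrapper function `run_gmmat_example` is a hypothetical helper introduced only for illustration.

```r
# Sketch only: skips quietly when GMMAT is not installed.
# Data object and file names are assumed to match the package's bundled examples.
run_gmmat_example <- function() {
  if (!requireNamespace("GMMAT", quietly = TRUE)) {
    message("GMMAT is not installed; skipping the example.")
    return(invisible(NULL))
  }

  # Load the bundled example list (phenotypes and a genetic relationship matrix).
  data("example", package = "GMMAT", envir = environment())
  pheno <- example$pheno
  GRM <- example$GRM

  # Step 1: fit the null GLMM with covariates and a random effect for relatedness.
  null_model <- GMMAT::glmmkin(
    disease ~ age + sex,
    data = pheno,
    kins = GRM,
    id = "id",
    family = binomial(link = "logit")
  )

  # Step 2: score test for each variant in the bundled PLINK binary files.
  bed_file <- system.file("extdata", "geno.bed", package = "GMMAT")
  plink_prefix <- sub("\\.bed$", "", bed_file)  # glmm.score takes the file prefix
  out_file <- tempfile(fileext = ".txt")
  GMMAT::glmm.score(null_model, infile = plink_prefix, outfile = out_file)

  # Results are written to a tab-delimited text file; read them back.
  read.delim(out_file)
}

results <- run_gmmat_example()
if (!is.null(results)) print(head(results))
```

Because the null model does not depend on the variant being tested, it is fitted once and reused for every variant, which is what makes the score test practical at genome-wide scale.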

Installing

GMMAT links to R packages Rcpp and RcppArmadillo, and also imports R packages Rcpp, CompQuadForm, foreach, parallel, Matrix, methods, and data.table. GMMAT requires Bioconductor packages SeqArray and SeqVarTools to work with genotype files in the GDS format. In addition, GMMAT requires testthat to run code checks during development, and doMC to run parallel computing in glmm.score and SMMAT functions for genotype files in the GDS format. doMC is not available on Windows, where these functions fall back to a single thread. These dependencies should be installed before installing GMMAT. See Section 3.2 of the vignette.
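One way to check which of the dependencies listed above are still missing is sketched below. The helper `missing_pkgs` is a hypothetical name introduced for illustration. The install commands are left commented out so that nothing is installed by accident, and `BiocManager` is assumed as the route for the Bioconductor packages. The packages parallel and methods ship with R and need no installation.

```r
# Dependencies named in this section; parallel and methods ship with R.
cran_deps <- c("Rcpp", "RcppArmadillo", "CompQuadForm", "foreach",
               "Matrix", "data.table", "testthat")
bioc_deps <- c("SeqArray", "SeqVarTools")

# doMC is not available on Windows, so only look for it elsewhere.
if (.Platform$OS.type != "windows") {
  cran_deps <- c(cran_deps, "doMC")
}

# Hypothetical helper: return the subset of pkgs that cannot be loaded.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

cat("Missing CRAN packages:", missing_pkgs(cran_deps), "\n")
cat("Missing Bioconductor packages:", missing_pkgs(bioc_deps), "\n")

# Uncomment to install whatever is missing:
# install.packages(missing_pkgs(cran_deps))
# if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
# BiocManager::install(missing_pkgs(bioc_deps))
```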

For optimal computational performance, it is recommended to use an R version configured with the Intel Math Kernel Library (or other fast BLAS/LAPACK libraries). See the instructions on building R with Intel MKL.
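To see which BLAS/LAPACK libraries the current R session is actually using, a quick check along these lines may help. It uses only base R (`extSoftVersion()` and `La_library()`, both available in R 3.4.0 and later). The matrix size in the timing step is an arbitrary choice for illustration, not a recommended benchmark.

```r
# Report the linear algebra libraries this R session is linked against.
blas_info <- extSoftVersion()["BLAS"]  # path of the BLAS library, if known
lapack_info <- La_library()            # path of the LAPACK library
cat("BLAS:  ", blas_info, "\n")
cat("LAPACK:", lapack_info, "\n")

# Rough timing of a dense matrix cross-product; an optimized BLAS
# (for example Intel MKL) is typically much faster than the reference BLAS.
set.seed(1)
x <- matrix(rnorm(500 * 500), nrow = 500)
elapsed <- system.time(xtx <- crossprod(x))["elapsed"]
cat("crossprod on a 500 x 500 matrix took", elapsed, "seconds\n")
```

`sessionInfo()` prints the same library paths alongside the R version, which is convenient to include in a bug report.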

Version

The current version is 1.4.2 (November 17, 2023).

License

This software is licensed under GPL (>= 3).

Contact

Please refer to the R help document of GMMAT for specific questions about each function. For comments, suggestions, bug reports and questions, please contact Han Chen (Han.Chen.2 AT uth.tmc.edu). For bug reports, please include an example to reproduce the problem without having to access your confidential data.

Acknowledgments

Duy T. Pham implemented support for BGEN genotype files and SeqVarGDSClass objects. We would like to thank Dr. Chaolong Wang and Dr. Brian Cade for comments and suggestions on GMMAT and the user manual. We would also like to thank Dr. Matthew Conomos for help with the Average Information REML algorithm, Dr. Stephanie Gogarten for help with the GDS genotype format, Jennifer Brody for help with parallel computing and App development in Analysis Commons, a cloud computing platform, Arthur Gilly for supporting reordered group definition files in SMMAT.meta, and Dr. Rounak Dey for supporting imputed dosage GDS files. The GMMAT implementation is supported by NIH grant R00 HL130593, and the analysis pipeline implementation (the gmmat App) in Analysis Commons is supported by NIH grant U01 HL120393.

References

If you use the single-variant test in GMMAT, please cite

  • Chen H, Wang C, Conomos MP, Stilp AM, Li Z, Sofer T, Szpiro AA, Chen W, Brehm JM, Celedón JC, Redline S, Papanicolaou GJ, Thornton TA, Laurie CC, Rice K, Lin X. (2016) Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models. The American Journal of Human Genetics 98(4): 653-666. PMID: 27018471. PMCID: PMC4833218. DOI: 10.1016/j.ajhg.2016.02.012.

If you use the variant set tests (SMMAT), please cite

  • Chen H, Huffman JE, Brody JA, Wang C, Lee S, Li Z, Gogarten SM, Sofer T, Bielak LF, Bis JC, Blangero J, Bowler RP, Cade BE, Cho MH, Correa A, Curran JE, de Vries PS, Glahn DC, Guo X, Johnson AD, Kardia S, Kooperberg C, Lewis JP, Liu X, Mathias RA, Mitchell BD, O'Connell JR, Peyser PA, Post WS, Reiner AP, Rich SS, Rotter JI, Silverman EK, Smith JA, Vasan RS, Wilson JG, Yanek LR, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Hematology and Hemostasis Working Group, Redline S, Smith NL, Boerwinkle E, Borecki IB, Cupples LA, Laurie CC, Morrison AC, Rice KM, Lin X. (2019) Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies. The American Journal of Human Genetics 104(2): 260-274. PMID: 30639324. PMCID: PMC6372261. DOI: 10.1016/j.ajhg.2018.12.012.