rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

It is developed and supported by a team of scientists at the Regeneron Genetics Center.

The method has the following properties

Full documentation for regenie can be found at https://rgcgithub.github.io/regenie.

Citation

Mbatchou, J., Barnard, L., Backman, J. et al. Computationally efficient whole-genome regression for quantitative and binary traits. Nat Genet 53, 1097–1103 (2021). https://doi.org/10.1038/s41588-021-00870-7

License

regenie is distributed under an MIT license.

Contact

If you have any questions about regenie please contact

If you want to submit an issue concerning the software, please do so using the regenie GitHub repository.

Version history

Version 3.5 (Added CHR/POS columns to snplist output file when using --write-mask-snplist; Genotype counts are now reported in the sumstats file when using --no-split; Improved efficiency of LOOCV scheme in ridge level 0; Detect carriage return in fam/psam/bim/pvar/sample files; Minor bug fixes)

Version 3.4.1 (Reduction in memory usage for LD computation when writing to text files; Fix bug rejecting valid PVAR files)

Version 3.4 (Reduction in memory usage for LD computation with dosages; Minor bug fixes for LD computation; Bug fix for when carriage returns are in optional input files)

Version 3.3 (Faster implementation of approximate Firth LRT; New strategy for approximate Firth LRT with ultra-rare variants; Relaxed convergence criterion of Firth LRT from 1E-4 to 2.5E-4)

Version 3.2.9 (Switch to robust version of ACAT to handle very small p-values; Bug fix for Step1 when sex chromosome was included in the analysis; Allow for 64 domains when using the 4-column annotation file)

Version 3.2.8 (New option --bgi to specify a custom bgi index file accompanying the BGEN file; Relax matching criteria between BGEN and bgi index files to use CPRA instead of variant ID)

Version 3.2.7 (New option --force-mac-filter to apply different MAC filter to subset of SNPs; Extend maximum number of domains to 32 for 4-column anno-file; Update PGEN library)

Version 3.2.6 (Relax tolerance parameter for null unpenalized logistic regression from 1e-8 to 1e-6; Minor bug fixes)

Version 3.2.5.3 (Fix inflation issue when testing main effect of SNP in GxE model; Minor bug fixes)

Version 3.2.5 (Use pseudo-data representation algorithm as default in step 2 single variant tests; Use ACAT to get SBAT p-value across POS/NEG models; Bug fix for ACATV when set has a single variant with zero weight)

Version 3.2.4 (Relaxed the requirement on the minimum number of unique values for QTs to 3; Various bug fixes)

Version 3.2.3 (Address convergence issues in Firth regression; Various bug fixes)

Version 3.2.2 (New columns in sumstats file (N_CASES/N_CONTROLS) to output the number of cases/controls when using --af-cc; Various bug fixes)

Version 3.2.1 (New option --lovo-snplist to only consider a subset of LOVO masks; Improve efficiency of LOVO for large sets to reduce memory usage; Bug fix for SPA with numerical overflow; For SKAT/ACAT tests with Firth correction, don't include SKAT weights when running Firth on single variants)

Version 3.2 (Bug fix for SKAT/SKATO when testing on binary traits using Firth/SPA; Switched name of NNLS joint test to SBAT test altering name of corresponding options and applied Bonferroni correction before reporting its p-value [correcting for minP of 2 tests])

Version 3.1.4 (New option --par-region to specify build to determine bounds for chrX PAR regions; New option --force-qt to force QT runs for traits with fewer than 10 values [otherwise will throw an error]; Phenotype imputation for missing values is now applied after RINTing when using --apply-rint; Several bug fixes)

Version 3.1.2 (Reduction in memory usage for SKAT/SKATO tests; Bug fix for LOVO with SKAT/ACAT tests; Improvements for null Firth logistic algorithm to address reported convergence issues)

Version 3.1.1 (Reduction in memory usage for SKAT/SKATO tests; Improvements for logistic regression algorithms to address reported convergence issues)

Version 3.1 (Fixed bug in SKAT/SKATO tests when applying Firth/SPA correction; Improved SPA implementation by computing both tail probabilities; New option --set-singletons to specify variants to consider as singletons for burden masks; New option --l1-phenoList to run level 1 models in Step 1 in parallel across phenotypes; Several bug fixes)

Version 3.0.3 (Skip BTs where null model fit failed; Bug fix for BURDEN-ACAT; Bug fix when nan/inf values are in phenotype/covariate file)

Version 3.0.1 (Improve ridge logistic regression in Step 1; Add compilation with Cmake)

Version 3.0 (New gene-based tests: SKAT, SKATO, ACATV, ACATO and NNLS [Non-Negative Least Square test]; New GxE and GxG interaction testing functionality; New conditional analysis functionality; see release page for minor additions)

For past releases, see here.
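
The Version 3.2.9 entry above mentions a robust version of ACAT for very small p-values. As a rough illustration of why that matters, the sketch below implements the generic Cauchy combination test in Python. It is not regenie's C++ code: the function name `acat`, the thresholds `tiny` and `huge`, and the equal default weights are all assumptions made for this example. For p-values near zero, `tan((0.5 - p) * pi)` loses precision in floating point, so the sketch switches to the approximation `1 / (p * pi)`, and it uses the matching tail approximation when the combined statistic is very large.

```python
import math


def acat(pvalues, weights=None, tiny=1e-15, huge=1e15):
    """Combine p-values with the Cauchy combination test (illustrative sketch).

    `tiny` and `huge` are hypothetical thresholds chosen for this example;
    they are not taken from regenie.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")

    # Default to equal weights, then normalise so the weights sum to 1.
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and p-values must have the same length")
    total = sum(weights)
    weights = [w / total for w in weights]

    # Weighted sum of Cauchy-transformed p-values.
    stat = 0.0
    for p, w in zip(pvalues, weights):
        if p < tiny:
            # tan((0.5 - p) * pi) is approximately 1 / (p * pi) for small p.
            stat += w / (p * math.pi)
        else:
            stat += w * math.tan((0.5 - p) * math.pi)

    # Upper tail of the standard Cauchy distribution, with a large-statistic
    # approximation to avoid cancellation in 0.5 - atan(stat) / pi.
    if stat > huge:
        return (1.0 / stat) / math.pi
    return 0.5 - math.atan(stat) / math.pi


if __name__ == "__main__":
    # One extremely small p-value combined with two unremarkable ones.
    print(acat([1e-300, 0.4, 0.7]))
```

Without the two approximations, a p-value such as 1e-300 would be rounded away inside `0.5 - p` and the combined result would no longer reflect it.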