Closed saphir746 closed 3 years ago
Hi,
The error indicates that the variant rsZXY has no minor alleles (or very low minor allele count) in the sample analyzed. You can filter out such variants by specifying a minimum minor allele count filter in the PLINK2 command (e.g. --mac 100
). This should avoid the low variance error in step 1.
Also note that we recommend to run step 1 of Regenie on common variants after applying appropriate QC filters as the aim of this step is to capture genome-wide polygenic effects (similar to what is done in LMM methods). Then, using these predictions, you can run step 2 (association testing) on another set of variants (e.g. rare variants).
If you are analyzing binary traits, it has been noted that standard association tests with rare variants can be inflated. To avoid this, we have implemented SPA & Firth tests. They can be specified by options --spa/--firth
(see documentation for more details). Note that we also have an approximate Firth test which highly reduces the computational time of the standard Firth test and leads to very similar association results (use --firth --approx
).
Cheers, Joelle
Hello Joelle,
thanks a lot for the detailed answer. Running step 1 on common variants + normal QC then step 2 on rare variants makes sense. How would you do that in practice though? Would you produce 2 sets of bed / bgen files for each step, with different QC filters?
plink2 --bgen $bgen 'ref-unknown'\
--sample $sam \
--maf 0.01 --geno 0.1 --hwe 1e-15 \
--mind 0.1 \
--make-bed \
--out ukb_cal_step1
plink2 --bgen $bgen 'ref-unknown'\
--sample $sam \
--max-maf 0.01 --geno 0.1 --hwe 1e-15 \
--mind 0.1 \
--mac 100 \
--make-bed \
--out ukb_cal_step2
regenie --step 1 \
--bed ukb_cal_step1 \
...
--out ukb_step1
regenie --step 2 \
--bed ukb_cal_step2 \
...
--pred ukb_step1_pred.list \
--out ukb_step2
Hi,
Regenie can handle files in BGEN or PLINK bed format for step 1/2. Also, there is a --extract/--exclude
option that works for step 1 and 2 so when applying the appropriate filters for step 1 and 2 in PLINK2, use option --write-snplist
so you can get the list of variants that pass the filters and avoid making a new genotype file.
So based on the example you included above,
plink2 --bgen $bgen 'ref-unknown'\
--sample $sam \
--maf 0.01 --geno 0.1 --hwe 1e-15 \
--mind 0.1 \
--write-snplist \
--out ukb_cal_step1
plink2 --bgen $bgen 'ref-unknown'\
--sample $sam \
--max-maf 0.01 --geno 0.1 --hwe 1e-15 \
--mind 0.1 \
--mac 100 \
--write-snplist \
--out ukb_cal_step2
regenie --step 1 \
--bgen $bgen \
--extract ukb_cal_step1.snplist \
...
--out ukb_step1
regenie --step 2 \
--bgen $bgen \
--extract ukb_cal_step2.snplist \
...
--pred ukb_step1_pred.list \
--out ukb_step2
Note that for the PLINK2 command for step 1, I'd recommend also adding a --mac
filter to ensure there are no variants with too low minor allele counts.
Cheers, Joelle
hello Joelle,
Thank you for the suggestions. With the following syntax, it worked:
regenie --step 1 \
--bed ukb_cal_ChrXXX \
--extract ukb_cal_step1.snplist \
.....
--out ukb_step1_cont
regenie --step 2 \
--bgen $bgen \
--sample $sam \
--extract ukb_cal_step2.snplist \
--pred ukb_step1_cont_pred.list \
....
--out ukb_step2
(Actually there was a problem with the processing of the bgen file, but thats a separate issue)
Hello,
can you advice on the best way to perform rare variants (MAF < 1%) analysis in Regenie? Can this be supported via SPA / Frith correction? So far I have tried the following:
But I get this:
I am running Regenie v. 1.0.6.7 inside a conda env