Closed Ojami closed 2 years ago
Hi Oveis,
Yes, Regenie uses AAF to determine which variants will go into the set-based tests. This should not be an issue when using reference genome as the reference allele is usually major (all the more if narrowing down to rarer variation as done in the paper you referenced).
For the --vc-maxAAF option, indeed the default upper bound used is 100% (=1), meaning all variants go into the test regardless of AAF. As with the --aaf-bins option, you should specify the absolute AAF (e.g. --vc-maxAAF 0.01 to have only variants with AAF below 1% in the test).
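To illustrate the point about absolute thresholds, here is a minimal Python sketch of filtering on AAF given as a fraction. The function name, the variant labels and their frequencies are hypothetical, and the strict `<` comparison is an assumption; this is not REGENIE's actual code.

```python
def select_by_max_aaf(aaf_by_variant, max_aaf=1.0):
    """Return IDs of variants whose alternative allele frequency is below max_aaf.

    max_aaf is an absolute fraction (0.01 means 1%).
    The strict '<' is an assumption; the real boundary handling may differ.
    """
    return [vid for vid, aaf in aaf_by_variant.items() if aaf < max_aaf]


# Hypothetical variants: both are rare in the MAF sense (MAF = 0.002),
# but for the second one the ALT allele is the major allele.
aafs = {"alt_is_minor": 0.002, "alt_is_major": 0.998}

print(select_by_max_aaf(aafs, max_aaf=0.01))  # only the ALT-minor variant passes
print(select_by_max_aaf(aafs))                # a bound of 1 keeps both
```

With a bound of 0.01, the variant whose ALT allele is the major one is dropped even though its minor allele is just as rare.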
Cheers, Joelle
Hi Joelle,
Thanks for the clarification. In the example above, those two variants are indeed from PLINK BED files (PLINK --freq counts), and both are rare. As seen, the ALT is not the minor allele for the first variant; however, I admit that this is mostly not the case, and for the majority of variants ALT == minor (at least in the case of UKB WES data, VCF -> BED from DNAnexus).
Nonetheless, with MA instead of ALT, one wouldn't need the --singleton-carrier flag, and it would be more intuitive when using MAC-specific options (e.g. --vc-MACthr).
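A small Python sketch of why ALT allele count and minor allele count can disagree. The genotype coding (0/1/2 copies of ALT, None for missing) and the function name are assumptions for illustration, not how REGENIE computes its counts.

```python
def allele_counts(genotypes):
    """Return (alt_count, minor_count) from ALT-allele genotypes coded 0/1/2.

    Missing genotypes are represented as None and skipped.
    """
    called = [g for g in genotypes if g is not None]
    alt_count = sum(called)
    total_alleles = 2 * len(called)
    minor_count = min(alt_count, total_alleles - alt_count)
    return alt_count, minor_count


# Hypothetical variant where ALT is the major allele:
# 9 ALT homozygotes and 1 heterozygote.
genos = [2] * 9 + [1]
alt_count, minor_count = allele_counts(genos)
print(alt_count, minor_count)
```

Here the minor allele is seen only once, so the variant is a singleton by MAC, yet a threshold applied to the ALT count would see 19 copies.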
I'll close this issue, since it doesn't greatly affect the output summary stats anyway.
Best/Oveis
Hello @joellembatchou ,
We are setting up a rare variant burden test with Regenie for UKBB WES data, and came across the same issue described here (that for some variants A1 is not the minor allele even when using the reference genome), although, as stated above, it is quite rare. Still, I wanted to check: has there been any change regarding this in Regenie since last year? If I understand correctly, even if we manually provide an AAF file where we replace the AAF with (1-AAF) for those cases so that they go into the aaf-bins, the beta and A1FREQ would still refer to the major allele, so basically we should not do that, right? So the only solution, if we really want all rare variants (e.g. with AAF>99.99% or AAF<0.01%), would be to recode the alleles? Thank you!
Best, Burulca
Hi Burulca,
No change has been made in the REGENIE software. As you stated, the AAF file only specifies which variants go into a mask and does not flip the alleles when computing the mask, so yes, recoding the alleles would be an alternative solution.
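As a rough Python sketch of what recoding the alleles means for a single variant (in practice this would be done on the genotype files with a genotype-handling tool before running REGENIE): the function name and the 0/1/2 coding with None for missing calls are assumptions for illustration.

```python
def recode_to_minor(ref, alt, genotypes):
    """Swap REF/ALT and flip 0/1/2 genotypes when ALT is the major allele.

    Returns (ref, alt, genotypes, flipped). Genotypes count copies of the
    returned ALT allele; None marks a missing call and is left untouched.
    """
    called = [g for g in genotypes if g is not None]
    aaf = sum(called) / (2 * len(called)) if called else 0.0
    if aaf <= 0.5:
        return ref, alt, list(genotypes), False
    flipped = [None if g is None else 2 - g for g in genotypes]
    return alt, ref, flipped, True


# Hypothetical variant with AAF = 0.9, so ALT is the major allele.
print(recode_to_minor("A", "G", [2, 2, 2, 1, 2]))
```

After the swap the counted allele is the minor one (AAF 0.1 in this toy case), so the variant falls into the rare bins and any effect estimate refers to the minor allele; its sign is reversed relative to the original coding.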
Cheers, Joelle
Hi Joelle,
I was wondering why REGENIE relies on AAF instead of MAF in region-/gene-tests? MAF seems more intuitive compared to AAF (similar to what SAIGE-GENE does)? This can be problematic when AA is not the MA (or am I wrong?)
As an example:
21:10413783:A:G and 21:10413787:C:T cannot be used in the same mask (by setting --aaf-bins), if one's interested in testing only rare variants (MAF < 1%).
Here, I can see that the authors mentioned:
To me, it seems they mixed up AAF with MAF. The only (?) workaround is --aaf-file, where the user flips REF/ALT of the variants with an AAF > 0.5, and uses MAF (1 - AAF) for those variants. This essentially means the user should go against this part:
On a different note, the description of the --vc-maxAAF optional argument states:
Does this mean this cutoff should be in % and not absolute AAF (e.g. if one wants rare variants below an AAF of 0.01, s/he should set this option to 1)?
Thanks/Oveis