rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Number of ignored tests due to low MAC == 0 #531

Closed akhilpampana closed 2 months ago

akhilpampana commented 3 months ago

Hello,

I am trying to run burden testing based on four cohorts for non-coding variants, such as downstream variants using regenie. I have a few questions:

I am getting "Number of ignored tests due to low MAC: 0," although there is a warning: "WARNING: 1/3 masks fail MAC filter and will be skipped...done (24ms)" for a set of genes. Does this warning mean the test is ignored, or does it have nothing to do with the above statement?

I am losing a lot of genes because of the minimum MAC of 10. Is there a way to minimize that loss?

What would be the recommended minimum MAC cutoff to address the above issue? Since I want to perform a meta-analysis between the four cohorts and am losing many genes in one cohort, I am unsure how reliable the results are.

Is there a way to export p-values rather than log p-values?

Which method is the most reliable across SKAT, ACAT, and additive models for meta-analysis?

Please let me know regarding these questions so I can proceed with the analysis. I am confident in the analysis right now, but I want to be more certain if I can get the warning ignored.

Regards, Akhil

akhilpampana commented 3 months ago

I am getting different results when running three cohorts with FAVOR annotations and one cohort with Nirvana annotations. Is there a way to fix it?

joellembatchou commented 3 months ago

Hi Akhil,

You can use --minMAC to relax the minimum MAC filter (the default should be 5 not 10). You can set it to 1 so that only monomorphic masks will be discarded.

For your p-value question, we report -log10(P), which is more numerically stable than the raw p-value when p-values are very small. It is straightforward to convert between the two.
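As an illustration of the conversion mentioned here, a minimal Python sketch follows. The function names are hypothetical and not part of regenie; the only assumption is that the reported value is -log10 of the p-value, as stated above.

```python
import math


def log10p_to_p(log10p: float) -> float:
    """Convert a -log10(p) value back to a raw p-value.

    Very large inputs (roughly above 323) underflow to 0.0 in double
    precision, which is why the -log10 scale is the safer one to store.
    """
    return 10.0 ** (-log10p)


def p_to_log10p(p: float) -> float:
    """Convert a raw p-value (0 < p <= 1) to the -log10 scale."""
    return -math.log10(p)


if __name__ == "__main__":
    print(log10p_to_p(2.0))    # close to 0.01
    print(log10p_to_p(400.0))  # underflows to 0.0
```

For downstream work such as meta-analysis it is usually safer to stay on the -log10 scale and convert only for display.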

Can you be clear on what you mean by "reliable" when mentioning SKAT/ACAT/BURDEN?

Cheers, Joelle

akhilpampana commented 3 months ago

Hello Joelle,

Thank you so much for the response regarding my queries.

Regarding your question: I am planning a meta-analysis across 4 cohorts. Which of the SKAT/ACAT/burden results would be suitable for meta-analysis, and which would give reliable estimates? Also, how are effect estimates generated in regenie?

Regards, Akhil


joellembatchou commented 3 months ago

Burden gives you effect size & SE whereas SKAT/ACAT tests give you only p-values so this will affect the meta-analysis methods you can use on each. As for "reliability", I assume you are referring to calibration of the tests; all methods are well-calibrated (see their respective reference papers which are mentioned in the documentation).
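To make the distinction concrete, here is a hedged Python sketch of two standard meta-analysis approaches that fit these two kinds of output. Neither is regenie functionality, and the function names are hypothetical. The first is a fixed-effect inverse-variance weighted combination of per-cohort burden effect sizes and standard errors. The second is Fisher's method applied to per-cohort -log10(P) values, which is one common option when only p-values are available. Both assume the cohorts are independent.

```python
import math


def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance weighted meta-analysis.

    Returns (pooled beta, pooled SE, two-sided normal p-value).
    """
    weights = [1.0 / (se * se) for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p under N(0, 1)
    return beta, se, p


def fisher_from_log10p(log10ps):
    """Fisher's method on -log10(p) inputs; returns the combined p-value.

    The statistic X = -2 * sum(ln p) is chi-square with 2k degrees of
    freedom, whose survival function has a closed form for even df.
    """
    k = len(log10ps)
    half_x = sum(log10ps) * math.log(10.0)  # equals X / 2
    term, series = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        series += term
    return math.exp(-half_x) * series


if __name__ == "__main__":
    print(ivw_meta([0.12, 0.08, 0.15, 0.10], [0.05, 0.06, 0.07, 0.04]))
    print(fisher_from_log10p([1.3, 2.1, 0.4, 1.8]))
```

The numbers in the usage lines are made up purely for illustration. Note that Fisher's method ignores effect direction and cohort size, so it is not a substitute for an effect-size-based meta-analysis when betas and SEs are available.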

As for how REGENIE obtains effect sizes for burden masks: masks are treated the same as single variants, so we use linear regression for quantitative traits (QTs) and logistic regression for binary traits (BTs), including the Firth penalty if using --firth. See the documentation for details.

Cheers, Joelle