rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Questions on the GRM or kinship #94

Closed Jia21 closed 3 years ago

Jia21 commented 3 years ago

Dear REGENIE team,

I am wondering whether REGENIE accounts for a GRM or kinship matrix. From my reading of the REGENIE paper, the model formula does include the genetic relatedness matrix. However, REGENIE does not require a pre-computed GRM or kinship matrix as input, so I am wondering whether REGENIE handles relatedness internally, within the algorithm.

Thanks a lot! Elaine

joellembatchou commented 3 years ago

Hi Elaine,

We have addressed this question in the FAQ tab of the Regenie website.

Cheers, Joelle

Jia21 commented 3 years ago

Thank you, that's very helpful!

brettva commented 2 years ago

@joellembatchou

Apologies for being dense, but does this mean that, similar to SAIGE, you don't have to exclude related samples prior to analysis? I am trying to understand if that is what is meant by this sentence in the FAQ:

"we bypass having to use the GRM $K$ and use the polygenic effect estimates $X\hat{\beta}$ to control for population structure when testing variants for association."

Thank you so much, the tool has been very user friendly and the documentation is great!

joellembatchou commented 2 years ago

Hi,

REGENIE accounts for population structure and relatedness in Step 1 through the estimation of a genome-wide polygenic effect component from the whole genome regression model. The paper includes simulation results with various amounts of relatedness present (see Extended Data Figs. 5 & 6).

Cheers, Joelle
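For readers following the FAQ sentence quoted above, the link between a GRM-based mixed model and a regression on all SNPs can be written out with the standard random-effects identity. This is a sketch under assumptions, not the derivation from the paper: $X$ is taken to be an $N \times M$ standardized genotype matrix, covariates are omitted, and the symbols $\sigma_g^2$, $\sigma_e^2$ and $\lambda$ are introduced here for illustration.

```latex
\begin{aligned}
y &= X\beta + \epsilon, \qquad
\beta \sim N\!\left(0, \tfrac{\sigma_g^2}{M} I_M\right), \qquad
\epsilon \sim N\!\left(0, \sigma_e^2 I_N\right) \\
\operatorname{Var}(y) &= \sigma_g^2 K + \sigma_e^2 I_N, \qquad K = \tfrac{1}{M} X X^{\top} \\
\hat{\beta} &= \left(X^{\top} X + \lambda I_M\right)^{-1} X^{\top} y, \qquad
\lambda = \frac{M \sigma_e^2}{\sigma_g^2} \\
X\hat{\beta} &= \sigma_g^2 K \left(\sigma_g^2 K + \sigma_e^2 I_N\right)^{-1} y
\end{aligned}
```

The last line is the usual best linear unbiased prediction of the polygenic effect, so a ridge-type fit of the SNP effects yields the same prediction as the model written in terms of $K$, without ever forming $K$. REGENIE's Step 1 approximates a whole genome regression of this kind rather than solving it exactly, so the identity should be read as motivation only.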

Cui-yd commented 1 year ago

Hi Joelle,

Thank you for providing such a valuable tool and comprehensive documentation. As a newcomer to genomics, I have a couple of questions about the GRM in REGENIE:

  1. I've recently been reading the GCTA paper (DOI: https://doi.org/10.1016/j.ajhg.2010.11.011). Are there similarities between GCTA and REGENIE in how they model the phenotypic variance? Equation 2 in the introduction of that paper has a form similar to the one in the General section of the FAQ.

  2. You mentioned that "REGENIE accounts for relatedness in Step 1 from the whole genome regression model". As I understand it, REGENIE estimates parameters block by block. Could this approach affect the estimation of a genome-wide polygenic effect?

Thank you in advance for your time and expertise. I appreciate your assistance.

Amber

joellembatchou commented 12 months ago

Hi Amber,

  1. Yes, GCTA uses a linear mixed model approach (integrating out the SNP effects) whereas REGENIE uses whole genome regression (directly modeling the SNP effects).
  2. In REGENIE step 1, the level 1 model includes all the level 0 block predictors.

Cheers, Joelle
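The two points in the reply above can be illustrated with a minimal NumPy sketch of a two-level ridge scheme: level 0 fits a ridge regression within each block of SNPs, and level 1 fits a single ridge regression on all level 0 predictors together, which is how block-wise fitting still produces one genome-wide polygenic estimate. This is not REGENIE's code. The function names, block size, shrinkage values and simulated data are hypothetical, and the sketch leaves out the cross-validation and leave-one-chromosome-out steps described in the paper.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients: (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def level0_predictors(G, y, block_size, lambdas):
    """Level 0: one ridge prediction per (SNP block, shrinkage value) pair."""
    preds = []
    for start in range(0, G.shape[1], block_size):
        Gb = G[:, start:start + block_size]  # SNPs in this block only
        for lam in lambdas:
            preds.append(Gb @ ridge_fit(Gb, y, lam))
    return np.column_stack(preds)

def level1_polygenic(W, y, lam):
    """Level 1: a single ridge over ALL level 0 predictors jointly."""
    return W @ ridge_fit(W, y, lam)

# Simulated data (hypothetical sizes): 200 samples, 60 standardized SNPs.
rng = np.random.default_rng(0)
n, m = 200, 60
G = rng.standard_normal((n, m))
G = (G - G.mean(axis=0)) / G.std(axis=0)
beta = rng.normal(0.0, np.sqrt(0.5 / m), size=m)  # small effect at every SNP
y = G @ beta + rng.normal(0.0, np.sqrt(0.5), size=n)
y = y - y.mean()

# 3 blocks of 20 SNPs x 3 shrinkage values -> 9 level 0 predictors.
W = level0_predictors(G, y, block_size=20, lambdas=[1.0, 10.0, 100.0])
polygenic = level1_polygenic(W, y, lam=1.0)

# Association testing would then use the phenotype with this term removed.
residual = y - polygenic
```

Because level 1 sees the predictors from every block at once, the final estimate combines information across the whole genome even though no single level 0 fit does. In-sample fitting as shown here would overfit on real data, which is the reason out-of-sample predictions are needed in practice.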