ajaynadig / bhr

Suite of heritability and genetic correlation estimation tools for exome-sequencing data
MIT License
31 stars 6 forks source link

BHR for Singletons #3

Closed CharleyXia closed 1 year ago

CharleyXia commented 1 year ago

Hi,

I want to estimate burden heritability for singletons only. Can you give me more instructions on how to do so? In short, the input file of BHR needs to be a variant-level summary with a beta estimate per variant, which I assume is from GWAS? But for singletons, as there is only one observatoin in the popuatlion, I can't run an association test on that, thus don't have any betas. How can I prepare a variant-level summary for my singletons? Thanks.

ajaynadig commented 1 year ago

Hi Charley, Thanks for using BHR. Indeed, BHR requires variant-level betas, even for singletons. While the betas for singletons may be extremely noisy, they can still be computed. See below for some guidelines.

Case/Control Exome-Wide Association Study: Using equation 33 from the methods section of the BHR paper to calculate the per-sd effect sizes, then convert them to the per-allele effect sizes. Here is the relevant snippet from the tutorial:

#Compute sample prevalence, will be used to compute per-sd beta
  prevalence = n_cases/(n_cases+n_controls) 

  #Compute variant MAF in cases, will be used to compute per-sd beta
  table$AF_case = table$ac_case/(2*n_cases) 

  #Compute variant MAFoverall, will be used to compute per-sd beta
  table$AF = (table$ac_case + table$ac_ctrl)/(2*(n_cases + n_controls)) 

  #calculate per-sd betas
  table$beta = (2 * (table$AF_case - table$AF) * prevalence)/sqrt(2 * table$AF * (1 - table$AF) * prevalence * (1 - prevalence))  

  #calculate variant variances
  table$twopq = 2*table$AF * (1 - table$AF) 

  #convert betas from per-sd (i.e. sqrt(variance explained)) to per-allele (i.e. in units of phenotype) betas.
  #per-allele are the usual betas reported by an exome wide association study.
  table$beta_perallele = table$beta/sqrt(table$twopq)

Quantitative trait exome-wide association study: One could use a program such as SAIGE to calculate effect sizes for singletons; in our primary manuscript, we use betas from SAIGE. SAIGE controls for population stratification and boosts power using a mixed model, by conditioning on the genetic-relatedness matrix.

More simply, one could compute the OLS betas as follows:

\beta_perallele = (x^T %*% y ) / (x^T %*% x)

where y is the standardized quantitative phenotype, and x are the (non-standardized) genotypes. If you have additional covariates, X is a matrix, where the first column are the genotypes, and the remaining columns are covariates.

CharleyXia commented 1 year ago

I see! Thanks! That's very helpful.

ajaynadig commented 1 year ago

No problem! Closing this issue for now, please feel free to comment again if you have any other questions.