cuelee / pleio


ldsc_preprocess inputs #12

Open rbutleriii opened 2 years ago

rbutleriii commented 2 years ago

Hello, I used SAIGE to generate my individual-level sumstats. It returns the BETA and SE for each phenotype, and a Tstat rather than a Z. The full set of output columns:

CHR: chromosome
POS: genome position 
SNPID: variant ID
Allele1: allele 1
Allele2: allele 2
AC_Allele2: allele count of allele 2
AF_Allele2: allele frequency of allele 2
imputationInfo: imputation info. If not in dosage/genotype input file, will output 1
N: sample size
BETA: effect size of allele 2
SE: standard error of BETA
Tstat: score statistic of allele 2
p.value: p value (with SPA applied for binary traits)
p.value.NA: p value when SPA is not applied (only for binary traits)
Is.SPA.converge: whether SPA is converged or not (only for binary traits)
varT: estimated variance of score statistic with sample relatedness incorporated
varTstar: variance of score statistic without sample relatedness incorporated
AF.Cases: allele frequency of allele 2 in cases (only for binary traits and if --IsOutputAFinCaseCtrl=TRUE)
AF.Controls: allele frequency of allele 2 in controls (only for binary traits and if --IsOutputAFinCaseCtrl=TRUE)
Tstat_cond, p.value_cond, varT_cond, BETA_cond, SE_cond: summary stats for conditional analysis

Is there a way to input these directly into ldsc_preprocess.py rather than trying to compute a Z (probably with munge_sumstats.py)? Can I calculate the first table directly from these and use the genetic covariance and environmental correlation matrices from ldsc_preprocess.py?

cuelee commented 2 years ago

Thank you for your inquiry, and sorry for the late reply.

Unfortunately, the current code does not support SAIGE output. By default, the PLEIO code takes LDSC-format input.
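
For anyone in the same situation, here is a rough sketch of deriving a Z column from the SAIGE fields listed above. It is not part of PLEIO or LDSC. It assumes that Z = BETA / SE is an acceptable approximation for your analysis, that the SAIGE file is whitespace-delimited, and that Allele2 (the allele BETA refers to) should go in the A1 column. The function name `saige_to_ldsc_rows` and the two example rows are made up for illustration. Whether ldsc_preprocess.py accepts the result without a pass through munge_sumstats.py is not confirmed in this thread.

```python
import io
import math

def saige_to_ldsc_rows(lines):
    """Yield dicts with SNP, A1, A2, N, Z from whitespace-delimited SAIGE output.

    Hypothetical helper: assumes Z = BETA / SE and that Allele2 is the effect allele.
    """
    lines = iter(lines)
    header = next(lines).split()
    for line in lines:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or malformed lines
        row = dict(zip(header, fields))
        try:
            beta, se = float(row["BETA"]), float(row["SE"])
        except ValueError:
            continue  # skip non-numeric entries such as "NA"
        if not (math.isfinite(beta) and math.isfinite(se)) or se <= 0:
            continue  # a usable Z needs finite values and a positive SE
        yield {
            "SNP": row["SNPID"],
            "A1": row["Allele2"],  # assumption: effect allele goes in A1
            "A2": row["Allele1"],
            "N": row["N"],
            "Z": beta / se,
        }

# Toy input with made-up values, only to show the shape of the data.
example = io.StringIO(
    "CHR POS SNPID Allele1 Allele2 N BETA SE\n"
    "1 1000 rs1 A G 5000 0.10 0.05\n"
    "1 2000 rs2 C T 5000 NA NA\n"
)
rows = list(saige_to_ldsc_rows(example))
```

On a real file you would pass an open file handle instead of the `io.StringIO` object and write the resulting rows out as a tab-separated table. Please check the allele orientation against your own data before using the output.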