cumc / xqtl-protocol

Molecular QTL analysis protocol developed by ADSP Functional Genomics Consortium
https://cumc.github.io/xqtl-protocol/
MIT License

SuSiE beta/se are uncorrelated with that of TensorQTL #445

Closed hsun3163 closed 1 year ago

hsun3163 commented 1 year ago

For gene ABCA7, with MAC = 5, SuSiE generated 3260 beta/se pairs and tensorQTL generated 1588 nominal beta/se pairs. 1559 variants appear in both.

Among the 1559 shared summary statistics, the correlations of beta, se, and z-scores are -0.0171467155186315, -0.0168848274367687, and 0.00146767466225054, respectively.
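
A minimal sketch of this kind of comparison, in Python with only the standard library. The variant IDs, the `(beta, se)` layout, and all numbers below are invented for illustration; they are not taken from the ABCA7 analysis, and this is not the code that will be posted to brain-xqtl-analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def compare_sumstats(susie, tensorqtl):
    """Correlate beta, se and z over the variants present in both results.

    Both arguments map a variant ID to a (beta, se) tuple (hypothetical layout).
    """
    shared = sorted(set(susie) & set(tensorqtl))
    out = {"n_overlap": len(shared)}
    for name, idx in (("beta", 0), ("se", 1)):
        out[name] = pearson([susie[v][idx] for v in shared],
                            [tensorqtl[v][idx] for v in shared])
    # z-scores are derived as beta / se on each side before correlating.
    out["z"] = pearson([susie[v][0] / susie[v][1] for v in shared],
                       [tensorqtl[v][0] / tensorqtl[v][1] for v in shared])
    return out


# Toy inputs (invented values).
susie = {"v1": (0.10, 0.05), "v2": (-0.20, 0.08),
         "v3": (0.05, 0.04), "v4": (0.30, 0.10)}
tensorqtl = {"v2": (-0.21, 0.081), "v3": (0.06, 0.039),
             "v4": (0.29, 0.11), "v5": (0.01, 0.02)}
result = compare_sumstats(susie, tensorqtl)
print(result)
```

Matching on variant ID rather than on row position keeps the comparison valid even when the two outputs list variants in different orders.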

This indicates that something could be seriously wrong with the SuSiE analysis, which in turn would explain the lack of any credible sets (CS).

The code used for the comparison and diagnosis will be posted once it is pushed to the GitHub repo brain-xqtl-analysis.

gaow commented 1 year ago

@hsun3163 I think the first thing to check is whether the data is loaded correctly. Remember that we previously compared tensorQTL, APEX and SuSiE, and found decent consistency between tensorQTL and SuSiE.

hsun3163 commented 1 year ago

For the genotype data, the variants can be matched, so nothing was misloaded. For the phenotype data, I have checked that the correct gene is loaded. For the covariates data, we loaded the wrong file previously, but changing it to the correct one didn't seem to improve things.

gaow commented 1 year ago

Can we check with some interactive code for tensorQTL analyzing one gene?

hsun3163 commented 1 year ago

As it turned out, this issue was due to the covariates file being different. However, when using the correct covariates file, all coefficients from SuSiE are 0.

hsun3163 commented 1 year ago

When only the first 62 rows of the correct covariates file were used, SuSiE could detect a very weak signal, similar to what we saw previously with the incorrect covariates, as shown in the image below:

[Image: Annotation 2022-10-13 105502]

When the first 63 rows were included, all the signal was gone.

[Image: Annotation 2022-10-13 105559]

gaow commented 1 year ago

Hmm what's special about covariate 63? What if we just adjust for that covariate? Still the primary issue is why susieR and tensorQTL give different results here when in your initial assessment of pseudo-bulk they are similar. Did you try to analyze this region also using tensorQTL?

hsun3163 commented 1 year ago

Yes, tensorQTL gives the same result as Shrishtee's.

hsun3163 commented 1 year ago

After swapping out the covariates file, the result using un-residualized X and Y also changed, which should not have happened. The only step where the covariates file can affect these two is taking the intersection of samples with the covariates file.

After fixing the sample scrambling issue, the problem is resolved and the proper CS can be called, consistent with the previous finding.
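
One way to guard against this kind of sample scrambling is to put every input onto a single shared list of sample IDs before any modelling. The sketch below is in Python and assumes each input is a mapping from sample ID to that sample's values. That layout, the function name `align_samples`, and the sample IDs are all hypothetical; this is not the protocol's actual loader or the fix that was applied.

```python
def align_samples(genotype, phenotype, covariates):
    """Subset and reorder three per-sample tables onto one shared sample order.

    Each argument maps a sample ID to that sample's values (hypothetical layout).
    Returns the shared IDs plus the three tables as lists in that same order.
    """
    shared = sorted(set(genotype) & set(phenotype) & set(covariates))
    if not shared:
        raise ValueError("no samples shared by genotype, phenotype and covariates")
    # Index every table by the same ID list so row i always refers to shared[i].
    X = [genotype[s] for s in shared]
    y = [phenotype[s] for s in shared]
    Z = [covariates[s] for s in shared]
    return shared, X, y, Z


# Toy inputs with deliberately different sample orders (invented values).
genotype = {"S3": [0, 1], "S1": [2, 0], "S2": [1, 1]}
phenotype = {"S1": 0.4, "S2": -1.2, "S3": 0.9, "S4": 0.1}
covariates = {"S2": [1.0, 35], "S3": [0.0, 60], "S1": [1.0, 48]}

ids, X, y, Z = align_samples(genotype, phenotype, covariates)
print(ids)
```

Because all three tables are rebuilt from the same ID list, changing which covariates file is used can only change which samples are kept, not how the remaining rows are paired.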

The z-scores are also consistent with those of tensorQTL, but the betas have a correlation of only 0.33:

cor(merged_result$susie_z, merged_result$beta / merged_result$se)  # 0.999999955207208
cor(merged_result$susie_beta, merged_result$beta)                  # 0.330943812142217
cor(merged_result$se, merged_result$susie_se)                      # 0.00589294820597516

hsun3163 commented 1 year ago

The following shows the relationship of beta and se between the two analyses. I imagine they differ because of how SuSiE estimates beta?

[Image: Annotation 2022-10-13 105936]

gaow commented 1 year ago

@hsun3163 a better comparison is the z-scores inferred by SuSiE. Here the SuSiE beta is the posterior estimate of beta, whereas the tensorQTL beta is the "data" itself.
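
To illustrate why a posterior estimate need not track the least-squares beta, here is a small Python sketch of normal-prior shrinkage for a single effect. It is a simplification and not susieR's implementation: the prior variance and the inputs are invented, and SuSiE additionally weights each single effect's posterior mean by its posterior inclusion probability.

```python
def posterior_effect(beta_hat, se, prior_var):
    """Posterior mean and sd of an effect under a N(0, prior_var) prior,
    given a least-squares estimate beta_hat with standard error se."""
    post_var = 1.0 / (1.0 / se ** 2 + 1.0 / prior_var)
    post_mean = post_var / se ** 2 * beta_hat
    return post_mean, post_var ** 0.5


prior_var = 0.01  # invented value, for illustration only

# Same least-squares beta, different standard errors.
for beta_hat, se in [(0.30, 0.05), (0.30, 0.20)]:
    mean, sd = posterior_effect(beta_hat, se, prior_var)
    print(f"z={beta_hat / se:.1f}  ols_beta={beta_hat:.2f}  posterior_mean={mean:.3f}")
```

In this toy setting, two variants with the same least-squares beta end up with different posterior means because the amount of shrinkage depends on the standard error, so the correlation between the two sets of betas can be low even when the z-scores agree.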

hsun3163 commented 1 year ago

Got it. The z-scores are identical, differing only in the number of decimal places reported.