chr1swallace / coloc

Repo for the R package coloc

coloc.abf result: unexpectedly high PP.H4.abf #111

Open QingningWang2022 opened 1 year ago

QingningWang2022 commented 1 year ago

Dear Wallace,

A colleague of mine ran a colocalisation analysis using the coloc.abf() function, allowing only one causal variant shared between the two traits. Below is an example of the script they used. We observed an unexpectedly high PP.H4.abf (99%) at a locus where every p value in one of the GWAS summary statistics datasets (trait2) is > 0.05. Given that all p values for trait2 are > 0.05, we did not expect to see evidence for H4. Could you help us understand why coloc.abf() would still report evidence for H4 given our data? Thanks in advance.

------- script used:

...

minimum_ccdata1=trait1[c("beta","varbeta","MarkerName","BP")]
minimum_ccdata1$type="cc"

minimum_ccdata2=trait2[c("beta","varbeta","MarkerName","BP")]
minimum_ccdata2$type="cc"

res = coloc.abf(minimum_ccdata1.new, minimum_ccdata2)

print(res)

----end of script. Below is the standard output:

Coloc analysis of trait 1, trait 2

SNP Priors
   p1    p2   p12
1e-04 1e-04 1e-05

Hypothesis Priors
       H0     H1     H2        H3      H4
0.9659324 0.0161 0.0161 0.0002576 0.00161

Posterior
       nsnps           H0           H1           H2           H3           H4
1.610000e+02 1.526353e-37 2.131144e-14 5.558892e-26 6.768276e-03 9.932317e-01
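
The hypothesis priors printed above are consistent with the usual single-causal-variant arithmetic applied to the per-SNP priors and the 161 SNPs in the region. A minimal sketch, assuming that is how they were derived (the variable names are illustrative, not taken from the package):

```r
# Per-SNP priors and SNP count as printed in the output above
p1 <- 1e-4
p2 <- 1e-4
p12 <- 1e-5
nsnps <- 161

# Prior for each hypothesis under "at most one causal variant per trait"
prior <- c(
  H1 = nsnps * p1,                     # one causal SNP, trait 1 only
  H2 = nsnps * p2,                     # one causal SNP, trait 2 only
  H3 = nsnps * (nsnps - 1) * p1 * p2,  # two distinct causal SNPs
  H4 = nsnps * p12                     # one shared causal SNP
)
prior <- c(H0 = 1 - sum(prior), prior)
print(prior)
```

H4 starts at about 0.0016, so a posterior of 0.993 has to come from the likelihood of the supplied beta and varbeta values rather than from the prior.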

chr1swallace commented 1 year ago

Could you show me what the data look like? I agree this shouldn't happen if your p values are accurate. If you compute pnorm(-abs(beta)/sqrt(varbeta))*2, are they the p values you expect?
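
A minimal sketch of recomputing two-sided p values from beta and varbeta, using made-up numbers in place of the real trait2 data. It takes the lower tail of pnorm at -|z| so that the result lies between 0 and 1; the column name p_from_z is an assumption for illustration:

```r
# Toy summary statistics (made-up values, for illustration only)
trait2 <- data.frame(
  MarkerName = c("snp1", "snp2", "snp3"),
  beta       = c(0.02, -0.15, 0.00),
  varbeta    = c(0.01, 0.0025, 0.04)
)

# Wald z-score, then two-sided p value: 2 * P(Z <= -|z|)
z <- trait2$beta / sqrt(trait2$varbeta)
trait2$p_from_z <- 2 * pnorm(-abs(z))

print(trait2)

# Smallest recomputed p value in the region, to compare with the reported ones
print(min(trait2$p_from_z))
```

If the smallest recomputed p value is far below what the original summary statistics file reports, the beta and varbeta passed to coloc.abf() probably do not belong to the rows they are assumed to.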




QingningWang2022 commented 1 year ago

Hi Chris, we have figured out why we were getting unexpected results. It had nothing to do with the p values, but with how the summary statistics files for each locus were numbered (a misalignment between loci and lead variants led us to question the validity of the coloc result in the first place). After the locus numbers were reordered, the result makes much more sense. Thanks again for your timely response.
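
One way to catch this kind of mix-up before running coloc is to confirm that the two per-locus tables really describe the same region. The sketch below is not what was done in this thread; it is a hypothetical check using the MarkerName and BP columns from the script, with made-up data:

```r
# Hypothetical per-locus tables with the columns used in the script above
trait1 <- data.frame(MarkerName = c("rs1", "rs2", "rs3"),
                     BP = c(1000, 1500, 2000))
trait2 <- data.frame(MarkerName = c("rs2", "rs3", "rs4"),
                     BP = c(1500, 2000, 2500))

# Variants present in both datasets
shared <- intersect(trait1$MarkerName, trait2$MarkerName)
frac_shared <- length(shared) / min(nrow(trait1), nrow(trait2))

# Positions of shared variants should agree between the two files
bp1 <- trait1$BP[match(shared, trait1$MarkerName)]
bp2 <- trait2$BP[match(shared, trait2$MarkerName)]
positions_agree <- all(bp1 == bp2)

cat("shared variants:", length(shared), "\n")
cat("fraction shared:", round(frac_shared, 2), "\n")
cat("positions agree:", positions_agree, "\n")
```

A low shared fraction or disagreeing positions would suggest that the two files were paired from different loci.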

QingningWang2022 commented 1 year ago

FYI, we didn't supply P values to coloc.abf(). We only provided beta and varbeta.

akhilpampana commented 2 months ago

Hello @QingningWang2022, I hope you are doing well. May I ask how the files are formatted? I am getting unreasonably low H3 values (0 for all genes) when running coloc on GWAS summary statistics and eQTL summary data from GTEx. Have you observed this before?