jrs95 / hyprcoloc

Hypothesis Prioritisation in multi-trait Colocalization
https://jrs95.github.io/hyprcoloc/
GNU General Public License v3.0

How to interpret hyprcoloc results? #4

Closed Zepeng-Mu closed 4 years ago

Zepeng-Mu commented 4 years ago

Hi,

Thank you for the great tool. I have been trying to run the package on my own dataset but I am not sure how to interpret the results.

First, it seems that the function does not tell us whether the colocalization is significant or not, so we have to decide that ourselves? For example, when using non-uniform priors, I believe the cutoff is 0.7*0.7=0.49. In one of my results, the SNP posterior is 0.4818 and the regional posterior is 0.607, so does this mean that there is no colocalization based on the 0.49 threshold?

Second, sometimes the results contain NA, and I do not know exactly what they mean. For example, in one of my results, the regional posterior is 0.91, but the SNP posterior is NA. What does this mean? What does this tell us about the colocalization status in this genomic region?

Many thanks!

jrs95 commented 4 years ago

Hi,

Thanks for the question.

I'm not sure there is a satisfactory way of saying a colocalization posterior is "significant", although @cnfoley might be more help here. I generally use a cut-off of around 0.75 to 0.8.

The alignment and regional thresholds for non-uniform priors are 0.5. I prefer non-uniform priors as they are more conservative than uniform priors. The result will be missing (i.e. NA) if either threshold is not met. So, in your example there is evidence of regional overlap, but there is limited evidence of a single genetic driver between the traits (i.e. the region has not passed the alignment threshold).

Hope this helps.

Best wishes,

James
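
The thresholding described above can be sketched in a few lines of base R. This is only an illustration, not hyprcoloc's own code: the function name `coloc_posterior` is hypothetical, and it assumes the reported posterior is the product of the regional and alignment probabilities (which is what the 0.7*0.7=0.49 in the question implies).

```r
# Sketch only: mimics the thresholding described above, not hyprcoloc's code.
# Assumption: posterior = regional probability * alignment probability.
coloc_posterior <- function(regional, alignment,
                            reg.thresh = 0.5, align.thresh = 0.5) {
  # A value is reported only if both probabilities pass their thresholds.
  if (regional < reg.thresh || alignment < align.thresh) {
    return(NA_real_)
  }
  regional * alignment
}

# Hypothetical alignment values, chosen for illustration.
coloc_posterior(0.91, 0.30)  # regional overlap but alignment fails: NA
coloc_posterior(0.91, 0.90)  # both pass: the product is returned
```

Under this assumption, thresholds of 0.5 mean any reported posterior is at least 0.25, and thresholds of 0.7 mean it is at least 0.49, so the thresholds act as an implicit cut-off rather than a significance test.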

Zepeng-Mu commented 4 years ago

Thank you for your response. That is really helpful!

I also tried running it using snpscores = TRUE. This time there is no NA in the result at all, even though the posterior could be very small. Is this what the function is designed to do? Or is it something that only I observe?

Thanks!!

cnfoley commented 4 years ago

Thanks James. Here's my input too:

Have a look at our updated tutorial/vignette for more information on this topic. Unfortunately, there is no strict way of defining a colocalization cut-off (i.e. a posterior probability above which we are willing to accept, and potentially try to publish, the result that a cluster of traits colocalize). I have some rules of thumb for running an analysis and so does James (he has mentioned his, and I agree with them). However, while you might convince us with your results, others might disagree. To address this, we have introduced a new function, "sensitivity.plot", which assesses the sensitivity of any results to (i) the colocalization cut-off (via the algorithm's regional and alignment thresholds) and (ii) the setting of prior information (via the prior probability of colocalization, "prior.2"). The function returns a heat-map that helps us to identify how the traits cluster when we vary the algorithm thresholds and the prior probability of colocalization.

I have given some advice about how to interpret these results in the vignette. In a nutshell: you will likely present your results for a single run of the algorithm, which specifies the regional and alignment thresholds as well as the prior information (e.g. reg.thresh = align.thresh = 0.7 and prior.2 = 0.98). This is your final clustering of the data. However, you should also present the results from "sensitivity.plot". These illustrate how sensitive your conclusions about the final clustering are to the choice of input parameters, by varying these over multiple values, e.g. reg.thresh = align.thresh = c(0.6, 0.7, 0.8, 0.9) and prior.2 = c(0.98, 0.99, 0.995). Note that, as these values increase, identification of clusters of colocalized traits becomes more challenging (i.e. traits must have stronger evidence of colocalization in order to be detected).

There is a link to all of this at the top of the tutorial/vignette. You'll have to re-install/update hyprcoloc in R to use the function and browse the vignette.

Let me/James know how you get on.

Best wishes and good luck,

Chris
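
To make the size of such a sensitivity analysis concrete, here is a minimal base R sketch that enumerates the parameter combinations quoted above. It does not call hyprcoloc, and it assumes the regional and alignment thresholds are varied together (as in reg.thresh = align.thresh); how "sensitivity.plot" combines them internally is not stated in this thread.

```r
# Values quoted in the comment above; the grid itself is only an illustration.
thresholds <- c(0.6, 0.7, 0.8, 0.9)   # reg.thresh = align.thresh
priors     <- c(0.98, 0.99, 0.995)    # prior.2

# One row per configuration; thresholds move together (an assumption).
grid <- expand.grid(reg.thresh = thresholds, prior.2 = priors)
grid$align.thresh <- grid$reg.thresh

nrow(grid)  # 4 thresholds x 3 priors
head(grid)
```

Each row corresponds to one run of the algorithm, and the heat-map summarises how the clustering changes across the 12 rows.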


Zepeng-Mu commented 4 years ago

Thanks! I am trying to get the sensitivity plot, but got this error: Error in traits %in% tmp.clust : object 'traits' not found

The error seems to come from this line in the function, where "traits" is not defined:

tmp.vec = which(traits %in% tmp.clust);

I tried the example from the vignette and it works. I think this is because "traits" had already been defined there as a global variable? When I name my vector "mytraits", for example, the error happens.

Thank you so much!

cnfoley commented 4 years ago

Absolutely (and apologies). I'll fix this shortly; the line should read "trait.names".

If you rename your vector of trait names to "traits" it should work in the meantime, but it'll be fixed before 12pm BST today. Hopefully the heat plot will come in handy. If you need help interpreting it, just send me an email with a picture of the plot.

Best wishes,

Chris
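
The scoping problem discussed above can be reproduced in isolation. In the sketch below, `buggy` and `fixed` are hypothetical stand-ins, not the package's actual functions: one refers to a free variable `traits`, the other to its own argument `trait.names`.

```r
# Hypothetical stand-in for the buggy helper: it accepts `trait.names`
# but its body refers to `traits`, which is not an argument.
buggy <- function(trait.names, tmp.clust) {
  which(traits %in% tmp.clust)
}

# Corrected version: uses the argument that was actually passed in.
fixed <- function(trait.names, tmp.clust) {
  which(trait.names %in% tmp.clust)
}

mytraits <- c("T1", "T2", "T3")

# With no global `traits`, the buggy version fails as reported.
err <- tryCatch(buggy(mytraits, c("T1", "T3")),
                error = function(e) conditionMessage(e))
err

# The corrected version works whatever the caller's vector is called.
fixed(mytraits, c("T1", "T3"))

# The interim workaround: a global named `traits` makes the lookup succeed.
traits <- mytraits
buggy(mytraits, c("T1", "T3"))
```

This is why the vignette example ran: R looks up free variables in the enclosing environments, so a global `traits` silently satisfied the reference.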


cnfoley commented 4 years ago

FYI: issue should be sorted now.

Best of luck,

Chris


Zepeng-Mu commented 4 years ago

Thanks!

Then I got this error

Error in cut.default(x, breaks = breaks, include.lowest = T) :
  'breaks' are not unique
Calls: sensitivity.plot ... scale_colours -> matrix -> scale_vec_colours -> cut -> cut.default

Is this because I only have two datasets instead of multiple?
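
For reference, this base R error is easy to reproduce outside the package. The sketch below is an illustration with a hypothetical constant matrix; it assumes, without confirmation from this thread, that the colour breaks were derived from the range of the plotted values, which would collapse when every entry is the same.

```r
# Hypothetical input: a matrix whose entries are all identical.
x <- matrix(1, nrow = 2, ncol = 2)

# Evenly spaced breaks between min and max collapse to a single value.
breaks <- seq(min(x), max(x), length.out = 5)

# cut() refuses duplicated breaks; capture the message instead of stopping.
msg <- tryCatch(
  cut(as.vector(x), breaks = breaks, include.lowest = TRUE),
  error = function(e) conditionMessage(e)
)
msg
```

With only two traits, a matrix of pairwise values has very few distinct entries, so a constant matrix is a plausible trigger, though not necessarily the one hit here.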

cnfoley commented 4 years ago

Hi,

I've fixed the issue; it was owing to how the colours were being set in the "pheatmap" package.

I've also added an extra option in the "sensitivity.plot" function which allows you to print the similarity matrix, i.e. a matrix in which the (i, j)'th element represents the proportion of times the i'th and j'th traits colocalize across the range of threshold and prior values considered.

I've updated the tutorial to reflect this. Together with the heatmap, the similarity matrix should help you to quantify/gauge how reasonable the clusters of colocalized traits are in your data.

Best wishes,

Chris
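
As an illustration of what such a similarity matrix contains, the base R sketch below builds one from made-up cluster labels. The data, the variable names, and the way missing labels are handled are all assumptions for the example; it is not the code used inside "sensitivity.plot".

```r
# Hypothetical cluster labels for 3 traits under 4 configurations.
# Same label = same colocalized cluster; NA = trait not in any cluster.
runs <- list(
  c(T1 = 1,  T2 = 1,  T3 = NA),
  c(T1 = 1,  T2 = 1,  T3 = 1),
  c(T1 = 1,  T2 = 1,  T3 = NA),
  c(T1 = NA, T2 = NA, T3 = NA)
)

trait_ids <- names(runs[[1]])
sim <- matrix(0, length(trait_ids), length(trait_ids),
              dimnames = list(trait_ids, trait_ids))

for (lab in runs) {
  # TRUE where traits i and j share a non-missing cluster label.
  same <- outer(lab, lab, "==")
  same[is.na(same)] <- FALSE
  sim <- sim + same
}

# Proportion of configurations in which each pair colocalized.
sim <- sim / length(runs)
sim
```

In this toy example T1 and T2 colocalize in 3 of 4 configurations (0.75) while T3 joins them in only 1 of 4 (0.25), which is the kind of contrast the heat-map is meant to show.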

