jrs95 / hyprcoloc

Hypothesis Prioritisation in multi-trait Colocalization
https://jrs95.github.io/hyprcoloc/
GNU General Public License v3.0

Missing trait in dropped traits #8

Closed ngbowker closed 4 years ago

ngbowker commented 4 years ago

Hi,

I've been using Hyprcoloc for several aspects of my work and it's immensely useful. However, I keep running into an issue where not all traits of interest are listed in the "dropped traits" column. In my example, I am looping the algorithm over several loci with the following traits of interest: datasets <- c("res_invn_x16292_288","res_invn_x5755_29","bmi","ldl","t2d","chd","mdc0","mdc120","glucose","hba1c","hdl","total_chol","triglycerides","lipoprotein_a","apoa1","apob","whradjbmi","hcadjbmi","wcadjbmi","whr","hc","wc","twohrgluadjbmi","fgluadjbmi","finsadjbmi","hba1cadjfglu")

However, it seems that "bmi" is not included in dropped traits in any of my analyses. It is not included in the "traits" column either (as an aside). I've tried to work this out by running the analysis in regions where I would definitely expect a cluster with BMI and this doesn't seem to help. I have checked through the summary stats that I'm using for each locus (extracted from the larger summary stats file) and nothing appears to be wrong with them either.

I would expect "bmi" to appear in dropped traits if it wasn't added to a cluster, but I wouldn't expect it to be missing completely. Interestingly, it does appear in the sensitivity plot that I run as part of the same loop. It would be great if you could provide an explanation for this, as I have checked through everything I can think of and can't find a suitable one.

Thanks in advance,

Nick

ngbowker commented 4 years ago

Further to this, I wanted to mention that I have checked the results using COLOC and in one of the regions in particular, there appears to be a colocalisation with one of the traits of interest. This was tested using the same summary stats files I've been using for hyprcoloc.

Thanks,

Nick

jrs95 commented 4 years ago

Hi Nick,

Thanks for the query.

If there is a single trait remaining after all of the other traits have been clustered or dropped, it won't appear in the dropped_trait column. As to why this trait would always be "bmi" in your analysis, I cannot say.
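To see which trait is the one left over in a given run, the input trait list can be compared against everything reported in the two columns. The sketch below is illustrative only: the `results` data frame is a made-up stand-in, the column names `traits` and `dropped_trait` follow this thread, and the comma-separated format and the "None" placeholder are assumptions about how the output is laid out.

```r
# Hypothetical stand-in for the results table; all values are made up.
results <- data.frame(
  iteration     = 1:3,
  traits        = c("ldl, apob", "None", "None"),
  dropped_trait = c(NA, "t2d", "chd"),
  stringsAsFactors = FALSE
)
datasets <- c("bmi", "ldl", "apob", "t2d", "chd")

# Split comma-separated trait lists and discard NA / placeholder entries.
split_traits <- function(x) {
  x <- trimws(unlist(strsplit(x[!is.na(x)], ",")))
  x[x != "" & x != "None"]
}

# Traits that were clustered or dropped, and the input traits reported in neither column.
reported   <- union(split_traits(results$traits), split_traits(results$dropped_trait))
unreported <- setdiff(datasets, reported)
print(unreported)
```

A single unreported trait is consistent with the behaviour described above; several unreported traits would point to something else, such as a trait never reaching the analysis.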

@cnfoley could you shed any more light on what is potentially going on here? My guess is that we would need access to the data to find out precisely what is going on.

Best wishes,

James

ngbowker commented 4 years ago

Hi James,

Thanks for getting back to me. Yes, I've noticed that whenever all of the other traits have been clustered or dropped, one of the traits won't appear in the dropped_trait column. More often than not this does seem to be BMI, though. This is odd, particularly at loci where I would expect BMI and my other traits to colocalise (and they do according to COLOC). If you could shed some light on the issue it would help a lot. I can share my data if you need it.

Best wishes,

Nick

jrs95 commented 4 years ago

Hi Nick,

Thanks for the update.

Would you mind running hyprcoloc with the BB algorithm turned off (i.e. bb.alg=F), using just BMI and one of the other traits (e.g. just BMI & LDL), for one of the regions where you think BMI should colocalise?
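A minimal sketch of such a two-trait run is given here. The effect estimates and standard errors are random toy values, not real GWAS data, and the SNP IDs are made up; the call follows the package's documented usage (`trait.names`, `snp.id`, `bb.alg`) and is skipped if hyprcoloc is not installed.

```r
set.seed(1)
n_snps <- 50
rsid   <- paste0("rs", seq_len(n_snps))  # made-up SNP IDs
traits <- c("bmi", "ldl")

# Toy SNP-by-trait matrices of effect estimates and standard errors (random noise).
betas <- matrix(rnorm(n_snps * 2, sd = 0.02), ncol = 2, dimnames = list(rsid, traits))
ses   <- matrix(0.02, nrow = n_snps, ncol = 2, dimnames = list(rsid, traits))

# Run with the branch-and-bound algorithm switched off, if the package is available.
if (requireNamespace("hyprcoloc", quietly = TRUE)) {
  res <- hyprcoloc::hyprcoloc(betas, ses, trait.names = traits, snp.id = rsid, bb.alg = FALSE)
  print(res$results)
}
```

In a real analysis, `betas` and `ses` would be built from the regional summary statistics for the two traits, with rows in the same SNP order.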

Note: the default priors used by COLOC are more lenient than those used by hyprcoloc in the two trait scenario.
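As a rough illustration of that note, the two-trait priors can be compared numerically. The values below are the defaults as I understand them from each package's documentation (`p12` for `coloc.abf`; `prior.1` and `prior.c` for `hyprcoloc`), and treating `prior.1 * prior.c` as the counterpart of `p12` is a simplifying assumption; check both against your installed versions.

```r
# Assumed defaults; verify against the documentation of your installed versions.
coloc_p12    <- 1e-5   # coloc.abf: prior that a SNP is associated with both traits
hypr_prior_1 <- 1e-4   # hyprcoloc: prior that a SNP is associated with one trait
hypr_prior_c <- 0.02   # hyprcoloc: conditional prior of association with an additional trait

# Approximate hyprcoloc counterpart of p12 in the two-trait case (assumption).
hypr_shared <- hypr_prior_1 * hypr_prior_c
ratio       <- coloc_p12 / hypr_shared
print(c(hyprcoloc = hypr_shared, coloc = coloc_p12, ratio = ratio))
```

Under these assumptions the COLOC prior for a shared variant is several times larger, which is the sense in which its defaults are more lenient.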

@cnfoley any further thoughts here?

Best wishes,

James

ngbowker commented 4 years ago

Hi James,

Thanks for the help.

Having re-run the analysis as you asked, with bb.alg = F and only one other trait, BMI does appear in my results. The regional probability is low (0.0155), which might explain why no cluster was being identified between the two traits. Because BMI was previously not being printed to the dropped traits column, I hadn't seen this before.

Do you reckon this could be the source of the issue I'm having? In addition, would you be able to explain the discrepancy in the results I'm getting between Hyprcoloc and COLOC in the same region using the same traits? Do you think this could be a result of the more lenient default priors COLOC uses in the two-trait scenario?

Thanks again for your help,

Nick

jrs95 commented 4 years ago

Hi Nick,

I think in this case we would need the data (just for BMI and the second trait) to find out what is going on. Would it be possible for you to share this data for one of the regions of interest? I'm not sure the discrepancy is due to the difference in priors here.

@cnfoley anything to add here?

Best wishes,

James

ngbowker commented 4 years ago

Hi James,

Sure, I understand. I've attached the relevant data for BMI and a protein of interest. I have been using default prior configurations and default thresholds for the regional and alignment parameters. In addition, I've been using variant-specific priors.

Thanks again for your help,

Nick

bmi_rs1800437_locus1.txt res_invn_x16292_288_rs1800437_locus1.txt coloc_results.txt

jrs95 commented 4 years ago

Hi Nick,

I have had a look at the example above and I have got results that are very similar to your COLOC results:

Perhaps there is an issue with the way you're processing your data prior to analysis? Attached is the code I used to perform the analysis.

Best wishes,

James

hyprcoloc_test.txt

ngbowker commented 4 years ago

Hi James,

Thanks for having a look at the example I sent through, I really appreciate it! I've run the analysis using your script and you're right, it does produce results that are very similar to COLOC. On that note, I've been through my code and I've identified an issue with processing the data prior to analysis which seems to have been the root of the problem.
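The thread does not say what the processing issue was, so the following is only a generic sketch of checks that can catch common problems before the matrices are built: SNPs in a different order between traits, missing estimates, or non-positive standard errors. The tables and the column names (`rsid`, `beta`, `se`) are hypothetical, and the sketch does not check that effect alleles are aligned.

```r
# Toy per-trait tables; values and column names are hypothetical.
bmi <- data.frame(rsid = c("rs1", "rs2", "rs3"),
                  beta = c(0.02, -0.01, 0.03), se = c(0.01, 0.01, 0.02),
                  stringsAsFactors = FALSE)
protein <- data.frame(rsid = c("rs3", "rs1", "rs2"),
                      beta = c(0.10, 0.05, -0.02), se = c(0.03, 0.02, 0.02),
                      stringsAsFactors = FALSE)

# Keep the SNPs shared by both traits and put them in one common order.
shared  <- intersect(bmi$rsid, protein$rsid)
bmi     <- bmi[match(shared, bmi$rsid), ]
protein <- protein[match(shared, protein$rsid), ]

# Basic sanity checks before building the matrices.
stopifnot(
  identical(bmi$rsid, protein$rsid),
  !anyNA(bmi$beta), !anyNA(protein$beta),
  all(bmi$se > 0), all(protein$se > 0)
)

# SNP-by-trait matrices with rows in the same SNP order for every trait.
betas <- cbind(bmi = bmi$beta, protein = protein$beta)
ses   <- cbind(bmi = bmi$se, protein = protein$se)
rownames(betas) <- rownames(ses) <- shared
```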

Thanks once again for taking a look at this, you've been a huge help,

Nick