jrs95 / hyprcoloc

Hypothesis Prioritisation in multi-trait Colocalization
https://jrs95.github.io/hyprcoloc/
GNU General Public License v3.0

Harmonization + running time #10

Closed: ghost closed this issue 4 years ago

ghost commented 4 years ago

Hi,

Thank you for sharing this amazing tool.

I'm wondering whether it's necessary to provide betas for the same allele across all traits. For a SNP rs123 with alleles A/C, do I need to stick with the betas for the A allele for all traits, or can I use the beta for A for some traits and the beta for C for others? Does the hyprcoloc package use beta/se only to infer p-values?

Also, I have 2800 traits and hyprcoloc takes a very long time to run. In my tests, 500 traits took 4.39 minutes, 1000 traits 1.23 hours, 1500 traits 5.31 hours, 2000 traits 15.03 hours and 2500 traits 1.52 days. Is this normal?

Thanks again

jrs95 commented 4 years ago

Hi Nicholas,

Apologies for the delayed response. I hope this message finds you and your family well during these strange times.

Under the standard model (i.e. assuming all traits are independent), no allele alignment is necessary.
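As a small illustration of what switching the reported allele does to the summary statistics: expressing an effect for C instead of A negates the beta and leaves the standard error unchanged, so the squared z-score and the two-sided p-value stay the same. The sketch below uses made-up numbers and plain Python. It does not call hyprcoloc and makes no claim about how hyprcoloc handles betas internally; the helper names are hypothetical.

```python
import math

def two_sided_p(beta, se):
    """Two-sided normal p-value for the Wald z-score beta / se."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))

def flip_allele(beta, se):
    """Re-express an effect for the other allele: the sign flips, the SE does not."""
    return -beta, se

# Hypothetical effect of allele A at rs123 on one trait (illustrative numbers only).
beta_a, se_a = 0.12, 0.03
beta_c, se_c = flip_allele(beta_a, se_a)

# Squared z-scores and two-sided p-values are invariant to the allele choice.
z2_a = (beta_a / se_a) ** 2
z2_c = (beta_c / se_c) ** 2
p_a = two_sided_p(beta_a, se_a)
p_c = two_sided_p(beta_c, se_c)

print(beta_a, beta_c)
print(z2_a == z2_c, p_a == p_c)
```

Only the sign of the effect carries the allele information, which is why alignment matters for any analysis that depends on the direction or correlation of effects across traits.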

Wow, you have a lot of traits! I think you win the prize for the most traits we have seen analysed in practice.

Yes, we would expect the running time to increase as more traits are analysed, because the algorithm has more combinations of traits to consider. One thing you could do in each genetic region you analyse is to remove, before performing the colocalization analysis, any trait that lacks a strong association in that region (e.g. no variant with p<1E-5), as these traits are likely to be dropped anyway. @cnfoley any further thoughts here?
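A minimal sketch of that pre-filtering step, in plain Python with no hyprcoloc call. It assumes the summary statistics for one region are stored as two tables with variants in rows and traits in columns, and that p-values are derived from beta/se under a normal approximation. The function names and the toy numbers are hypothetical; the 1E-5 threshold is the example value from the suggestion above.

```python
import math

def two_sided_p(beta, se):
    """Two-sided normal p-value for the Wald z-score beta / se."""
    return math.erfc(abs(beta / se) / math.sqrt(2))

def strongly_associated(betas, ses, trait_names, p_threshold=1e-5):
    """Names of traits with at least one variant below p_threshold in the region."""
    keep = []
    for j, name in enumerate(trait_names):
        min_p = min(two_sided_p(b[j], s[j]) for b, s in zip(betas, ses))
        if min_p < p_threshold:
            keep.append(name)
    return keep

def subset_traits(table, trait_names, keep):
    """Keep only the columns of `table` that belong to the retained traits."""
    idx = [trait_names.index(name) for name in keep]
    return [[row[i] for i in idx] for row in table]

# Toy region: 3 variants (rows) x 3 traits (columns), illustrative numbers only.
trait_names = ["trait_1", "trait_2", "trait_3"]
betas = [[0.20, 0.01, 0.15],
         [0.05, 0.02, 0.02],
         [0.01, -0.01, 0.03]]
ses = [[0.03, 0.03, 0.03] for _ in betas]

kept = strongly_associated(betas, ses, trait_names)
betas_kept = subset_traits(betas, trait_names, kept)
ses_kept = subset_traits(ses, trait_names, kept)
print(kept)
```

The reduced tables and trait names would then be passed to the colocalization step in place of the full set, one region at a time.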

Best wishes,

James