jrs95 / hyprcoloc

Hypothesis Prioritisation in multi-trait Colocalization
https://jrs95.github.io/hyprcoloc/
GNU General Public License v3.0

Error when running hyprcoloc #1

Closed xtmgah closed 4 years ago

xtmgah commented 5 years ago

Hello, this is a very interesting algorithm for colocalization analysis. I tried to run it but got an error with my dataset (attached). Could you help check? Thanks.

load('test.RData')
res <- hyprcoloc(betas, ses, trait.names=traits, snp.id=rsid);
Error in if (reg.prob <= reg.thresh) { : missing value where TRUE/FALSE needed

Attachment: test.RData.zip

jrs95 commented 5 years ago

Thanks for the query. You have a zero value in your standard error matrix, which is causing the issue. I have added a more informative error flag.
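
A pre-flight check along the following lines can catch this kind of problem before calling hyprcoloc. It is only a minimal sketch: `check_ses` is a hypothetical helper, not part of the package, and the toy matrix stands in for the `ses` object from the report.

```r
# Hypothetical helper (not part of hyprcoloc): locate standard errors that
# are missing, non-finite, or not strictly positive.
check_ses <- function(ses) {
  bad <- !is.finite(ses) | ses <= 0
  which(bad, arr.ind = TRUE)
}

# Toy standard-error matrix: 3 variants (rows) x 2 traits (columns).
ses_toy <- matrix(c(0.10, 0.00, 0.20,
                    0.30, 0.15, NA),
                  nrow = 3,
                  dimnames = list(c("rs1", "rs2", "rs3"), c("gwas", "eqtl")))

# Row and column positions of every invalid entry.
bad_entries <- check_ses(ses_toy)
print(bad_entries)

# One option: drop every variant that has an invalid standard error.
keep <- setdiff(seq_len(nrow(ses_toy)), unique(bad_entries[, "row"]))
ses_clean <- ses_toy[keep, , drop = FALSE]
```

If variants are dropped this way, the same rows would also need to be removed from `betas` and `rsid` so that all inputs stay aligned.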

xtmgah commented 5 years ago

Thanks. Yes, that was caused by the zero value. Another question: should I always set the LD matrix or the binary outcomes according to my dataset? For example, the test dataset is a colocalization analysis between GWAS and eQTL. I guess I have to set the binary outcomes to c(1,0) and also set the LD matrix, since there are a few LD blocks in this locus? What other parameters should I pay attention to? Thanks

jrs95 commented 5 years ago

You're right that the binary outcomes variable should be set to c(1,0) if your GWAS trait is binary. The LD information is not required in your scenario as the traits are likely to be from non-overlapping samples.
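
As a rough sketch of that setup, the indicator vector can be built from the trait order and passed to the call from the original report. The assumptions here are that column 1 is a case-control GWAS and column 2 is the eQTL trait, and that the argument is named `binary.outcomes` (worth confirming with `?hyprcoloc`). The `is_binary` lookup is purely illustrative.

```r
# Assumed trait order: column 1 is the GWAS trait, column 2 is the eQTL trait.
traits <- c("gwas", "eqtl")

# Assumption: the GWAS is a case-control (binary) trait and gene expression
# is continuous.
is_binary <- c(gwas = TRUE, eqtl = FALSE)

# One 0/1 indicator per trait, in the same order as the columns of betas and ses.
binary_traits <- as.integer(is_binary[traits])
print(binary_traits)

# The call needs the hyprcoloc package plus the betas, ses, and rsid objects
# from the report, so it is shown here but not run. No LD matrix is passed,
# so the default is used.
# res <- hyprcoloc(betas, ses, trait.names = traits, snp.id = rsid,
#                  binary.outcomes = binary_traits)
```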

Zepeng-Mu commented 4 years ago

> You're right that the binary outcomes variable should be set to c(1,0) if your GWAS trait is binary. The LD information is not required in your scenario as the traits are likely to be from non-overlapping samples.

Hi, I'm a little confused here. You say that with non-overlapping samples between GWAS and QTL, the LD matrix is not necessary. But by default it is an identity matrix, and we know that SNPs are definitely not independent. Could you please explain why having non-overlapping samples makes it OK to use an identity LD matrix?

Thanks a lot!

jrs95 commented 4 years ago

Hi,

Because the model assumes there is at most one causal variant per phenotype, there is no need to consider LD between variants. This follows from the Giambartolomei coloc method (PMID: 24830394), on which HyPrColoc is based.

The LD matrix is only required in HyPrColoc to modify the variant priors when the traits are from overlapping samples, in conjunction with a phenotype correlation matrix (there is a complex argument as to why this is necessary, which @cnfoley can give you). However, in simulations, treating phenotypes as if they were from independent samples, even when they were not, often outperformed trying to account for the possible correlation between phenotypes caused by analysing them in the same participants. So our general advice is to just run the standard model and not to worry about global phenotype correlation arising from overlapping samples.

I hope this answers your question.

Best wishes,

James