andreyshabalin / MatrixEQTL

Matrix eQTL: Ultra fast eQTL analysis via large matrix operations

Your thoughts on some unusual cross-tissue results - possibly due to dummy covariates? #19

Closed danphillips28 closed 2 years ago

danphillips28 commented 2 years ago

Hi! Thanks for this great tool. It's super easy to use and I've had a fun time learning how to apply it to my data. I've run an eQTL analysis I'm pretty happy with on two tissues separately; let's call them tissue 1 and tissue 2. Most (~73%) of the high-confidence eQTLs are common to both tissues. I've plotted the results and they generally look very nice. Interestingly, a sizeable chunk of the remaining cis-eQTLs were specific to tissue 2. This was very exciting. However, when I plotted those results in both tissues, the expression pattern looked almost identical in both, and not tissue 2-specific at all. Nonetheless, tissue 2 is highly significant whereas tissue 1, for some reason, was not significant at all. For example, the image below is what basically all the "tissue 2-specific" eQTLs look like:

[Image: Untitled presentation-1]

Stats for tissue 1 are NS/NA because they were never saved in the eQTL output to begin with, as they did not pass my specified p-value threshold (p < 1e-8). My only guess is that the differential significance (despite very similar expression) is due to my covariates, most likely the sequencing batch effect, since the others (sex, BMI, age) are identical across tissues 1 and 2. Furthermore, I wonder if this is happening because I coded sequencing batch using dummy variables, which I've never done before. I just wanted to know whether you think the differences in significance arise from the batch effect or whether there is something else I ought to consider.

Thanks again! D
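
For readers unfamiliar with the dummy coding mentioned above, here is a minimal sketch in plain Python of one common convention: a batch factor with k levels becomes k-1 indicator (0/1) columns, with one level dropped as the reference. The function name `dummy_code`, the `batch_` prefix, and the example batch labels are all hypothetical and not taken from the thread; how the resulting columns are laid out in a covariate file is not shown here.

```python
def dummy_code(batches, reference=None):
    """Turn a list of batch labels into k-1 indicator columns.

    Hypothetical helper for illustration only. The reference level
    (by default the first label in sorted order) gets no column, so
    the indicators are not collinear with a model intercept.
    """
    levels = sorted(set(batches))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError(f"unknown reference level: {reference!r}")

    kept = [lv for lv in levels if lv != reference]
    names = [f"batch_{lv}" for lv in kept]
    # One row per sample, one 0/1 entry per non-reference level.
    rows = [[1 if b == lv else 0 for lv in kept] for b in batches]
    return names, rows


# Made-up batch labels for five samples.
batches = ["A", "B", "C", "A", "B"]
names, rows = dummy_code(batches)
for sample_batch, row in zip(batches, rows):
    print(sample_batch, dict(zip(names, row)))
```

Each indicator column uses one degree of freedom, so if the two tissues were sequenced in different numbers of batches, the covariate sets (and the tests) are not strictly identical even when sex, BMI, and age are.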

andreyshabalin commented 2 years ago

Hi Dan,

You may want to run Matrix eQTL with a less strict p-value threshold to assess the significance of these eQTLs in tissue 1.

Alternatively, you can rerun Matrix eQTL with a different set of covariates to see the effect they have on the number of discoveries.

I would also suggest inspecting QQ-plots for signs of systematic inflation of test statistics in Tissue 1 and Tissue 2.

Let's continue this discussion via email, as it's not about an issue with the package. My address is: andrey.shabalin@gmail.com

Andrey