Evaluation of false positives

AlexSimis commented 2 months ago

Dear authors,

I conducted a cis-pQTL mapping on 1000 protein abundance values in four contexts (two groups x two time points), resulting in ~4.3 million cis tests for each context. I ran mashr following your "eQTL analysis outline" vignette and found that the pairwise sharing between contexts is ~98% on average.

Given that this is quite high, is it worth exploring the remaining 2% of pQTLs that are reported as context-specific?

Do you have any suggestions on how to evaluate whether these context-specific pQTLs are false positives?

For example, I'm thinking of running the mash model multiple times with the same parameters but a different "seed" each time, so that a different random subset is used to train the model, and then checking whether I get the same context-specific pQTLs (but this is computationally expensive).

Thanks, Alex

pcarbo commented 2 months ago

@AlexSimis A quantity that might be more robust to differences in model fits due to different initializations is sharing by magnitude (e.g., Fig. 5 of the mash paper in Nat. Genet.). For example, if the effect is very large in one context and very small (or zero) in the other, this result is likely to be more robust than a small effect in one context and no effect in the other. The concern with zero vs. non-zero effects is that there may be many "borderline" cases that are sensitive to small differences in model fit.
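
To make the idea concrete, here is a minimal Python sketch of sharing by magnitude. It is an illustrative re-implementation, not mashr code: it assumes you have exported the posterior means and lfsr values as plain row-by-context tables (in R these would come from `get_pm(m)` and `get_lfsr(m)`), and it follows the logic of mashr's `get_pairwise_sharing` as I understand it (effects significant in at least one of the two contexts, counted as shared when the ratio of posterior means lies between `factor` and `1/factor`). The helper `strongly_specific` and its thresholds are hypothetical, added only to illustrate the "large in one context, small in the others" filter.

```python
def pairwise_sharing(pm, lfsr, factor=0.5, lfsr_thresh=0.05):
    """Fraction of effects shared by magnitude for each pair of contexts.

    pm, lfsr: lists of rows (one per effect), each a list with one value
    per context. An effect is considered for pair (i, j) if it is
    significant in i or j, and shared if pm_i / pm_j is in (factor, 1/factor).
    """
    n_cond = len(pm[0])
    share = [[float("nan")] * n_cond for _ in range(n_cond)]
    for i in range(n_cond):
        for j in range(i, n_cond):
            rows = [r for r in range(len(pm))
                    if lfsr[r][i] < lfsr_thresh or lfsr[r][j] < lfsr_thresh]
            if not rows:
                continue  # no significant effects for this pair; leave NaN
            shared = 0
            for r in rows:
                a, b = pm[r][i], pm[r][j]
                # A ratio inside (factor, 1/factor) implies same sign
                # and similar magnitude.
                if b != 0 and factor < a / b < 1 / factor:
                    shared += 1
            share[i][j] = share[j][i] = shared / len(rows)
    return share


def strongly_specific(pm, lfsr, context, factor=0.5, lfsr_thresh=0.05):
    """Hypothetical filter: rows significant in `context` whose effect in
    every other context is smaller than `factor` times the effect there."""
    hits = []
    for r in range(len(pm)):
        if lfsr[r][context] >= lfsr_thresh:
            continue
        ref = abs(pm[r][context])
        others = [abs(v) for k, v in enumerate(pm[r]) if k != context]
        if all(v < factor * ref for v in others):
            hits.append(r)
    return hits


# Toy data (made up for illustration): 4 effects x 2 contexts.
pm = [[1.0, 0.9], [1.0, 0.1], [0.5, 0.45], [0.02, 0.01]]
lfsr = [[0.001, 0.002], [0.001, 0.6], [0.01, 0.02], [0.8, 0.9]]
sharing = pairwise_sharing(pm, lfsr)
specific_rows = strongly_specific(pm, lfsr, context=0)
```

Tightening `factor` in `strongly_specific` (say, to 0.2) keeps only effects with a large contrast between contexts, which are the ones least likely to flip between model fits.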

AlexSimis commented 2 months ago

@pcarbo That's clear!

Thanks

stephenslab / mashr

Evaluation of false positives #124