Running mash on non centered z-scores

william-denault commented 3 years ago

This is more of a question than an issue. I am running mash on the output of two epigenome-wide association studies (EWAS), where I study the effect of a treatment on DNAm level in the mother (EWAS1) and in the fetus (EWAS2). The distribution of the treatment effects on DNA methylation is not expected to be centered: the treatment generally increases DNAm in the mother and decreases it in the fetus.

I am aiming to find CpGs that are affected by the treatment in both the mother and the fetus. A simple bivariate plot suggests that there is some shared effect. I have colored the dots according to whether they have an lfsr below 0.05. I get the shared effects using the get_pairwise_sharing(res.mash, factor = 0.5, lfsr_thresh = 0.05) function: the dots are displayed in blue if they are part of the output, otherwise in green if res.mash$results$lfsr is below 0.05 for the maternal column, and similarly in red for the children.

[Figure: Shared_effect_strange]

However, my data are truly not centered, and it seems that mash classified some of the points that would be expected to be shared as specific only (see red arrow).

[Figure: density]

I know that mash assumes centered z-scores. Is there an easy fix for using mash in such a setting, or am I using the wrong tool for this kind of analysis?

stephens999 commented 3 years ago

Well, you could center the z scores in each group (subtract the mean) before applying mashr. Then you would be getting mashr to estimate the difference of each effect from the average effect.
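The centering step suggested above could look roughly like the following. This is only a sketch on simulated data: the matrices bhat and shat and the object names z_centered, data_z, U_c and res_z are made up for illustration and do not come from the original analysis. The mashr calls are guarded so that the centering itself runs even without the package installed.

```r
# Simulated stand-ins for the two EWAS (hypothetical data, not the real study).
set.seed(1)
n <- 1000
bhat <- cbind(mother = rnorm(n, mean = 0.05, sd = 0.1),
              child  = rnorm(n, mean = -0.05, sd = 0.1))
shat <- matrix(0.05, nrow = n, ncol = 2, dimnames = dimnames(bhat))

# z scores, then subtract each group's mean (column-wise centering).
z <- bhat / shat
z_centered <- sweep(z, 2, colMeans(z), "-")

# Fit mash on the centered z scores; Shat = 1 because z scores have unit standard error.
if (requireNamespace("mashr", quietly = TRUE)) {
  data_z <- mashr::mash_set_data(z_centered, Shat = 1)
  U_c <- mashr::cov_canonical(data_z)
  res_z <- mashr::mash(data_z, U_c)
}
```

With this setup the posterior quantities describe deviations from each group's average z score rather than the effects themselves, so they need to be interpreted accordingly.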

However, I am not convinced the results will change much, because it does not look like the means are very different from 0 on the scale of things...

More worrying to me are the green points you highlight that lie near y=x but are not being called as shared. I would have expected them to be shared by our "within a factor of 0.5" criterion if you are analyzing these z scores. I guess this might be different if you are analyzing the bhat and standard error. Which are you doing: z scores, or (bhat, s)?
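A small numeric sketch of why the scale matters for the "within a factor of 0.5" criterion. The numbers are invented, and raw estimates are used here only for illustration (mashr applies the criterion to posterior means, not to the raw bhat); the helper within_factor is hypothetical and not part of mashr.

```r
# One hypothetical CpG: equal z scores, but different standard errors.
bhat <- c(mother = 0.30, child = 0.10)
shat <- c(mother = 0.06, child = 0.02)
z <- bhat / shat                                   # both about 5

ratio_z    <- z[["child"]] / z[["mother"]]         # about 1
ratio_beta <- bhat[["child"]] / bhat[["mother"]]   # about 1/3

# Illustrative helper: is the ratio strictly between f and 1/f?
within_factor <- function(r, f = 0.5) r > f && r < 1 / f

shared_on_z_scale    <- within_factor(ratio_z)
shared_on_beta_scale <- within_factor(ratio_beta)
```

Here the point sits on y=x in a z-score plot, yet on the beta scale the child effect is only a third of the maternal one, so it falls outside the factor-of-0.5 band.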

william-denault commented 3 years ago

Hello,

I checked my code; I actually used beta and standard error.

My script here

    beta <- cbind(dfadj$T1_minus_T0, dfmadj$T1_minus_T0)
    shat <- cbind(dfadj$sd, dfmadj$sd)
    data = mash_set_data(beta, shat, alpha = 1)
    U.c = cov_canonical(data)
    res.mash = mash(data, U.c)

I will rerun it with centered z-scores and let you know if I see this phenomenon again.

stephens999 commented 3 years ago

so I guess it is important to emphasize that the green points are those that mashr believes to have a much stronger effect (beta) in mother than child, and not those that are "mother specific".
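To make that distinction concrete, here is a base-R sketch of a per-effect version of the "within a factor" rule for two conditions. The function classify_pair and the toy pm and lfsr matrices are hypothetical and not part of mashr; in a real analysis the inputs might be taken from get_pm(res.mash) and get_lfsr(res.mash).

```r
# Label each effect given posterior means (pm) and lfsr for two conditions.
# "not shared" means the magnitudes (or signs) differ, not that the effect
# is absent in the other condition.
classify_pair <- function(pm, lfsr, factor = 0.5, lfsr_thresh = 0.05) {
  sig <- lfsr < lfsr_thresh
  any_sig <- sig[, 1] | sig[, 2]          # significant in at least one condition
  ratio <- pm[, 1] / pm[, 2]              # ratio of effect sizes
  shared <- any_sig & ratio > factor & ratio < 1 / factor
  label <- rep("not significant", nrow(pm))
  label[any_sig & !shared] <- "not shared"
  label[shared] <- "shared"
  label
}

# Toy inputs (invented values).
pm <- cbind(mother = c(0.30, 0.30, 0.01, 0.20),
            child  = c(0.25, 0.05, 0.01, -0.20))
lfsr <- cbind(mother = c(0.001, 0.001, 0.60, 0.01),
              child  = c(0.010, 0.200, 0.70, 0.01))

labels <- classify_pair(pm, lfsr)
```

The second toy effect has a nonzero posterior mean in the child (0.05) but is labelled "not shared" because it is six times smaller than the maternal effect; that is the sense in which a green point is "much stronger in mother" rather than "mother specific".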

william-denault commented 3 years ago

Thank you very much for helping me to interpret the output.

stephenslab / mashr

Running mash on non centered z-scores #89