cran / MendelianRandomization

:exclamation: This is a read-only mirror of the CRAN R package repository. MendelianRandomization — Mendelian Randomization Package
42 stars 26 forks source link

Possible bug in IVW/Egger with correlated variants #5

Open afschmidt opened 1 year ago

afschmidt commented 1 year ago

First let me express my gratitude for this amazing package.

Possibly there is a slight bug in the IVW/EGGER code for correlated variants.

For example in the Egger code https://github.com/cran/MendelianRandomization/blob/master/R/mr_egger-methods.R#L97

sigma <- solve(t(cbind(rep(1, nsnps), Bx))%*%solve(omega)%*%cbind(rep(1, nsnps), Bx))*max(sqrt(t(rse)%*%solve(omega)%*%rse/(nsnps-2)),1)
thetaEse <- sqrt(sigma[2,2])/min(1, rse.corr)
thetaInterse <- sqrt(sigma[1,1])/min(1, rse.corr)

The max(sqrt(t(rse)%*%solve(omega)%*%rse/(nsnps-2)),1) seems like it might possibly need to be sqrt(t(rse)%*%solve(omega)%*%rse/(nsnps-2)). This (as far as I understand) will make the min(1, rse.corr) in the 2 lines after that appropriately penalise the standard errors (so we get multiplicative random effect standard errors).

Many apologies if I am simply 100% wrong. Just wanted to flag this just in case it is helpful.

Best wishes,

Floriaan Schmidt

gaborcsardi commented 1 year ago

Hi, this is a read only mirror of CRAN, please see the package authors in the DESCRIPTION file. Look for Maintainer, BugReports and URL. Thanks!

sb452 commented 1 year ago

For IVW, I don't think there is a problem. If you run: mr_ivw(mr_input(ldlc[1:6], ldlcse[1:6], chdlodds[1:6], chdloddsse[1:6])) mr_ivw(mr_input(ldlc[1:6], ldlcse[1:6], chdlodds[1:6], chdloddsse[1:6]), model="fixed") mr_ivw(mr_input(ldlc[1:6], ldlcse[1:6], chdlodds[1:6], chdloddsse[1:6], cor=diag(rep(1,6)))) mr_ivw(mr_input(ldlc[1:6], ldlcse[1:6], chdlodds[1:6], chdloddsse[1:6], cor=diag(rep(1,6))), model="fixed") then you get the same answer each time. Similarly with: mr_ivw(mr_input(ldlc[1:14], ldlcse[1:14], chdlodds[1:14], chdloddsse[1:14])) mr_ivw(mr_input(ldlc[1:14], ldlcse[1:14], chdlodds[1:14], chdloddsse[1:14]), model="fixed") mr_ivw(mr_input(ldlc[1:14], ldlcse[1:14], chdlodds[1:14], chdloddsse[1:14], cor=diag(rep(1,14)))) mr_ivw(mr_input(ldlc[1:14], ldlcse[1:14], chdlodds[1:14], chdloddsse[1:14], cor=diag(rep(1,14))), model="fixed") In this case, there is heterogeneity, but still the answer is the same if the correlated variant or uncorrelated variant part of the code is used.

afschmidt commented 1 year ago

Thanks Steven,

My mistake for thinking this also affected the IVW code. Here is a code example for MR-Egger showing the difference between the correlated and uncorrelated parts of the code. I have also include some attempts to arrive at the SE of the uncorrelated code (which differs depending on the RSE being above or below 1).

> ### RSE below 1
> n = 6
> fit1 = mr_egger(mr_input(ldlc[1:n], ldlcse[1:n], chdlodds[1:n], chdloddsse[1:n]))
> fit2 = mr_egger(mr_input(ldlc[1:n], ldlcse[1:n], chdlodds[1:n], chdloddsse[1:n], cor=diag(rep(1,n))))
> fit1@StdError.Est
[1] 1.212094
> fit2@StdError.Est 
[1] 1.408568
> # correcting SE
> fit2@StdError.Est * fit2@RSE
[1] 1.212094

> ### RSE above 1
> fit1 = mr_egger(mr_input(ldlc, ldlcse, chdlodds, chdloddsse))
> fit2 = mr_egger(mr_input(ldlc, ldlcse, chdlodds, chdloddsse, cor=diag(rep(1,length(ldlc)))))
> fit1@StdError.Est
[1] 0.7701292
> fit2@StdError.Est
[1] 0.5535667
> # correcting SE
> fit2@StdError.Est * sqrt(fit2@RSE)
[1] 0.7701292

Hope this is more helpful than my original post. Again very happy to find out I am just wrong!

sb452 commented 1 year ago

Sorry, I got halfway through the response and got called away! Yes, it looks like this is an error in the mr_egger function with correlated variants. I think this is correct:

              sigma <- solve(t(cbind(rep(1, nsnps), Bx))%*%solve(omega)%*%cbind(rep(1, nsnps), Bx))*as.numeric(sqrt(t(rse)%*%solve(omega)%*%rse/(nsnps-2)))^2
              thetaEse <- sqrt(sigma[2,2])/min(1, rse.corr)
              thetaInterse <- sqrt(sigma[1,1])/min(1, rse.corr)

So we should multiply by as.numeric(sqrt(t(rse)%%solve(omega)%%rse/(nsnps-2))) whatever its value (and its value squared), but only divide by it if it is less than 1. I need to double check this when I'm less tired, but I think this is the resolution.

sb452 commented 1 year ago

Okay, it's the morning and I'm thinking more clearly. What I posted above works, but is convoluted - this is simpler:

              sigma <- solve(t(cbind(rep(1, nsnps), Bx))%*%solve(omega)%*%cbind(rep(1, nsnps), Bx))
              thetaEse <- sqrt(sigma[2,2])*max(1, rse.corr)
              thetaInterse <- sqrt(sigma[1,1])*max(1, rse.corr)

Will update the package in the next couple of weeks!

afschmidt commented 1 year ago

Thanks for looking into this. Very happy it turned out to be something useful after all!

Best wishes,

Floriaan.