JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)

Mendelian randomization with MTAG instruments #208

Open gitgenes opened 8 months ago

gitgenes commented 8 months ago

Assuming proper separation of samples so that no sample information is shared between exposure and outcome instruments, is there any intrinsic problem with using MTAG output summary statistics as instruments for a Mendelian randomization analysis?

paturley commented 8 months ago

I'd be a little nervous about using MTAG results in an MR setting, but I'm already pretty nervous about doing MR generally. The reason you might be additionally nervous in the case of MTAG results is that MTAG estimates are shaded towards the effect sizes of the secondary phenotypes that are used in MR. This error is normally captured by the standard errors, so it doesn't inflate the type-I error rate, but you might worry about it in a setting like MR, where you are interested in how similar two phenotypes are.

For example, let's say that you want to do an MR for the effect of Z on Y. Because Z and Y are genetically correlated, you first run MTAG on the two sets of summary statistics to boost power. The precision of the summary statistics will go up, but the correlation of the error in the summary statistics will also go up. So when you take the ratio of the summary statistics (or whatever other flavor of MR you are using), the correlation may be inflated (and therefore the MR estimate may also be inflated).

That said, I haven't tested this out. So if you run simulations and find that this problem is negligible, then maybe you are OK.
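A minimal starting point for such a simulation might look like the sketch below. It does not run MTAG itself. As a stand-in, it assumes that each MTAG-style estimate is a weighted average of the trait's own GWAS estimate and the other trait's estimate, with a hypothetical weight `w`, and it only tracks the estimation noise. The function names, the weight, and the noise scale are all illustrative assumptions.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_error_correlation(n_snps=20000, w=0.3, seed=0):
    """Compare the correlation of Z and Y estimation errors before and
    after a toy MTAG-like mixing step (NOT the real MTAG estimator)."""
    rng = random.Random(seed)
    # Independent GWAS noise per SNP, as with non-overlapping samples.
    err_z = [rng.gauss(0, 1) for _ in range(n_snps)]
    err_y = [rng.gauss(0, 1) for _ in range(n_snps)]
    # Assumed stand-in for MTAG: each trait borrows weight w from the other.
    mix_z = [(1 - w) * ez + w * ey for ez, ey in zip(err_z, err_y)]
    mix_y = [(1 - w) * ey + w * ez for ez, ey in zip(err_z, err_y)]
    return pearson(err_z, err_y), pearson(mix_z, mix_y)


corr_before, corr_after = simulate_error_correlation()
print(f"error correlation before mixing: {corr_before:.3f}")
print(f"error correlation after mixing:  {corr_after:.3f}")
```

In this toy setup the mixed errors have smaller variance than the originals (more precision), but they are also clearly correlated with each other, which is the concern described above. Whether the effect is large enough to matter with real MTAG weights is exactly what a fuller simulation would need to check.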


gitgenes commented 8 months ago

Thanks for your quick reply. I tried to say this but was not clear about it: I am intentionally ruling out any setup in which the exposure and outcome of an MR would be MTAG'ed together with one another.

As a more concrete example of what I have in mind, imagine running MTAG on LDL and HDL GWASes that were done in partially overlapping populations, and then using those MTAG'ed summary statistics as exposure instruments. The outcome instruments would be CAD from a different population (no MTAG involved in the CAD statistics).

So the risk in this example is a bit more subtle and hopefully less flagrant. Here, I am concerned that I could be imbuing HDL (which is normally null for CAD) with some LDL qualities, which might induce an apparent HDL-CAD association in MR.

To your point, I realize I could just run this and explore the properties. But I figured it was worth some discussion.

paturley commented 8 months ago

Yeah, I gave the extreme case to point out the type of problems that can arise. As in your example, if you did an MTAG of HDL and LDL, I'd worry about some cross-contamination of the two sets of summary statistics. So if any of the MTAG inputs are associated with CAD, the MTAG results may be falsely related to CAD even when the original GWAS results would not be.
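To make the cross-contamination worry concrete, here is a toy sketch rather than a real analysis. It assumes that LDL and HDL SNP effects are independent, that only LDL affects CAD, that all SNPs have equal standard errors, and that the MTAG'ed HDL estimate can be mimicked by mixing in the LDL estimate with a hypothetical weight `w`. None of these choices come from MTAG itself, and every name and parameter value is illustrative.

```python
import random


def ivw_slope(beta_exposure, beta_outcome):
    """Inverse-variance-weighted MR slope through the origin.
    With equal standard errors (assumed here) the weights cancel."""
    num = sum(x * y for x, y in zip(beta_exposure, beta_outcome))
    den = sum(x * x for x in beta_exposure)
    return num / den


def simulate_hdl_cad(n_snps=20000, w=0.3, ldl_on_cad=0.5,
                     noise_sd=0.2, seed=1):
    """Return the HDL->CAD MR slope using plain GWAS HDL estimates and
    using toy 'MTAG-like' HDL estimates contaminated by LDL."""
    rng = random.Random(seed)
    ldl_true = [rng.gauss(0, 1) for _ in range(n_snps)]
    hdl_true = [rng.gauss(0, 1) for _ in range(n_snps)]  # independent of LDL
    ldl_hat = [b + rng.gauss(0, noise_sd) for b in ldl_true]
    hdl_hat = [b + rng.gauss(0, noise_sd) for b in hdl_true]
    # CAD comes from a separate sample and depends on LDL only (HDL is null).
    cad_hat = [ldl_on_cad * b + rng.gauss(0, noise_sd) for b in ldl_true]
    # Assumed stand-in for MTAG: HDL estimate borrows weight w from LDL.
    hdl_mixed = [(1 - w) * h + w * l for h, l in zip(hdl_hat, ldl_hat)]
    return ivw_slope(hdl_hat, cad_hat), ivw_slope(hdl_mixed, cad_hat)


slope_gwas, slope_mixed = simulate_hdl_cad()
print(f"HDL->CAD slope, GWAS instruments:      {slope_gwas:.3f}")
print(f"HDL->CAD slope, MTAG-like instruments: {slope_mixed:.3f}")
```

Under these assumptions the GWAS-based slope stays near zero while the mixed instruments produce a clearly nonzero HDL-CAD slope, even though HDL has no effect on CAD in the simulation. A real check would need to run MTAG on realistic inputs, including the genetic correlation between HDL and LDL and the usual instrument-selection step.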
