JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

Questions about Second stage MTAG and baseline FDR #211

Open test12138jooh opened 1 month ago

test12138jooh commented 1 month ago

An article I came across that used MTAG has left me unsure whether I am using MTAG correctly.

  1. As I understand it, if I pass four datasets to MTAG at once for a multi-phenotype analysis, I still receive separate association results for each of the four phenotypes. There is no priority order: the "primary" phenotype is simply whichever one I choose to focus on, and it does not need to be specified when running MTAG. Am I correct?
  2. However, in the first stage of its MTAG analysis, this particular article analyzed three phenotypes separately, each with four datasets that all measured the same phenotype. These three phenotypes were then carried into a second-stage MTAG analysis. It therefore appears that, in the first stage, each phenotype yielded only one final MTAG association result. This seems closer to a meta-analysis, because when all inputs are the same phenotype you cannot designate a primary phenotype. Is that correct?
  3. Another question concerns the baseline maxFDR for MTAG. To obtain this value, should I run MTAG on each phenotype individually and record the baseline maxFDR for each one?

Perhaps I should direct the second question to the authors of the article, but I would greatly appreciate any suggestions you have as well. For reference, the article describes its design as follows:

In the first stage, for POAG MTAG analysis, we included datasets from: (1) 15,229 POAG cases and 177,473 controls of European descent excluding UKB samples; (2) 11,239 glaucoma cases and 137,621 controls of European descent in the UKB; (3) 1,358 glaucoma cases and 16,455 controls of European descent in the CLSA; (4) Mass General Brigham Biobank with 1,415 glaucoma cases and 18,632 controls.

Similarly, for VCDR, we ran MTAG analysis using data from: (1) 68,240 participants with VCDR (adjusted for vertical disc diameter) in the UKB of European descent; (2) 18,304 participants with VCDR (adjusted for vertical disc diameter) in the CLSA of European descent; (3) 25,180 participants with VCDR from IGGC of European descent.

In the second stage, the trait-specific MTAG outputs from the first stage were further included in MTAG analysis. One key advantage of this two-stage MTAG design was reduced computational burden compared with running MTAG analysis including all GWAS summary statistics for POAG, VCDR and IOP in a single job.

paturley commented 1 month ago

Hello,

You are right that, by default, MTAG treats all phenotypes symmetrically and produces summary statistics for each phenotype. There is an option in MTAG that allows you to force the genetic correlation to be one and/or the heritability to be the same across phenotypes. If you assume both, the summary statistics for all phenotypes will be identical. If you assume only perfect genetic correlation, the summary statistics will be constant multiples of each other. In either case, using any one of the summary statistics in a subsequent MTAG run would be equivalent in terms of the final p-values.

I don't know if that is what they did in this case though. You'd need to reach out to those authors for details on what they did in their paper.
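
To illustrate the point about perfect genetic correlation, here is a minimal Python sketch for a single SNP. It is not taken from the mtag code base. It assumes the per-SNP estimator from the MTAG paper as I understand it, with `omega` the genetic covariance matrix of effect sizes and `sigma` the covariance matrix of estimation error; the function names and all numbers are made up for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def mtag_snp(beta_hat, omega, sigma, t):
    """Return (estimate, standard error) for trait t at one SNP.

    Hypothetical helper: implements the assumed estimator
    num = w' V^-1 beta_hat, den = w' V^-1 w, with
    w = omega_t / omega_tt and V = omega - omega_t omega_t' / omega_tt + sigma.
    """
    n = len(beta_hat)
    w = [omega[i][t] / omega[t][t] for i in range(n)]
    V = [[omega[i][j] - omega[i][t] * omega[j][t] / omega[t][t] + sigma[i][j]
          for j in range(n)] for i in range(n)]
    num = dot(w, solve(V, beta_hat))
    den = dot(w, solve(V, w))
    return num / den, den ** -0.5


# Toy inputs (invented): three cohorts measuring the same trait.
beta_hat = [0.020, 0.050, 0.030]
sigma = [[1.0e-4, 2.0e-5, 0.0],
         [2.0e-5, 4.0e-4, 0.0],
         [0.0, 0.0, 2.25e-4]]

# Perfect genetic correlation means omega has rank one: omega = a a'.
a = [0.010, 0.020, 0.015]
omega = [[x * y for y in a] for x in a]

for t in range(3):
    est, se = mtag_snp(beta_hat, omega, sigma, t)
    print(t, est, se, est / se)
```

Under these assumptions the printed estimates and standard errors differ across traits only by the constant factors in `a`, so the Z-scores (and hence the p-values) coincide. Setting all entries of `a` equal, which corresponds to additionally assuming equal heritability, makes the estimates themselves identical.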
