JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

mtag beta estimates smaller than that in individual gwas? #101

Closed Xuemin-Wang closed 3 years ago

Xuemin-Wang commented 4 years ago

Dear authors and users,

Do you expect the beta output by MTAG to be smaller than that of the original gwas?

beta_mtag / beta_gwas = [Z_mtag / sqrt(N_mtag * var(SNP))] / [Z_gwas / sqrt(N_gwas * var(SNP))] = (Z_mtag / Z_gwas) * sqrt(N_gwas / N_mtag)

In the original mtag paper, Z_mtag and Z_gwas were a near-perfect fit (R2=0.996 for DEP and 0.999 for height). If so, the beta estimates from mtag will be smaller than those in the individual gwas because of the increase in the gwas-equivalent sample size for MTAG (e.g. from 354,861 to 449,649 for DEP). As more genetically correlated traits are included from non-overlapping samples of the same ancestry, one would expect the decrease of beta in mtag to be more substantial if Z_mtag and Z_gwas remain a near-perfect fit. However, Z_mtag will increase as well.
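To put a number on the sample-size term in the ratio above, here is a minimal Python sketch. It assumes the usual standardized-effect approximation beta ≈ Z / sqrt(N * var(SNP)) with var(SNP) = 2 * MAF * (1 - MAF). The function names, the example Z-score, and the allele frequency are hypothetical; only the two sample sizes come from the DEP example.

```python
import math

def standardized_beta(z, n, maf):
    """Approximate per-allele effect in phenotypic SD units from a Z-score."""
    var_snp = 2.0 * maf * (1.0 - maf)  # genotype variance under Hardy-Weinberg
    return z / math.sqrt(n * var_snp)

def beta_ratio(z_mtag, z_gwas, n_mtag, n_gwas):
    """beta_mtag / beta_gwas; var(SNP) cancels because it is the same SNP."""
    return (z_mtag / z_gwas) * math.sqrt(n_gwas / n_mtag)

# Sample sizes quoted for DEP; Z and MAF are made-up illustration values.
n_gwas, n_mtag = 354_861, 449_649
z, maf = 6.0, 0.3

# If the Z-score did not change at all, beta would shrink by sqrt(N_gwas / N_mtag).
shrink_if_z_unchanged = beta_ratio(z, z, n_mtag, n_gwas)

# Z-score growth needed for the beta to stay exactly the same.
z_growth_for_equal_beta = math.sqrt(n_mtag / n_gwas)

print(round(shrink_if_z_unchanged, 3), round(z_growth_for_equal_beta, 3))
```

Under these assumptions, an unchanged Z-score would give a beta roughly 11% smaller for DEP, and the Z-score would have to grow by about 13% for the beta to stay the same.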

I conducted mtag including 19 genetically correlated traits. Z_mtag is generally 2-5 times Z_gwas (see plot 1; only results for snps with pval < 5e-8 are displayed), but beta_mtag is about one order of magnitude smaller than beta_gwas (plot 2). Is this expected from mtag?

Many thanks, patrick

[image: plot 1, comparison of Z_mtag and Z_gwas] [image: plot 2, comparison of beta_mtag and beta_gwas]

Xuemin-Wang commented 4 years ago

Just saw the issues https://github.com/JonJala/mtag/issues/24#issuecomment-379473026 and https://github.com/JonJala/mtag/issues/10. The effective N solution might solve mine as well.
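For readers who have not followed those links: one commonly used definition of the effective sample size of a case-control GWAS is N_eff = 4 / (1/N_cases + 1/N_controls). The sketch below assumes that definition (the linked issues may recommend a different variant), and the cohort sizes are hypothetical.

```python
def effective_n(n_cases, n_controls):
    """Effective sample size of a case-control study: 4 / (1/cases + 1/controls)."""
    if n_cases <= 0 or n_controls <= 0:
        raise ValueError("case and control counts must be positive")
    return 4.0 / (1.0 / n_cases + 1.0 / n_controls)

# Hypothetical unbalanced cohort: 10,000 cases and 90,000 controls.
n_total = 10_000 + 90_000
n_eff = effective_n(10_000, 90_000)
print(n_total, round(n_eff))
```

With a balanced design this definition returns the total N; the more unbalanced the design, the further it falls below the total.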

paturley commented 4 years ago

Hi Patrick,

I'm glad you seem to have been able to sort this out. One other comment: the correlations of .999 for height and .996 for DEP were based on an analysis where we compared the Z-scores from a GWAS of the full UKB to MTAG results combining three partially-overlapping GWAS in the UKB. The union of the data in the MTAG analysis was equal to the data used in the full GWAS. It sounds like, in your example, each cohort adds additional data, which would theoretically increase the magnitude of the z-statistics (if MTAG's assumptions hold).

Best, (other) Patrick

Xuemin-Wang commented 4 years ago

Hi (other) Patrick,

Thanks very much for your comments.

I did run mtag with diverse traits of different cohorts. After using the effective N for binary traits, the size of z increased, e.g. from (-5, 5) in gwas to (-10, 10) in mtag of 16 traits (first graph).

[image: z_comparison_EEC_mtag16_sex_stratified] https://user-images.githubusercontent.com/23228449/89748324-f336dd80-db05-11ea-94d2-75ef67dd3247.png

However, the range and size of mtag beta are still smaller than gwas beta, e.g. (-0.2, 0.15) in mtag vs. (-0.4, 0.4) in gwas as shown in the 2nd graph. The regression coefficient and r2 were low.

    Coefficients:
                          Estimate Std. Error  t value Pr(>|t|)
    (Intercept)         -3.136e-05  6.129e-06   -5.117 3.11e-07
    mtag_ecac$gwas_beta  2.488e-01  1.680e-04 1480.793  < 2e-16

[image: beta_comparison_EEC_mtag16_sex_stratified] https://user-images.githubusercontent.com/23228449/89748158-2e84dc80-db05-11ea-8bee-74395bd815aa.png

Is this something expected with this many traits? Or do you think it's likely caused by the violation of mtag's assumptions with this many traits?

Many thanks, patrick

paturley commented 4 years ago

Hi Patrick,

The units of MTAG coefficients should be the same as if the phenotype had been standardized prior to the GWAS, so if your outcome variable is something that has variance greater than one, then you'd expect the magnitude of the effect sizes to be smaller after MTAG. It's also possible that if the genetic correlation between your traits is small, there will be slight attenuation of effect sizes corresponding to SNPs that are associated with some phenotypes but not others.
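As a concrete illustration of the unit change, a per-allele effect estimated on the raw phenotype scale can be converted to standardized units by dividing by the phenotype's standard deviation. The sketch below is only that arithmetic, with hypothetical numbers and a hypothetical function name; it is not MTAG code.

```python
def to_standardized_units(beta_raw, se_raw, phenotype_sd):
    """Rescale an effect and its standard error to phenotypic-SD units."""
    if phenotype_sd <= 0:
        raise ValueError("phenotype_sd must be positive")
    return beta_raw / phenotype_sd, se_raw / phenotype_sd

# Hypothetical SNP: 0.40 units per allele on a phenotype with SD 2.5 (variance 6.25).
beta_std, se_std = to_standardized_units(0.40, 0.05, 2.5)
print(beta_std, se_std)
```

Here a raw effect of 0.40 becomes about 0.16 in SD units, which is the kind of shrinkage to expect when the original phenotype had variance greater than one.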

Xuemin-Wang commented 4 years ago

Hi Patrick,

Thanks very much for your comments.

I have to read the individual articles/analyses again to double-check whether the phenotypes were standardised. What transformation of the gwas sumstats would you recommend before MTAG if some phenotypes were not standardised? The absolute values of the genetic correlations between traits varied from 0.1 to 0.5.

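The beta of lead variants also shrank after mtag, as shown in the graph: [image] https://user-images.githubusercontent.com/23228449/89790999-12655780-db66-11ea-9b31-dcaa8e19966f.png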

Many thanks,

paturley commented 4 years ago

What happens to the betas if you pass a single GWAS into MTAG (rather than all of them at once)?

Xuemin-Wang commented 4 years ago

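It looks okay for two traits, as shown here: [image] https://user-images.githubusercontent.com/23228449/89792051-66247080-db67-11ea-8a19-083686e2eb7b.png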

paturley commented 4 years ago

If the beta coefficients are the same for a small number of traits but there is substantial attenuation for many traits, then the standardization can't be the issue. My best guess is that you are getting into the range of MTAG where miscalibration of the Omega matrix can start to cause you problems. How many traits are you including in the analyses that you are worried about?
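One way to quantify the attenuation being discussed is the slope from regressing the MTAG beta on the original GWAS beta across SNPs, computed separately for each run (one trait, two traits, all traits). A slope near 1 means the betas are preserved; the regression reported earlier in this thread had a coefficient of about 0.25 on the GWAS beta. The sketch below uses made-up betas and a hypothetical helper name.

```python
def attenuation_slope(beta_gwas, beta_mtag):
    """OLS slope (with intercept) of beta_mtag regressed on beta_gwas."""
    n = len(beta_gwas)
    mean_g = sum(beta_gwas) / n
    mean_m = sum(beta_mtag) / n
    sxy = sum((g - mean_g) * (m - mean_m) for g, m in zip(beta_gwas, beta_mtag))
    sxx = sum((g - mean_g) ** 2 for g in beta_gwas)
    return sxy / sxx

# Made-up betas for five SNPs from the original GWAS.
beta_gwas = [-0.30, -0.10, 0.05, 0.20, 0.35]

# Made-up MTAG outputs: one run that preserves the betas, one that shrinks them.
beta_mtag_two_traits = [-0.29, -0.11, 0.05, 0.21, 0.34]
beta_mtag_many_traits = [0.25 * b for b in beta_gwas]

slope_two = attenuation_slope(beta_gwas, beta_mtag_two_traits)
slope_many = attenuation_slope(beta_gwas, beta_mtag_many_traits)
print(round(slope_two, 3), round(slope_many, 3))
```

Repeating this while adding one trait at a time would show at which point the slope starts to fall away from 1.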

Xuemin-Wang commented 4 years ago

So did you mean that including other traits would not shrink the beta of the trait of interest (e.g. trait A) after MTAG if A was standardised but some others were not? I didn't run iterations of mtag adding one more trait each time to check how many traits it takes to cause the issue. Of the traits included, some are negatively correlated.

paturley commented 4 years ago

Because MTAG works with the Z-statistics instead of directly with the betas, it shouldn't matter whether the phenotypes had been standardized or not. It also shouldn't matter if some traits are negatively correlated as long as MTAG's assumptions are not violated. The more traits you add though, the more likely it is that the assumptions will be violated. It's also possible that MTAG has some undesirable properties when the number of traits gets too high. We didn't stress test the software at extreme levels.
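The point about Z-statistics can be checked on toy data: rescaling a phenotype multiplies the regression slope and its standard error by the same factor, so their ratio is unchanged. The sketch below uses a hand-written simple linear regression and made-up genotype and phenotype values; it illustrates the general property and is not part of MTAG.

```python
import math

def ols_z(x, y):
    """Return (slope, standard error, Z) for a simple regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance over Sxx
    return slope, se, slope / se

# Made-up toy data: allele counts and a phenotype on its raw scale.
genotype = [0, 0, 1, 1, 1, 2, 2, 0, 1, 2]
phenotype = [9.1, 10.4, 11.0, 12.3, 10.8, 13.9, 12.6, 9.8, 11.9, 13.1]

# The same phenotype expressed in different units (multiplied by 10).
rescaled = [10.0 * v for v in phenotype]

b1, se1, z1 = ols_z(genotype, phenotype)
b2, se2, z2 = ols_z(genotype, rescaled)
print(b2 / b1, z1, z2)
```

The slope changes by the scaling factor while the two Z values agree up to floating-point error.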
