JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

Results with --use_beta_se parameter differ from results without #74

Open cmshuang opened 5 years ago

cmshuang commented 5 years ago

Hi,

We have GWAS summary statistics for armfat percent, trunkfat percent, and waistc with the following columns (reformatted to fit MTAG defaults):

chr bpos a1 TEST n beta se L95 U95 z pval a2 freq snpid

We ran MTAG once without the --use_beta_se parameter, then tried running it with --use_beta_se on the files as is, but ran into the issue addressed by #57, so we removed the TEST, L95, U95, and z columns before running with --use_beta_se again. The reported omega estimates for both are as follows:

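[image: image] https://user-images.githubusercontent.com/49325084/62171395-acec5300-b2e3-11e9-9917-6d6f0656a39b.png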

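We graphed the MTAG betas against the original GWAS betas for both runs:

[image: image] https://user-images.githubusercontent.com/49325084/62173278-4cacdf80-b2ea-11e9-8dfd-a615e2caff02.png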

We also graphed the betas generated by line 801 (before the actual MTAG calculation) in the code against the original GWAS betas for both runs:

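[image: image] https://user-images.githubusercontent.com/49325084/62236447-f7bda780-b383-11e9-8f89-9606e297eb26.png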

Why is there a discrepancy between the generated GWAS betas and the original GWAS betas for both runs, and between the results of the two runs?

Thank you!
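
For readers who hit the same column issue, the column removal described in this comment could be done along these lines. This is only a sketch: it assumes a tab-delimited file with the header shown above, the `drop_columns` helper is hypothetical (it is not part of MTAG), and the data row is made up.

```python
import csv
import io

# Columns the comment reports removing before rerunning with --use_beta_se.
DROP = {"TEST", "L95", "U95", "z"}

def drop_columns(text, drop=DROP, delimiter="\t"):
    """Return the table in `text` without the columns named in `drop`."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    keep = [name for name in reader.fieldnames if name not in drop]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=keep, delimiter=delimiter,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns not in `keep` are silently skipped
    return out.getvalue()

# Made-up single-row example using the header from this issue.
example = (
    "chr\tbpos\ta1\tTEST\tn\tbeta\tse\tL95\tU95\tz\tpval\ta2\tfreq\tsnpid\n"
    "1\t1000\tA\tADD\t5000\t0.02\t0.01\t0.0004\t0.0396\t2.0\t0.0455\tG\t0.3\trs1\n"
)
cleaned = drop_columns(example)
print(cleaned)
```

If the real files are space-delimited rather than tab-delimited, the `delimiter` argument would need to change accordingly.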

paturley commented 5 years ago

Hi!

Sorry for the delayed response. A couple quick questions.

1) Are you running MTAG on one trait at a time here? I would have thought that the Omega values you report would be 3x3 matrices, but since they are scalars, maybe I'm misunderstanding what you are doing.

2) Is the phenotype standardized before you conduct the GWAS? The Z-N option in MTAG will produce estimates that correspond to a standardized phenotype. If the phenotype wasn't standardized, then the betas will be off by a scalar multiple corresponding to the standard deviation of the phenotype. Perhaps that resolves your question.

Best, Patrick
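
The scalar relationship in point 2 can be illustrated with a small sketch. It assumes the only difference between the two GWAS runs is dividing the phenotype by its standard deviation; the function name and the numbers are made up for illustration.

```python
def to_standardized_phenotype(beta, se, sd_y):
    """Rescale an effect estimated on a raw phenotype to phenotype-SD units."""
    return beta / sd_y, se / sd_y

# Made-up values: effect and standard error on the raw scale, phenotype SD.
beta_raw, se_raw, sd_y = 0.50, 0.10, 4.0

beta_std, se_std = to_standardized_phenotype(beta_raw, se_raw, sd_y)

# Beta and SE shrink by the same factor, so the z-statistic is unchanged.
z_raw = beta_raw / se_raw
z_std = beta_std / se_std
print(beta_std, se_std, z_raw, z_std)
```

Because the z-statistic is unaffected, a plot of raw-scale betas against standardized-scale betas would fall on a straight line through the origin with slope equal to the phenotype SD.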

cmshuang commented 5 years ago

Hi Patrick,

Thank you very much for your response! We ran MTAG on all three traits at the same time--the table is actually two 3x3 matrices, just with columns and rows labelled with the traits. Sorry for the confusion!

The GWAS betas are for phenotypes standardized to mean 0 and unit variance (run using Plink's --linear and --standard-beta options).

Thanks again!

paturley commented 5 years ago

Does --standard-beta also give results based on standardizing the genotype, or just the phenotype?

cmshuang commented 5 years ago

The results are given based on both standardizing the genotype and the phenotype.

paturley commented 5 years ago

Ah OK. I believe that MTAG assumes that the GWAS results are based on standardized phenotypes but unstandardized genotypes. What happens when you first transform your summary statistics back to allele-count units before you pass them into MTAG?
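
One way the suggested back-transformation might look is sketched below. It assumes Hardy-Weinberg equilibrium, so that the standard deviation of the allele count is sqrt(2p(1-p)), and that `freq` is the frequency of the effect allele. The helper name and values are illustrative only and do not come from MTAG or plink.

```python
import math

def to_allele_count_units(beta_std_geno, se_std_geno, freq):
    """Convert a per-genotype-SD effect into a per-allele effect.

    Assumes the genotype SD is sqrt(2 * p * (1 - p)), i.e. Hardy-Weinberg
    equilibrium with effect-allele frequency p = freq.
    """
    sd_g = math.sqrt(2.0 * freq * (1.0 - freq))
    return beta_std_geno / sd_g, se_std_geno / sd_g

# Made-up example: a SNP with allele frequency 0.5.
beta_allele, se_allele = to_allele_count_units(0.010, 0.005, 0.5)
print(beta_allele, se_allele)
```

The scaling factor depends on the allele frequency, so unlike the phenotype rescaling it differs from SNP to SNP and would not show up as a single straight line in a beta-versus-beta plot.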

cmshuang commented 5 years ago

Hi Patrick,

Could you clarify what you mean? The MTAG paper (p. 230) states that

In the framework that follows, all traits and genotypes are standardized to have mean zero and variance one.

Thank you!

paturley commented 5 years ago

It's true that for the theory in the paper, we made that assumption to simplify the algebra. But we wrote the code to correspond to standard GWAS summary statistics, where the genotypes have not been standardized. Sorry for the confusion.

cmshuang commented 5 years ago

Understood, thank you very much for the clarification! Will get back to you with the results.

cmshuang commented 5 years ago

Hi Patrick,

We ran the GWAS using plink 2.0, standardizing only the phenotypes and the covariates, then ran MTAG again.

We graphed the MTAG betas against the original GWAS betas (from the newly generated summary statistics):

image

We also graphed the betas generated by line 801 (before the actual MTAG calculation) in the code against the original GWAS betas (from the newly generated summary statistics):

image

It appears that there is still a discrepancy between the generated GWAS betas and the original GWAS betas for both runs and between the results.

cmshuang commented 5 years ago

Some other possibly interesting results:

For the use_beta_se option, the betas generated by line 801 in the code match the original GWAS betas with the standardized phenotypes and genotypes, but this is not the case for the default option.

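[image: image] https://user-images.githubusercontent.com/49325084/62958448-90ecb500-bdab-11e9-8889-8afb58973fb2.png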

For the default option, the MTAG results for the summary statistics with unstandardized genotypes match the MTAG results for the summary statistics with standardized genotypes, but this is not the case for the use_beta_se option.

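[image: image] https://user-images.githubusercontent.com/49325084/62959979-8b449e80-bdae-11e9-97d4-f8d5e107628f.png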

The MTAG results for the summary statistics with unstandardized genotypes run with the use_beta_se option also do not match the MTAG results for the default option (with either unstandardized or standardized genotypes, since those two sets of results are the same).

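[image: image] https://user-images.githubusercontent.com/49325084/62960114-d959a200-bdae-11e9-8621-8dac76dfea0f.png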

paturley commented 5 years ago

I'm having a hard time following what the different sets of summary stat results are.

Can you define the axes for me? E.g., what's the difference between Original GWAS betas and Estimated GWAS betas?

cmshuang commented 5 years ago

The original GWAS betas are the betas directly from the summary statistics. Estimated GWAS betas are the betas calculated by line 801 in the code to be used in the MTAG calculation (per our understanding of the code, at least). MTAG betas are the resulting betas of the MTAG calculation.

Please let me know if you need any further clarifications!
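
For context, a common way to recover per-allele betas from z-statistics and sample sizes (the "Z-N" style of input) is sketched below. Whether line 801 of the MTAG code computes exactly this is not confirmed anywhere in this thread, so treat the formula as an assumption: it presumes a phenotype with unit variance, Hardy-Weinberg equilibrium, and a small per-SNP effect. The function name and example values are made up.

```python
import math

def beta_se_from_z_n(z, n, freq):
    """Approximate per-allele beta and SE from z, sample size and allele frequency.

    Assumes a unit-variance phenotype and a genotype variance of 2 * p * (1 - p),
    so that SE is roughly 1 / sqrt(n * 2 * p * (1 - p)) and beta = z * SE.
    """
    se = 1.0 / math.sqrt(n * 2.0 * freq * (1.0 - freq))
    return z * se, se

# Made-up example: z = 2, n = 5000, allele frequency 0.5.
beta_est, se_est = beta_se_from_z_n(2.0, 5000, 0.5)
print(beta_est, se_est)
```

Under these assumptions, an "estimated" beta would match the "original" beta only if the original GWAS used a unit-variance phenotype (after any covariate adjustment) and unstandardized genotypes, which is the point the rest of the thread works toward.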

paturley commented 5 years ago

Do you standardize the phenotype before or after residualizing for covariates?

cmshuang commented 5 years ago

Hi Patrick,

So sorry for the delayed response--from looking at the plink 2.0 order of operations, it seems that the phenotypes are standardized before residualizing for covariates: http://www.cog-genomics.org/plink/2.0/order

Thanks!

paturley commented 5 years ago

I think that is it. The Z-N version of MTAG returns results that correspond to phenotypes that have been standardized after being residualized. Can you verify that your results look sensible if you residualize first?
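
The effect of the ordering can be seen in a toy simulation. The sketch below is not plink or MTAG; it uses one made-up covariate and one made-up SNP, and plain least squares via the Frisch-Waugh-Lovell theorem. It compares the SNP coefficient when the phenotype is standardized first (and then adjusted for the covariate in the regression) against the coefficient when the phenotype is residualized first and then standardized.

```python
import random
import statistics

def residualize(y, x):
    """Residuals of y after a simple regression on x (with intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [yi - my - slope * (xi - mx) for xi, yi in zip(x, y)]

def adjusted_slope(y, g, c):
    """Coefficient on g when regressing y on g and c (Frisch-Waugh-Lovell)."""
    g_res = residualize(g, c)
    y_res = residualize(y, c)
    return sum(a * b for a, b in zip(g_res, y_res)) / sum(a * a for a in g_res)

def standardize(v):
    """Scale v to mean 0 and (population) variance 1."""
    m, s = statistics.fmean(v), statistics.pstdev(v)
    return [(vi - m) / s for vi in v]

rng = random.Random(0)
n = 2000
c = [rng.gauss(0, 1) for _ in range(n)]                              # covariate
g = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n)]    # allele counts
y = [0.2 * gi + 1.5 * ci + rng.gauss(0, 1) for gi, ci in zip(g, c)]  # phenotype

# Order 1: standardize the raw phenotype, then adjust for the covariate.
beta_std_first = adjusted_slope(standardize(y), g, c)

# Order 2: residualize on the covariate, then standardize the residual.
beta_resid_first = adjusted_slope(standardize(residualize(y, c)), g, c)

ratio = beta_resid_first / beta_std_first
expected = statistics.pstdev(y) / statistics.pstdev(residualize(y, c))
print(ratio, expected)
```

In this setup the two betas differ by the ratio of the raw phenotype SD to the residual SD, a single scalar shared by all SNPs. The more variance the covariates explain, the larger the gap.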
