JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

Err during "Estimating sigma.." #209

Open test12138jooh opened 4 months ago

test12138jooh commented 4 months ago

Dear professor, I ran into a problem when running MTAG on 10 traits; the error log is shown below. I also tried excluding all variants containing the "AGAAGA" genotype, but it still gave the same error output. I would truly appreciate your help.

2024/04/22/10:36:59 PM Estimating sigma..
2024/04/22/10:37:09 PM 'AGAAGA'
Traceback (most recent call last):
  File "/local/StandTools/mtag/mtag.py", line 1577, in <module>
    mtag(args)
  File "/local/StandTools/mtag/mtag.py", line 1358, in mtag
    args.sigma_hat = estimate_sigma(DATA[not_SA], args)
  File "/local/StandTools/mtag/mtag.py", line 472, in estimate_sigma
    rg_results =  sumstats_sig.estimate_rg(args_ldsc_rg, Logger_to_Logging())
  File "/local/StandTools/mtag/ldsc_mod/ldscore/sumstats.py", line 442, in estimate_rg
    loop = _read_other_sumstats(args, log, None, sumstats, ref_ld_cnames,sumstats2=p2)
  File "/local/StandTools/mtag/ldsc_mod/ldscore/sumstats.py", line 494, in _read_other_sumstats
    loop['Z2'] = _align_alleles(loop.Z2, alleles)
  File "/local/StandTools/mtag/ldsc_mod/ldscore/sumstats.py", line 567, in _align_alleles
    z *= (-1) ** alleles.apply(lambda y: FLIP_ALLELES[y])
  File "/local/anaconda3/envs/zj/envs/mtag/lib/python2.7/site-packages/pandas/core/series.py", line 3591, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas/_libs/lib.pyx", line 2217, in pandas._libs.lib.map_infer
  File "/local/StandTools/mtag/ldsc_mod/ldscore/sumstats.py", line 567, in <lambda>
    z *= (-1) ** alleles.apply(lambda y: FLIP_ALLELES[y])
KeyError: 'AGAAGA'
2024/04/22/10:25:05 PM Analysis terminated from error at Mon Apr 22 22:25:05 2024
2024/04/22/10:25:05 PM Total time elapsed: 1.0m:11.41s

This is my command.

mtag.py \
--sumstats  tmp1.tsv,tmp2.tsv,tmp3.tsv \
--snp_name SNP \
--a1_name A1 \
--a2_name A2 \
--eaf_name MAF \
--z_name z_score \
--n_name NMISS \
--chr_name CHR \
--bpos_name BP \
--p_name P \
--maf_min 0 \
--n_min 0 \
--force \
--ld_ref_panel eas_ldscores_c/ \
--out mtag_result 

Thanks again.

Best, JOOH

JonJala commented 4 months ago

Are you sure you filtered out all the "AGAAGA" SNPs? It looks like there are perhaps still some in your sample based on that error message.
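
One possible reason a filter on "AGAAGA" appears to have no effect: in the LDSC code bundled with MTAG, the string looked up in `FLIP_ALLELES` is, as far as I can tell, the concatenation of four allele strings (A1 and A2 from each of two files) rather than a single allele or genotype. If that is right, `'AGAAGA'` would never appear in any one column; it could arise from, for example, `AG`/`A` in both files. The sketch below works under that assumption. `allele_key` and `find_non_snv_keys` are hypothetical helpers, not part of MTAG or LDSC, and the rows are made up.

```python
# Assumption: the lookup key is A1 + A2 from one file followed by A1 + A2
# from the other, so a valid SNV key has exactly four bases.
VALID_BASES = set("ACGT")


def allele_key(a1, a2, a1x, a2x):
    """Concatenate four allele strings the way the key is assumed to be built."""
    return (a1 + a2 + a1x + a2x).upper()


def find_non_snv_keys(rows):
    """Return (snp, key) pairs whose key could not be a four-base SNV key.

    `rows` is an iterable of (snp, a1, a2, a1x, a2x) tuples.
    """
    bad = []
    for snp, a1, a2, a1x, a2x in rows:
        key = allele_key(a1, a2, a1x, a2x)
        if len(key) != 4 or not set(key) <= VALID_BASES:
            bad.append((snp, key))
    return bad


# Made-up merged rows: one ordinary SNV and one deletion present in both files.
merged = [
    ("rs_example_1", "A", "G", "A", "G"),
    ("1:1000:AG:A", "AG", "A", "AG", "A"),
]
print(find_non_snv_keys(merged))  # [('1:1000:AG:A', 'AGAAGA')]
```

Under this reading, the rows to remove are those where any allele is longer than one base, not rows that literally contain "AGAAGA".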

test12138jooh commented 4 months ago

Yes, thanks for your reply. Does this mean that MTAG can only be applied to SNVs and not to indels?

paturley commented 4 months ago

The current MTAG software can only handle SNVs, though as you saw in another issue, it sounds like it's not too complicated to edit your local instance of MTAG to accept non-SNV data.
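
Short of editing MTAG itself, one workaround is to drop non-SNV rows from every input file before running it. The following is a minimal sketch, assuming tab-separated files that use the `A1` and `A2` column names from the command above. `keep_snvs` is a hypothetical helper and the sample rows are made up for illustration.

```python
import csv
import io

VALID_BASES = {"A", "C", "G", "T"}


def keep_snvs(in_handle, out_handle, a1_col="A1", a2_col="A2"):
    """Copy a tab-separated sumstats file, keeping only single-base A/C/G/T variants.

    Returns (rows kept, rows dropped).
    """
    reader = csv.DictReader(in_handle, delimiter="\t")
    writer = csv.DictWriter(out_handle, fieldnames=reader.fieldnames,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    kept = dropped = 0
    for row in reader:
        if row[a1_col].upper() in VALID_BASES and row[a2_col].upper() in VALID_BASES:
            writer.writerow(row)
            kept += 1
        else:
            dropped += 1
    return kept, dropped


# Made-up input; a real file would be opened with open(path, newline="").
sample = "SNP\tA1\tA2\tz_score\nrs_example_1\tA\tG\t1.2\n1:1000:AG:A\tAG\tA\t-0.4\n"
out = io.StringIO()
print(keep_snvs(io.StringIO(sample), out))  # (1, 1)
print(out.getvalue())
```

The same filter would need to be applied to each file passed to `--sumstats`, since a single remaining indel in any of them can trigger the lookup failure.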

test12138jooh commented 4 months ago

Thank you for your response! I've noticed that including only SNVs works well. Moreover, when utilizing the European reference panel, it effectively manages indels, unlike with other reference panels where it encounters failures. Is it possible that the reference panel itself is causing issues? Another concern is the SNP ID when using WGS data, as it's composed of chr:bp:ref:alt, with many not annotated by rsID.

paturley commented 4 months ago

I suspect that the problem is because LDSC doesn't allow for non-SNVs and may require rsids (which would be a problem with the reference data, as you said), and MTAG inherited those issues from LDSC. There may be an easy fix for this in the MTAG code, but I don't currently have bandwidth to try to carefully update MTAG to work with WGS data. Sorry.
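
For the SNP ID concern, one option is to rewrite `chr:bp:ref:alt` IDs to rsIDs before running MTAG so that they match the IDs used in the LD score files. The following is a minimal sketch, assuming you already have a position-to-rsID lookup for the same genome build. `parse_variant_id`, `to_rsid`, and the lookup contents are hypothetical.

```python
def parse_variant_id(variant_id):
    """Split 'chr:bp:ref:alt' into (chrom, pos, ref, alt); return None if it does not fit."""
    parts = variant_id.split(":")
    if len(parts) != 4 or not parts[1].isdigit():
        return None
    chrom = parts[0][3:] if parts[0].lower().startswith("chr") else parts[0]
    return chrom, int(parts[1]), parts[2].upper(), parts[3].upper()


def to_rsid(variant_id, lookup):
    """Map a variant ID to an rsID via `lookup`, trying both allele orders.

    `lookup` is a dict keyed by (chrom, pos, allele, allele). IDs that cannot be
    parsed or found are returned unchanged so the caller can decide what to do.
    """
    parsed = parse_variant_id(variant_id)
    if parsed is None:
        return variant_id
    chrom, pos, ref, alt = parsed
    return lookup.get((chrom, pos, ref, alt),
                      lookup.get((chrom, pos, alt, ref), variant_id))


# Hypothetical lookup entry, for illustration only (not a real rsID).
lookup = {("1", 1000, "A", "G"): "rs_example_1"}
print(to_rsid("chr1:1000:G:A", lookup))  # rs_example_1
print(to_rsid("2:500:C:T", lookup))      # 2:500:C:T (no match, left unchanged)
```

Variants that stay unmapped would still be absent from an rsID-keyed reference panel, so this only helps for variants the panel actually contains.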

test12138jooh commented 2 months ago

Dear professor, it seems that MTAG produces larger effect sizes. How can this be explained? Is it normal? The log file is attached. I hope to hear from you. Thanks!

The change is large and seems unreliable.

type | beta | SE | P
raw_result | -0.01 | 0.01852 | 0.71
mtag | -0.0623 | 0.0136 | 4.58E-06

MTAG.log

paturley commented 2 months ago

Hi,

Is this just for a single SNP? MTAG results are based on standardized effects, so if you want a fair comparison, you need to compare the estimates after running MTAG on the single trait to the two-trait MTAG that you report above. Your log file looks mostly reasonable to me though.

test12138jooh commented 2 months ago

Thanks for your reply. More specifically, I compared the raw SNP results from the GWAS summary data with the MTAG results, and I found that the effect sizes of some SNPs changed greatly. How can this change be explained? Is this locus reliable? Are the effect sizes from MTAG trustworthy? For the SNP I showed above:

In pheno1 GWAS summary data: beta=-0.01; SE=0.01852; P=0.71
In pheno2 GWAS summary data: beta=-0.024; SE=0.0035; P=4.30E-12
In MTAG result of pheno1: beta=-0.0623; SE=0.0136; P=4.58E-06

paturley commented 2 months ago

As I said, MTAG effect sizes are in standardized units. That is, it's the effect of a one-allele change in the SNP on the number of standard deviations of the phenotype. The original GWAS betas would just be in units of the original phenotype. I presume that the difference is due to that, but it could be a lot of other things too. Generally, for any meta-analysis-like procedure, some SNPs may change substantially just due to chance.
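
To put a raw GWAS estimate on the per-standard-deviation scale described here, both the beta and its SE can be divided by the phenotype's standard deviation. The following is a minimal sketch; `to_sd_units` is a hypothetical helper and the numbers are made up.

```python
def to_sd_units(beta, se, pheno_sd):
    """Rescale a per-allele effect and its SE from phenotype units to SD units."""
    if pheno_sd <= 0:
        raise ValueError("pheno_sd must be positive")
    return beta / pheno_sd, se / pheno_sd


# Made-up numbers: an effect of 0.5 phenotype units when the phenotype SD is 2.0.
beta_std, se_std = to_sd_units(0.5, 0.1, 2.0)
print(beta_std, se_std)  # the z-statistic beta / se is unchanged by the rescaling
```

If the phenotype was already standardized to unit variance before the GWAS, `pheno_sd` is 1 and the rescaling changes nothing.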

test12138jooh commented 2 months ago

Thank you. But I normalized the phenotype during the GWAS analysis. So can the beta from the MTAG analysis be used for reporting?

paturley commented 2 months ago

I think you should be fine then.

test12138jooh commented 2 months ago

Thank you again for your patience in answering my questions.

test12138jooh commented 2 months ago

However, I noticed that the effect of some SNPs has also changed. Is this also normal? Is there any solution for these SNPs, such as a heterogeneity test as in meta-analysis?

RAW phenotype1 GWAS summary: beta=-0.008; SE=0.02; P=0.622
RAW phenotype2 GWAS summary: beta=0.0173; SE=0.0031; P=1.99E-08
MTAG phenotype1 summary: beta=0.0222; SE=0.0046; P=1.79E-06

paturley commented 2 months ago

Do you mean the sign has changed?

test12138jooh commented 2 months ago

yes

paturley commented 2 months ago

Looking at these standard errors, you have very little information about the true sign before and still pretty limited information afterwards. None of these results are very close to genome-wide significance.
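
The point about the sign can be checked directly from the quoted beta and SE values. The sketch below recomputes a z-statistic, a two-sided normal p-value, and a 95% interval for the phenotype 1 estimates; `z_and_p` and `interval` are hypothetical helpers. Because the quoted values are rounded, the recomputed p-values will not match the reported ones exactly.

```python
import math


def z_and_p(beta, se):
    """Return the z-statistic and two-sided normal p-value for an estimate."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def interval(beta, se, crit=1.96):
    """Return beta +/- crit * se as a (low, high) tuple."""
    return beta - crit * se, beta + crit * se


# Phenotype 1 values quoted earlier in this thread.
for label, beta, se in [("GWAS", -0.008, 0.02), ("MTAG", 0.0222, 0.0046)]:
    z, p = z_and_p(beta, se)
    low, high = interval(beta, se)
    print(label, round(z, 2), p, (round(low, 4), round(high, 4)), p < 5e-8)
```

The GWAS interval spans zero, so the original estimate says very little about the direction of the effect, and neither p-value falls below 5e-8.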

test12138jooh commented 2 months ago

So, does this mean that the MTAG result is not informative enough when the SNP does not reach genome-wide significance? I feel that the results for these SNPs are not very reliable. Is it possible that the much larger sample size of my pheno2 (N=160k) compared to pheno1 (N=7k) caused this situation?

paturley commented 2 months ago

I don't think so. Your problem is that you are looking at SNPs that are imprecisely estimated in both the GWAS and the MTAG. It is not surprising that a SNP that has such a large p-value switches signs. This is not an MTAG problem as much as a power problem.

On Thu, Jun 13, 2024 at 11:36 AM test12138jooh @.***> wrote:

214 https://github.com/JonJala/mtag/issues/214

Will this be a solution to my problem?


test12138jooh commented 2 months ago

Thanks for your reply. It is true that the SNP is imprecisely estimated in the GWAS (P-value=0.6). But how do you define an imprecisely estimated SNP in MTAG? It still has a relatively low P-value in MTAG (1.79E-06), although it does not reach genome-wide significance.

paturley commented 2 months ago

You interpret MTAG p-values the same way that you'd interpret GWAS p-values. After a multiple testing correction, you can't reject the null that this SNP has a zero effect.
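
As a concrete illustration of the multiple-testing correction, the conventional genome-wide threshold of 5e-8 corresponds to a Bonferroni correction of 0.05 for roughly one million independent tests. The following is a minimal sketch; the function names are hypothetical and the number of tests is an assumption, not something MTAG reports.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def rejects_null(p_value, alpha=0.05, n_tests=1_000_000):
    """True if p_value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


print(bonferroni_threshold(0.05, 1_000_000))
print(rejects_null(1.79e-06))  # MTAG p-value quoted above: False
```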

test12138jooh commented 2 months ago

But here is another example:

RAW phenotype1 GWAS summary: beta=-0.0060; SE=0.01; P=0.665
RAW phenotype2 GWAS summary: beta=-0.024; SE=0.0025; P=4.73E-21
MTAG phenotype1 summary: beta=-0.035; SE=0.0041; P=1.35E-17

This SNP is precisely estimated in MTAG but not in the GWAS, and the beta has changed greatly.

paturley commented 2 months ago

But it really hasn't. Given the magnitude of the SE, the genome-wide significant confidence interval on the initial estimate is

beta +/- 5.45 * SE = [-.06, .05]

The MTAG estimate is well within this range.
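
The interval above can be reproduced as follows. The multiplier 5.45 is the two-sided normal critical value for p = 5e-8. This is a minimal sketch with hypothetical helper names; `statistics.NormalDist` requires Python 3.8 or later.

```python
from statistics import NormalDist


def critical_z(p_threshold=5e-8):
    """Two-sided normal critical value for a given p-value threshold."""
    return NormalDist().inv_cdf(1 - p_threshold / 2)


def gws_interval(beta, se, p_threshold=5e-8):
    """Return beta +/- critical_z * se as a (low, high) tuple."""
    half_width = critical_z(p_threshold) * se
    return beta - half_width, beta + half_width


# Phenotype 1 GWAS estimate (beta=-0.0060, SE=0.01) and MTAG estimate (-0.035).
low, high = gws_interval(-0.0060, 0.01)
print(round(critical_z(), 2), round(low, 2), round(high, 2))  # 5.45 -0.06 0.05
print(low <= -0.035 <= high)  # True
```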

test12138jooh commented 2 months ago

So, can I consider the MTAG results reasonable as long as they fall within the confidence interval of the original results?

paturley commented 2 months ago

You can consider the MTAG results consistent with the GWAS results if there aren't statistically significant differences between them. That is not equivalent to being in the confidence interval, but it's a close approximation.
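
One way to test for a statistically significant difference is a z-test on the difference between the two estimates. The sketch below is a naive version with a hypothetical helper, `difference_test`. It needs the covariance between the two estimates, which is not reported in this thread; because the MTAG estimate is built partly from the same GWAS estimate, the two are not independent, so `cov=0` is only a placeholder assumption and the result should be read as rough.

```python
import math


def difference_test(beta_a, se_a, beta_b, se_b, cov=0.0):
    """z-test for beta_a - beta_b; `cov` is the covariance between the two estimates.

    Returns (z, two-sided p-value).
    """
    var_diff = se_a ** 2 + se_b ** 2 - 2.0 * cov
    if var_diff <= 0:
        raise ValueError("variance of the difference must be positive")
    z = (beta_a - beta_b) / math.sqrt(var_diff)
    return z, math.erfc(abs(z) / math.sqrt(2))


# MTAG vs. GWAS estimate for phenotype 1, treating them as independent (cov=0),
# which is an assumption that does not strictly hold here.
z, p = difference_test(-0.035, 0.0041, -0.0060, 0.01)
print(round(z, 2), p)
```

The resulting p-value should be judged against whatever threshold is appropriate for the number of SNPs being compared, not against 0.05 for a single SNP picked out after the fact.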

test12138jooh commented 2 months ago

Thank you for your help.