JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

ValueError: Could not find SNP column. #191

Open caozilongsuper opened 1 year ago

caozilongsuper commented 1 year ago

I have prepared the input files with the columns chr,bpos,a1,a2,snpid,beta,se,pval,freq,N,z. When I run MTAG, I get an error. Here is the log:

Calling ./mtag.py \
--stream-stdout \
--n-min 0.0 \
--sumstats finngen_adeno_tomtag.txt,finngen_squam_tomtag.txt,finngen_uteri_tomtag.txt \
--out /share/CZL/result/cancer/Cervix/finngen_mtag.1NS

Beginning MTAG analysis...
MTAG will use the Z column for analyses.
Read in Trait 1 summary statistics (20155970 SNPs) from finngen_adeno_tomtag.txt ...
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging Trait 1 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

ERROR converting summary statistics:

Traceback (most recent call last):
  File "/share/CZL/MTAG/mtag-master/mtag_munge.py", line 802, in munge_sumstats
    raise ValueError('Could not find {C} column.'.format(C=c))
ValueError: Could not find SNP column.

Conversion finished at Mon Oct 16 21:29:12 2023
Total time elapsed: 0.01s
Could not find SNP column.
Traceback (most recent call last):
  File "/share/CZL/MTAG/mtag-master/mtag.py", line 1577, in <module>
    mtag(args)
  File "/share/CZL/MTAG/mtag-master/mtag.py", line 1343, in mtag
    DATA_U, DATA, args = load_and_merge_data(args)
  File "/share/CZL/MTAG/mtag-master/mtag.py", line 273, in load_and_merge_data
    GWAS_d[p], sumstats_format[p] = _perform_munge(args, GWAS_d[p], gwas_dat_gen, p)
  File "/share/CZL/MTAG/mtag-master/mtag.py", line 166, in _perform_munge
    munged_results = munge_sumstats.munge_sumstats(argnames, write_out=False, new_log=False)
  File "/share/CZL/MTAG/mtag-master/mtag_munge.py", line 802, in munge_sumstats
    raise ValueError('Could not find {C} column.'.format(C=c))
ValueError: Could not find SNP column.
Analysis terminated from error at Mon Oct 16 21:29:12 2023
Total time elapsed: 43.09s

I would much appreciate any help with this problem!

JonJala commented 1 year ago

Hmm, what is the first line or so of finngen_adeno_tomtag.txt? (the column headers) Just trying to see why it would be having trouble identifying columns.

caozilongsuper commented 1 year ago

Thanks for replying. Here is the information:

[TCGA@cu03 ~]$ head /share/CZL/pancancer-GWAS/cervix_carcinoma/finngen/finngen_adeno_tomtag.txt
chr,bpos,a1,a2,snpid,beta,se,pval,freq,N,z
1,13668,A,G,rs2691328,0.389358,1.61109,0.809033,0.0058238,167301,0.241673649516787
1,14773,T,C,rs878915777,-1.15586,1.008,0.251513,0.0134509,167301,-1.14668650793651
1,15585,A,G,rs533630043,-1.96825,3.15742,0.53304,0.00109271,167301,-0.623372880389685
1,16549,C,T,rs1262014613,-0.318079,4.5183,0.943877,0.000589333,167301,-0.0703979372772946
1,16567,C,G,rs1194064194,1.34032,1.62093,0.408303,0.00422919,167301,0.826883332407939
1,16792,A,G,rs201330479,1.98142,2.53122,0.433749,0.0012803,167301,0.782792487417135
1,16850,C,T,rs1298041101,-0.768738,5.56921,0.890214,0.000259096,167301,-0.13803358106446
1,16869,T,C,rs1201088959,-2.8915,13.1371,0.825792,8.19235e-05,167301,-0.2201018489621
1,17017,G,C,,8.97029,9.22849,0.33104,9.60228e-05,167301,0.972021424956846

JonJala commented 1 year ago

Ah, I think you may want to try a whitespace-delimited file (rather than commas) as per this portion of the wiki https://github.com/JonJala/mtag/wiki/Tutorial-1:-The-Basics#sample-gwas-results-and-data-format. I'm guessing that's why it's having trouble identifying the columns.
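
One way to do the conversion suggested here is a short Python 3 script using only the standard library. This is a minimal sketch, not part of MTAG: the function name `csv_to_tab` and the `NA` placeholder are assumptions made for illustration. The placeholder matters because a row with an empty field (such as the last row posted above, which has no snpid) would otherwise lose a column once the file is split on whitespace; whether `NA` is the right missing-value marker for your pipeline should be confirmed separately.

```python
import csv


def csv_to_tab(src_path, dst_path, missing="NA"):
    """Rewrite a comma-delimited file as tab-delimited text.

    Empty fields are replaced with `missing` (a hypothetical placeholder)
    so that a reader splitting on whitespace does not shift the remaining
    columns to the left. Returns the number of data rows written.
    """
    rows = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        header = next(reader)  # first line holds the column names
        dst.write("\t".join(header) + "\n")
        for fields in reader:
            # Fill blank fields so every row keeps the same number of columns.
            cleaned = [f.strip() if f.strip() else missing for f in fields]
            dst.write("\t".join(cleaned) + "\n")
            rows += 1
    return rows
```

A call such as `csv_to_tab("finngen_adeno_tomtag.txt", "finngen_adeno_tomtag1.txt")` would write the converted copy next to the original, leaving the original untouched.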

caozilongsuper commented 1 year ago

Thanks for the help; the first error has been solved. But now there is another problem:

Read 20156086 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 1362973 SNPs with out-of-bounds p-values.
Removed 1400697 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
17392416 SNPs remain.
Removed 58485 SNPs with duplicated rs numbers (17333931 SNPs remain).
Removed 0 SNPs with N < 0.0 (17333931 SNPs remain).

ERROR converting summary statistics:

Traceback (most recent call last):
  File "/home/CZL/software/MTAG/mtag-master/mtag_munge.py", line 882, in munge_sumstats
    check_median(dat.SIGNED_SUMSTAT, signed_sumstat_null, args.median_z_cutoff, sign_cname))
  File "/home/CZL/software/MTAG/mtag-master/mtag_munge.py", line 525, in check_median
    raise ValueError(msg.format(F=name, M=expected_median, V=round(m, 2)))
ValueError: WARNING: median value of SIGNED_SUMSTAT is -0.22 (should be close to 0.0). This column may be mislabeled.

JonJala commented 1 year ago

You could try making use of the median_z_cutoff flag to relax the threshold on that check:

https://github.com/JonJala/mtag/blob/master/mtag.py#L1540C4-L1540C4
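
To make the effect of that threshold concrete, here is a minimal sketch of a median check of this kind. It is not MTAG's implementation: the function name `median_within_tolerance` and the default tolerance of 0.1 are assumptions chosen for illustration. The idea is only that the median of the signed statistic is compared with its expected null value, and the run is rejected when the gap exceeds the tolerance.

```python
import statistics


def median_within_tolerance(values, expected=0.0, tolerance=0.1):
    """Return (ok, median); ok is True when |median - expected| <= tolerance.

    `tolerance` plays the role of the cutoff discussed above; its default
    here is a hypothetical value, not one taken from MTAG.
    """
    m = statistics.median(values)
    return abs(m - expected) <= tolerance, m


# Toy Z scores whose median is -0.22, as in the reported error.
z_scores = [-1.15, -0.62, -0.22, -0.14, 0.83]
strict_ok, m = median_within_tolerance(z_scores, tolerance=0.1)   # fails the check
relaxed_ok, _ = median_within_tolerance(z_scores, tolerance=1.0)  # passes the check
```

Raising the tolerance lets the run continue but does not change the data, so a median this far from zero may still be worth a second look.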

caozilongsuper commented 1 year ago

Should I add it to the command like this?

python /home/CZL/software/MTAG/mtag-master/mtag.py \
--sumstats finngen_adeno_tomtag1.txt,finngen_squam_tomtag1.txt,finngen_uteri_tomtag1.txt \
--median_z_cutoff --out /share/CZL/result/cancer/Cervix/finngen_mtag.1NS \
--n_min 0.0 \
--stream_stdout

Thank you in advance!

caozilongsuper commented 1 year ago

Hi JonJala, I have fixed the SIGNED_SUMSTAT problem using --median_z_cutoff, but now there is another one:

Total time elapsed: 3.0m:36.14s
'snpid'
Traceback (most recent call last):
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 1577, in <module>
    mtag(args)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 1343, in mtag
    DATA_U, DATA, args = load_and_merge_data(args)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 273, in load_and_merge_data
    GWAS_d[p], sumstats_format[p] = _perform_munge(args, GWAS_d[p], gwas_dat_gen, p)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 167, in _perform_munge
    GWAS_df = GWAS_df.merge(munged_results, how='inner',left_on =args.snp_name,right_on='SNP',suffixes=('','_ss'))
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/frame.py", line 6868, in merge
    copy=copy, indicator=indicator, validate=validate)
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 47, in merge
    validate=validate)
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 529, in __init__
    self.join_names) = self._get_merge_keys()
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 846, in _get_merge_keys
    left_keys.append(left._get_label_or_level_values(lk))
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/generic.py", line 1706, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'snpid'
Analysis terminated from error at Thu Oct 26 17:05:34 2023
Total time elapsed: 4.0m:40.98s

Sincere thanks in advance!

JonJala commented 1 year ago

Do you mind including the whole log? It might help to see what came before.

caozilongsuper commented 1 year ago

This is the whole log:

<> MTAG: Multi-trait Analysis of GWAS
<> Version: 1.0.8
<> (C) 2017 Omeed Maghzian, Raymond Walters, and Patrick Turley
<> Harvard University Department of Economics / Broad Institute of MIT and Harvard
<> GNU General Public License v3
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<> Note: It is recommended to run your own QC on the input before using this program.
<> Software-related correspondence: jjala.ssgac@gmail.com
<> All other correspondence: paturley@broadinstitute.org
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

Calling ./mtag.py \
--stream-stdout \
--n-min 0.0 \
--median-z-cutoff 1.0 \
--sumstats finngen_adeno_tomtag1.txt,finngen_squam_tomtag1.txt,finngen_uteri_tomtag1.txt \
--out /share/CZL/result/cancer/Cervix/finngen_mtag.1NS

Beginning MTAG analysis...
WARNING: Using non-default median Z score cutoff for QC.
MTAG will use the Z column for analyses.
Read in Trait 1 summary statistics (20156086 SNPs) from finngen_adeno_tomtag1.txt ...
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Munging Trait 1 <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><
<><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Interpreting column names as follows:
N: Sample size
a1: a1, interpreted as ref allele for signed sumstat.
pval: p-Value
a2: a2, interpreted as non-ref allele for signed sumstat.
SNP: Variant ID (e.g., rs number)
z: Directional summary statistic as specified by --signed-sumstats.
se: Standard errors of BETA coefficients

Reading sumstats from provided DataFrame into memory 10000000 SNPs at a time.
WARNING: 327274 SNPs had P outside of (0,1]. The P column may be mislabeled.
WARNING: 879613 SNPs had P outside of (0,1]. The P column may be mislabeled.
WARNING: 156086 SNPs had P outside of (0,1]. The P column may be mislabeled.
Read 20156086 SNPs from --sumstats file.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= None.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with SE <0 or NaN values.
Removed 1362973 SNPs with out-of-bounds p-values.
Removed 1400697 variants that were not SNPs. Note: strand ambiguous SNPs were not dropped.
17392416 SNPs remain.
Removed 58485 SNPs with duplicated rs numbers (17333931 SNPs remain).
Removed 0 SNPs with N < 0.0 (17333931 SNPs remain).
Median value of SIGNED_SUMSTAT was -0.215595440827, which seems sensible.
Dropping snps with null values

Metadata:
Mean chi^2 = 0.93
WARNING: mean chi^2 may be too small.
Lambda GC = 0.86
Max chi^2 = 29.088
1 Genome-wide significant SNPs (some may have been removed by filtering).

Conversion finished at Sat Oct 28 21:07:43 2023
Total time elapsed: 6.0m:53.26s
'snpid'
Traceback (most recent call last):
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 1577, in <module>
    mtag(args)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 1343, in mtag
    DATA_U, DATA, args = load_and_merge_data(args)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 273, in load_and_merge_data
    GWAS_d[p], sumstats_format[p] = _perform_munge(args, GWAS_d[p], gwas_dat_gen, p)
  File "/home/CZL/software/MTAG/mtag-master/mtag.py", line 167, in _perform_munge
    GWAS_df = GWAS_df.merge(munged_results, how='inner',left_on =args.snp_name,right_on='SNP',suffixes=('','_ss'))
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/frame.py", line 6868, in merge
    copy=copy, indicator=indicator, validate=validate)
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 47, in merge
    validate=validate)
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 529, in __init__
    self.join_names) = self._get_merge_keys()
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/reshape/merge.py", line 846, in _get_merge_keys
    left_keys.append(left._get_label_or_level_values(lk))
  File "/home/CZL/.conda/envs/czl/lib/python2.7/site-packages/pandas/core/generic.py", line 1706, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'snpid'
Analysis terminated from error at Sat Oct 28 21:07:44 2023
Total time elapsed: 8.0m:29.31s

Thank you for your invaluable help!

JonJala commented 1 year ago

Hmm, it's unclear what's going on there. Have you switched all your files to be whitespace-delimited? And there is a "snpid" column in each?
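
Both questions can be answered by looking at the first line of each input file. The sketch below is one hedged way to do that in Python 3; the function name `check_header` and the default list of required names are assumptions taken from the header posted earlier in this thread, not a list defined by MTAG. Note that the log above reports the ID column under the label SNP, so it may be worth confirming that the header still contains a column literally named snpid; that is only a guess from the log.

```python
def check_header(path, required=("snpid", "a1", "a2", "z", "N")):
    """Inspect the first line of a summary-statistics file.

    Returns a dict with the column names found by whitespace splitting,
    a flag for headers that still look comma-delimited, and the list of
    `required` names (hypothetical defaults) absent from the header.
    """
    with open(path) as handle:
        header = handle.readline().rstrip("\n")
    columns = header.split()
    return {
        "columns": columns,
        # A single whitespace token containing commas suggests a CSV header.
        "looks_comma_delimited": len(columns) == 1 and "," in header,
        "missing": [name for name in required if name not in columns],
    }
```

Running it over all three files, for example `[check_header(p) for p in paths]` with `paths` holding the three file names, would show at a glance whether any file is still comma-delimited or lacks the expected ID column.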

caozilongsuper commented 1 year ago

I have another question. I checked my input files, and every value in the pval column lies in (0,1]. But I still get these warnings in the log:

WARNING: 327274 SNPs had P outside of (0,1]. The P column may be mislabeled.
WARNING: 879613 SNPs had P outside of (0,1]. The P column may be mislabeled.
WARNING: 156086 SNPs had P outside of (0,1]. The P column may be mislabeled.
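
For reference, a check like the one described can be reproduced outside MTAG with a few lines of Python 3. This is a sketch under stated assumptions: the function name `count_out_of_range_p` is hypothetical, the file is assumed to be whitespace-delimited with a header row, and the column is located by its header name (`pval` here). Because each row is split on whitespace, a row with an empty field would have its values shifted, so the number read at the pval position could belong to a different column.

```python
def count_out_of_range_p(path, column="pval"):
    """Count rows whose `column` value is not a number in (0, 1].

    Assumes a whitespace-delimited file with a header row.
    Returns (total_rows, bad_rows).
    """
    total = bad = 0
    with open(path) as handle:
        header = handle.readline().split()
        idx = header.index(column)  # raises ValueError if the column is absent
        for line in handle:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            total += 1
            try:
                p = float(fields[idx])
            except (IndexError, ValueError):
                bad += 1  # missing or non-numeric entry
                continue
            # NaN also fails this comparison and is counted as out of range.
            if not (0.0 < p <= 1.0):
                bad += 1
    return total, bad
```

Note that a value of exactly 0 falls outside the half-open interval (0,1] and is counted as out of range by this sketch.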

paturley commented 1 year ago

Sometimes MTAG gets confused about what columns are what. Do you have many more than the required columns in your summary statistics file?

caozilongsuper commented 1 year ago

Dear paturley, Many thanks for your quick response to all my questions, it is very helpful! Best, Zilong Cao