JonJala / mtag

Python command line tool for Multi-Trait Analysis of GWAS (MTAG)
GNU General Public License v3.0

LDSC installation problems (easily fixed) and SIGMA estimation error #87

Open wsproviero opened 4 years ago

wsproviero commented 4 years ago

Hi, I have cloned the MTAG repository with git and created the conda environment. mtag.py works just fine, but neither ldsc.py nor munge_sumstats.py works at all. Running python ldsc.py -h throws this error:

    Traceback (most recent call last):
      File "ldsc.py", line 14, in <module>
        import ldscore.sumstats as sumstats
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/ldsc_mod/ldscore/sumstats.py", line 15, in <module>
        from ldsc_mod.ldscore import parse as ps
    ImportError: No module named ldsc_mod.ldscore

So I downloaded LDSC separately and replaced the existing ldsc package with the one from the official GitHub page.

After that, the commands run with no problem until mtag throws the following error:

    [.......]
    Conversion finished at Fri Feb 14 16:04:16 2020
    Total time elapsed: 59.44s
    <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
    Munging of Trait 2 complete. SNPs remaining: 7800735
    <><><<>><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

    1202640 strand ambiguous SNPs in Trait 1 are included.
    1200758 strand ambiguous SNPs in Trait 2 are included.
    ...
    Merge of GWAS summary statistics complete. Number of SNPs: 7788953
    Using 6588195 SNPs to estimate Omega (1200758 SNPs excluded due to strand ambiguity)
    Estimating sigma..
    Improperly formatted sumstats file: ('Usecols do not match names.',)
    Traceback (most recent call last):
      File "/Users/[username]/bin/GWAS_META_LD/mtag/mtag.py", line 1567, in <module>
        mtag(args)
      File "/Users/[username]/bin/GWAS_META_LD/mtag/mtag.py", line 1351, in mtag
        args.sigma_hat = estimate_sigma(DATA[not_SA], args)
      File "/Users/[username]/bin/GWAS_META_LD/mtag/mtag.py", line 465, in estimate_sigma
        rg_results_t = sumstats_sig.estimate_rg(args_ldsc_rg, Logger_to_Logging())
      File "/Users/[username]/bin/GWAS_META_LD/mtag/ldsc_mod/ldscore/sumstats.py", line 397, in estimate_rg
        alleles=True, dropna=True)
      File "/Users/[username]/bin/GWAS_META_LD/mtag/ldsc_mod/ldscore/sumstats.py", line 242, in _read_ld_sumstats
        sumstats = _read_sumstats(args, log, fh, alleles=alleles, dropna=dropna)
      File "/Users/[username]/bin/GWAS_META_LD/mtag/ldsc_mod/ldscore/sumstats.py", line 163, in _read_sumstats
        sumstats = ps.sumstats(fh, alleles=alleles, dropna=dropna)
      File "/Users/[username]/bin/GWAS_META_LD/mtag/ldsc_mod/ldscore/parse.py", line 81, in sumstats
        raise ValueError('Improperly formatted sumstats file: ' + str(e.args))
    ValueError: Improperly formatted sumstats file: ('Usecols do not match names.',)
    Analysis terminated from error at Fri Feb 14 16:08:15 2020
    Total time elapsed: 7.0m:0.93s

I have tried to tweak my files several times but the error persists.

Can you help me please? Thank you. William

paturley commented 4 years ago

It looks to me like the software is not able to parse the column names in your summary statistic files. Sometimes it helps to remove any columns from your sum stats files other than those that are required by the MTAG and LDSC software. Does that seem to help?
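The column-trimming suggestion above can be sketched roughly as follows. This is only an illustration in Python 3 using the standard library, not part of MTAG or LDSC. The function name `keep_columns`, the tab delimiter, and the column names in the example are assumptions made for the sketch; the MTAG documentation is the place to check which columns are actually required.

```python
import csv
import io


def keep_columns(src, dst, wanted, delimiter="\t"):
    """Copy only the `wanted` columns from src to dst; return the row count.

    Hypothetical helper: src and dst are open text file objects.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [c for c in wanted if c not in header]
    if missing:
        # Fail early if an expected column is absent from the header.
        raise ValueError("missing columns: " + ", ".join(missing))
    writer = csv.DictWriter(
        dst,
        fieldnames=wanted,
        delimiter=delimiter,
        extrasaction="ignore",  # silently drop every column not in `wanted`
        lineterminator="\n",
    )
    writer.writeheader()
    count = 0
    for row in reader:
        writer.writerow(row)
        count += 1
    return count


# Tiny in-memory example; the column names are placeholders.
raw = (
    "snpid\tchr\tbpos\ta1\ta2\tz\tn\tinfo\n"
    "rs1\t1\t100\tA\tG\t1.5\t1000\t0.9\n"
)
out = io.StringIO()
count = keep_columns(
    io.StringIO(raw), out, ["snpid", "chr", "bpos", "a1", "a2", "z", "n"]
)
```

For real files, the same function could be called with two handles from `open(...)`; writing to a new file rather than overwriting the input keeps the original summary statistics intact.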


wsproviero commented 4 years ago

Thank you for your fast response! Unfortunately, it did not solve the problem. I have reduced the columns to a minimum: chr snpid bpos a2 a1 z p n. I believe it must be a compatibility problem between LDSC version 1.0.0 (which does not seem to work under the conda environment) and LDSC version 1.0.1, which I substituted for it after downloading it from the GitHub repository. As I mentioned, version 1.0.1 works just fine on its own. However, mtag keeps throwing the following message when calculating sigma:

    Estimating sigma..
    'list' object has no attribute 'shape'
    Traceback (most recent call last):
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/mtag.py", line 1567, in <module>
        mtag(args)
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/mtag.py", line 1351, in mtag
        args.sigma_hat = estimate_sigma(DATA[not_SA], args)
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/mtag.py", line 466, in estimate_sigma
        sigma_hat[t,t] = ldsc_matrix_formatter(rg_results_t, '.gencov.intercept')[0]
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/mtag.py", line 416, in ldsc_matrix_formatter
        (nrow, ncol) = result_rg.shape
    AttributeError: 'list' object has no attribute 'shape'
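The AttributeError above indicates that `ldsc_matrix_formatter` received a plain list where it expected an object exposing `.shape` (for example a NumPy array). The sketch below only illustrates that kind of mismatch. `FakeMatrix` and `shape_of` are hypothetical stand-ins written for this note and are not MTAG or LDSC code.

```python
class FakeMatrix(object):
    """Hypothetical stand-in for an array-like result exposing .shape."""

    def __init__(self, nrow, ncol):
        self.shape = (nrow, ncol)


def shape_of(result):
    """Return (nrow, ncol), failing with a clearer message than AttributeError."""
    shape = getattr(result, "shape", None)
    if shape is None:
        raise TypeError(
            "expected an array-like result with .shape, got %s"
            % type(result).__name__
        )
    nrow, ncol = shape
    return nrow, ncol


ok = shape_of(FakeMatrix(2, 2))  # an array-like input unpacks fine

try:
    shape_of([1.0, 2.0])  # a plain list reproduces the mismatch
    list_error = None
except TypeError as exc:
    list_error = str(exc)
```

A check like this does not fix the incompatibility; it only makes the failure say which type arrived, which can help when two versions of a library return differently shaped results.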

The provided version 1.0.0 throws the following error when I run "python ldsc.py -h":

    Traceback (most recent call last):
      File "ldsc.py", line 14, in <module>
        import ldscore.sumstats as sumstats
      File "/Users/[username]/bin/GWAS_META_LD/try/mtag/ldsc_mod/ldscore/sumstats.py", line 15, in <module>
        from ldsc_mod.ldscore import parse as ps
    ImportError: No module named ldsc_mod.ldscore

paturley commented 4 years ago

Sorry I have been a bit slow to respond to this. It's been a busy week here.

The MTAG version of LDSC has a few changes to it relative to the off-the-shelf LDSC, which is probably why they aren't communicating well. Your best bet might be to figure out how to get the MTAG version of LDSC working. Not totally sure what is going wrong. Let me try a few things out as well and I'll get back to you as soon as I can.

Patrick
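One possible reading of the ImportError quoted earlier in the thread, not confirmed by anyone here: sumstats.py imports `ldsc_mod.ldscore`, which can only resolve when the directory that contains `ldsc_mod` is on Python's module search path. Running `python ldsc.py` from inside `ldsc_mod` puts `ldsc_mod` itself on the path instead. The Python 3 sketch below reproduces that behaviour on a throwaway directory tree. The helper `importable_from` is hypothetical, and the tree only mimics the layout seen in the tracebacks.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def importable_from(path_entry, dotted_name):
    """Report whether dotted_name resolves with path_entry first on sys.path."""
    saved_path = list(sys.path)
    saved_modules = set(sys.modules)
    sys.path.insert(0, path_entry)
    importlib.invalidate_caches()
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ImportError:
        # Raised when a parent package (here ldsc_mod) cannot be imported.
        return False
    finally:
        # Restore the interpreter state so the two checks do not interfere.
        sys.path[:] = saved_path
        for name in set(sys.modules) - saved_modules:
            del sys.modules[name]


# Build a throwaway tree that mimics <repo>/ldsc_mod/ldscore/.
repo = tempfile.mkdtemp()
pkg_dir = os.path.join(repo, "ldsc_mod")
sub_dir = os.path.join(pkg_dir, "ldscore")
os.makedirs(sub_dir)
for directory in (pkg_dir, sub_dir):
    open(os.path.join(directory, "__init__.py"), "w").close()

# Search path as seen when a script is launched from inside ldsc_mod.
from_inside = importable_from(pkg_dir, "ldsc_mod.ldscore")
# Search path as seen when the repository root is on the path.
from_root = importable_from(repo, "ldsc_mod.ldscore")
```

If this is what is happening, launching from the repository root or adding the root to PYTHONPATH would be the workaround to try. Whether that resolves the problem in an actual MTAG checkout is untested here.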
