bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

IndexError: list index out of range #366

Open cwnag-c opened 1 year ago

cwnag-c commented 1 year ago

Getting the following error. Any help would be appreciated! Thanks.

Beginning analysis at Mon Dec 5 03:18:36 2022
Reading summary statistics from pgc.sumstats.gz ...
Read summary statistics for 1086749 SNPs.
Reading reference panel LD Score from eur_w_ld_chr[1-22] ... (ldscore_fromlist)
Traceback (most recent call last):
  File "/content/ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "/content/ldsc/ldscore/sumstats.py", line 243, in _read_ld_sumstats
    ref_ld = _read_ref_ld(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 82, in _read_ref_ld
    'reference panel LD Score', ps.ldscore_fromlist)
  File "/content/ldsc/ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "/content/ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "/content/ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

Analysis finished at Mon Dec 5 03:18:38 2022
Total time elapsed: 1.45s
Traceback (most recent call last):
  File "/content/ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "/content/ldsc/ldscore/sumstats.py", line 243, in _read_ld_sumstats
    ref_ld = _read_ref_ld(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 82, in _read_ref_ld
    'reference panel LD Score', ps.ldscore_fromlist)
  File "/content/ldsc/ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "/content/ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "/content/ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

tabpeter commented 1 year ago

I am getting the same error message when I run this command with my data --

jjandshi commented 1 year ago

ldsc.py worked fine on the same input file a week ago, but now I get the same error. Here are the command and the log.

./ldsc.py \
  --h2 mydata.sumstats.gz \
  --ref-ld-chr eur_w_ld_chr/ \
  --out Try_h2 \
  --w-ld-chr eur_w_ld_chr/

Beginning analysis at Fri Apr 7 23:13:58 2023
Reading summary statistics from mydata.sumstats.gz ...
Read summary statistics for 1183117 SNPs.
Reading reference panel LD Score from eur_w_ld_chr/[1-22] ... (ldscore_fromlist)
Traceback (most recent call last):
  File "ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "//ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "//ldsc/ldscore/sumstats.py", line 243, in _read_ld_sumstats
    ref_ld = _read_ref_ld(args, log)
  File "//ldsc/ldscore/sumstats.py", line 82, in _read_ref_ld
    'reference panel LD Score', ps.ldscore_fromlist)
  File "//ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "//ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "//ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

jjandshi commented 1 year ago

Solved by downloading the reference datasets and snp.list from https://ibg.colorado.edu/cdrom2021/Day06-nivard/GenomicSEM_practical/eur_w_ld_chr/. After that, ldsc works again.

aydanasg commented 1 year ago

This could also be due to a wrong path to the annotation file.

CaoLuolong commented 1 year ago

Getting the following error. Any help would be appreciated! Thanks.

  • LD Score Regression (LDSC)
  • Version 1.0.1
  • (C) 2014-2019 Brendan Bulik-Sullivan and Hilary Finucane
  • Broad Institute of MIT and Harvard / MIT Department of Mathematics
  • GNU General Public License v3

Call: ./ldsc.py --h2 pgc.sumstats.gz --ref-ld-chr eur_w_ld_chr --out pgc_h2 --w-ld-chr eur_w_ld_chr

Beginning analysis at Mon Dec 5 03:18:36 2022
Reading summary statistics from pgc.sumstats.gz ...
Read summary statistics for 1086749 SNPs.
Reading reference panel LD Score from eur_w_ld_chr[1-22] ... (ldscore_fromlist)
Traceback (most recent call last):
  File "/content/ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "/content/ldsc/ldscore/sumstats.py", line 243, in _read_ld_sumstats
    ref_ld = _read_ref_ld(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 82, in _read_ref_ld
    'reference panel LD Score', ps.ldscore_fromlist)
  File "/content/ldsc/ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "/content/ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "/content/ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

Analysis finished at Mon Dec 5 03:18:38 2022
Total time elapsed: 1.45s
Traceback (most recent call last):
  File "/content/ldsc/ldsc.py", line 644, in <module>
    sumstats.estimate_h2(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 326, in estimate_h2
    args, log, args.h2)
  File "/content/ldsc/ldscore/sumstats.py", line 243, in _read_ld_sumstats
    ref_ld = _read_ref_ld(args, log)
  File "/content/ldsc/ldscore/sumstats.py", line 82, in _read_ref_ld
    'reference panel LD Score', ps.ldscore_fromlist)
  File "/content/ldsc/ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "/content/ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "/content/ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

CaoLuolong commented 1 year ago

I think this is because the path passed to "--ref-ld-chr" is missing a trailing "/". You can try this instead:

./ldsc.py --h2 pgc.sumstats.gz --ref-ld-chr eur_w_ld_chr/ --w-ld-chr eur_w_ld_chr/ --out pgc_h2
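Why the trailing "/" matters can be illustrated with a short sketch. It mirrors what the logs in this thread show: the prefix "eur_w_ld_chr" is reported as "eur_w_ld_chr[1-22]", while "eur_w_ld_chr/" is reported as "eur_w_ld_chr/[1-22]". The helper name `expand_prefix` is hypothetical, and the substitution rule is an assumption inferred from those log lines and from the "@" placeholder used in the .ldcts example, not a copy of the ldsc source.

```python
def expand_prefix(prefix, chrom):
    """Hypothetical helper: build the per-chromosome file stem from a prefix."""
    # Assumption: an '@' marks where the chromosome number goes; without one,
    # the number is appended directly to the end of the prefix.
    if "@" not in prefix:
        prefix += "@"
    return prefix.replace("@", str(chrom))


# Compare the two ways of writing the same directory on the command line.
for prefix in ("eur_w_ld_chr", "eur_w_ld_chr/"):
    print(repr(prefix), "->", expand_prefix(prefix, 1))
```

Without the slash, the chromosome number is glued onto the directory name, so the program would look for files next to the directory rather than inside it.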

whippyhorse commented 1 year ago

I had the exact same error with the same string of line numbers and functions. It took me all morning today, but I eventually figured out that this error occurs simply if the file path for the LD scores is wrong. Any typos or mistakes in the file path will result in this error.
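A quick way to test a prefix before running ldsc.py is to check which per-chromosome files it actually matches on disk. The following is a minimal standalone sketch in Python 3, not part of ldsc. The function name `present_chromosomes` is hypothetical, and the ".l2.ldscore" suffix and the 22-chromosome range are assumptions based on the file naming of the downloadable LD Score panels and the "[1-22]" in the logs.

```python
import glob


def present_chromosomes(prefix, n_chr=22):
    """Return the chromosomes for which an LD Score file matches the prefix."""
    found = []
    for chrom in range(1, n_chr + 1):
        # Assumption: '@' is replaced by the chromosome number, otherwise the
        # number is appended to the prefix.
        if "@" in prefix:
            stem = prefix.replace("@", str(chrom))
        else:
            stem = prefix + str(chrom)
        # Assumption: files are named <stem>.l2.ldscore, optionally compressed.
        if glob.glob(glob.escape(stem) + ".l2.ldscore*"):
            found.append(chrom)
    return found


chroms = present_chromosomes("eur_w_ld_chr/")
if chroms:
    print("Found LD Score files for chromosomes:", chroms)
else:
    print("No LD Score files match this prefix; check the path for typos.")
```

An empty result here corresponds to the situation in which the traceback above ends in `chrs[0]` raising IndexError.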

If the software authors read this: please give a more informative error message for this very common situation. I had to spend several hours of debugging to figure this out.

CaoLuolong commented 1 year ago

This is an automatic vacation reply from QQ Mail. Hello, I have received your email and will reply as soon as possible.

mmyc3 commented 1 year ago

Hi - many thanks for this really useful software. I am trying to perform a cell-type specific analysis using ATAC-seq peaks but also getting this error (see below). I have triple checked my file paths but still no luck.


Beginning analysis at Wed Nov 1 16:28:48 2023
Reading summary statistics from /home/mmchan/ldsc/sperm_1000g.v3.sumstats.gz ...
Read summary statistics for 745571 SNPs.
Reading reference panel LD Score from /home/mmchan/ldsc/1000_genomes/baselineLD.[1-22] ... (ldscore_fromlist)
Read reference panel LD Scores for 1190321 SNPs.
Removing partitioned LD Scores with zero variance.
Reading regression weight LD Score from /home/mmchan/ldsc/1000_genomes/1000G_Phase3_weights_hm3_no_MHC/weights.hm3_noMHC.[1-22] ... (ldscore_fromlist)
Read regression weight LD Scores for 1187349 SNPs.
After merging with reference panel LD, 738750 SNPs remain.
After merging with regression SNP LD, 736666 SNPs remain.
Removed 0 SNPs with chi^2 > 123.793 (736666 SNPs remain)
Reading cts reference panel LD Score from ./0_Peritubular_myoid_cells.[1-22].filtered.,./LDscore. ... (ldscore_fromlist)
Traceback (most recent call last):
  File "/home/mmchan/ldsc/ldsc.py", line 646, in <module>
    sumstats.cell_type_specific(args, log)
  File "/home/mmchan/ldsc/ldscore/sumstats.py", line 288, in cell_type_specific
    'cts reference panel LD Score', ps.ldscore_fromlist)
  File "/home/mmchan/ldsc/ldscore/sumstats.py", line 152, in _read_chr_split_files
    out = parsefunc(_splitp(chr_arg), _N_CHR, **kwargs)
  File "/home/mmchan/ldsc/ldscore/parse.py", line 103, in ldscore_fromlist
    y = ldscore(fh, num)
  File "/home/mmchan/ldsc/ldscore/parse.py", line 147, in ldscore
    first_fh = sub_chr(fh, chrs[0]) + suffix
IndexError: list index out of range

The format of the *.ldcts file is:

Celltype1 Celltype1.@.filtered.,LDscore.
Celltype2 Celltype2.@.filtered.,LDscore.
Celltype3 Celltype3.@.filtered.,LDscore.

Any advice would be much appreciated! Many thanks, Melanie
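For a .ldcts file like the one above, it can help to print the file names each prefix would expand to and compare them with what is on disk. This is a rough sketch under stated assumptions: each line holds a name and a comma-separated list of prefixes, "@" is replaced by the chromosome number (or the number is appended when there is no "@"), and the suffix ".l2.ldscore" is then added directly to the result. The helpers `parse_ldcts` and `expected_stem` are hypothetical and are not ldsc functions.

```python
LD_SUFFIX = ".l2.ldscore"  # assumption: suffix appended to each expanded prefix


def parse_ldcts(text):
    """Return {cell type name: [prefixes]} from .ldcts text, one entry per line."""
    entries = {}
    for line in text.splitlines():
        if line.strip():
            name, prefixes = line.split()
            entries[name] = prefixes.split(",")
    return entries


def expected_stem(prefix, chrom):
    """Hypothetical helper: expected file name, without any .gz/.bz2 extension."""
    if "@" in prefix:
        stem = prefix.replace("@", str(chrom))
    else:
        stem = prefix + str(chrom)
    return stem + LD_SUFFIX


ldcts = "Celltype1 Celltype1.@.filtered.,LDscore.\n"
for name, prefixes in parse_ldcts(ldcts).items():
    for prefix in prefixes:
        print(name, "->", expected_stem(prefix, 1))
```

If the suffix assumption holds, a prefix that already ends in "." expands to a name with a doubled dot (for example "Celltype1.1.filtered..l2.ldscore"), so it may be worth checking whether the files on disk are really named that way.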

whippyhorse commented 1 year ago

Adding to my comment from above: this error will occur in any situation where the program is not able to figure out the list of chromosomes in the LD score files. This will happen if the file path is wrong and it will also happen if the compression is not correct. I initially used .gz files for the LD score files but I could not get it to work. I think that the issue may be that some of the LD score files that the authors have available for download have a mismatch between the file extension and the file format. That is, they have a file extension of .gz or .tar but are not really compressed. I eventually got it to work by uncompressing all of the input files. You also have to pay close attention to both the file names and how you specify them in your call to ldsc.py. It is looking for a very specific format in the naming.
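A mismatch between a file's extension and its real format can be detected by looking at the first bytes of the file: gzip data starts with the bytes 1f 8b and bzip2 data starts with "BZh". The sketch below is a standalone Python 3 check, not part of ldsc. The function names are hypothetical, and the glob pattern assumes the "eur_w_ld_chr/" directory and ".l2.ldscore" naming mentioned earlier in this thread.

```python
import glob
import os


def sniff_compression(path):
    """Guess the compression of a file from its leading magic bytes."""
    with open(path, "rb") as fh:
        head = fh.read(3)
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    if head == b"BZh":
        return "bzip2"
    return "none"


def extension_agrees(path):
    """True if the file extension matches the detected compression."""
    by_extension = {".gz": "gzip", ".bz2": "bzip2"}
    expected = by_extension.get(os.path.splitext(path)[1], "none")
    return sniff_compression(path) == expected


# Report any LD Score file whose extension does not match its content.
for path in sorted(glob.glob("eur_w_ld_chr/*.l2.ldscore*")):
    if not extension_agrees(path):
        print("Extension/content mismatch:", path, "is", sniff_compression(path))
```

Files flagged by this check could be renamed to drop the misleading extension, or recompressed, before being passed to ldsc.py.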