rvalieris / parallel-fastq-dump

parallel fastq-dump wrapper
MIT License

IndexError: list index out of range #36

Closed: nick-youngblut closed this issue 3 years ago

nick-youngblut commented 3 years ago
$ parallel-fastq-dump -s ERS1444621 --outdir pfd_output --tmpdir pfd_tmp
SRR ids: ['ERS1444621']
extra args: []
tempdir: pfd_tmp/pfd___5awz6o
2021-04-11T06:28:36 sra-stat.2.8.2 int: directory not found while opening manager within virtual file system module - 'ERS1444621'
Traceback (most recent call last):
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 119, in <module>
    main()
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 112, in main
    pfd(args, si, extra_args)
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 15, in pfd
    n_spots = get_spot_count(srr_id)
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 65, in get_spot_count
    total += int(l.split("|")[2].split(":")[0])
IndexError: list index out of range

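sra-stat fails and writes nothing to stdout, so the parsed output from subprocess.Popen(["sra-stat", "--meta", "--quick", sra_id], stdout=subprocess.PIPE) is just ['']. Splitting that empty string on "|" yields a single element, so indexing [2] raises the IndexError.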
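
A minimal sketch of how the parsing could be guarded so that empty sra-stat output produces a readable error instead of an IndexError. This is not the project's actual fix. The sra-stat command and the field indexing are taken from the traceback above; the helper name parse_spot_count, the error messages, and the sample lines in the usage note are assumptions for illustration.

```python
import subprocess


def parse_spot_count(text):
    """Sum spot counts from `sra-stat --meta --quick` output.

    Mirrors the indexing in the traceback: the third "|"-separated
    field, up to its first ":", is read as the spot count of a line.
    (parse_spot_count is a hypothetical helper, not part of the tool.)
    """
    total = 0
    parsed_any = False
    for line in text.splitlines():
        fields = line.split("|")
        if len(fields) < 3:
            # Empty or unexpected line, e.g. '' when sra-stat printed nothing.
            continue
        total += int(fields[2].split(":")[0])
        parsed_any = True
    if not parsed_any:
        raise ValueError("no spot counts found in sra-stat output")
    return total


def get_spot_count(sra_id):
    # Same command as in the report; stderr is left attached to the
    # terminal so sra-stat's own message (e.g. the 403) stays visible.
    proc = subprocess.run(
        ["sra-stat", "--meta", "--quick", sra_id],
        stdout=subprocess.PIPE,
    )
    try:
        return parse_spot_count(proc.stdout.decode())
    except ValueError:
        # Assumed wording; the point is to fail with a clear message.
        raise SystemExit(
            f"sra-stat returned no usable output for {sra_id!r}; "
            "check the accession and your access permissions"
        )
```

With made-up input such as "RUN1|A|100:200:300\nRUN1|B|50:100:100", parse_spot_count sums the first number of the third field on each line, while an empty string raises ValueError rather than IndexError.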

nick-youngblut commented 3 years ago

I'm getting a similar error for the README example:

$ parallel-fastq-dump --sra-id SRR1219899 --threads 4 --outdir out/ --split-files --gzip
SRR ids: ['SRR1219899']
extra args: ['--split-files', '--gzip']
tempdir: /tmp/pfd_u9jt71li
2021-04-11T06:57:20 sra-stat.2.8.2 err: query unauthorized while resolving query within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:21 sra-stat.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:21 sra-stat.2.8.2 err: query unauthorized while resolving query within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:22 sra-stat.2.8.2 err: query unauthorized while resolving query within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:22 sra-stat.2.8.2 err: query unauthorized while resolving tree within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:23 sra-stat.2.8.2 err: query unauthorized while resolving query within virtual file system module - failed to resolve accession 'SRR1219899' - Access denied - please request permission to access phs000710/UR in dbGaP ( 403 )
2021-04-11T06:57:23 sra-stat.2.8.2 int: directory not found while opening manager within virtual file system module - 'SRR1219899'
Traceback (most recent call last):
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 121, in <module>
    main()
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 114, in main
    pfd(args, si, extra_args)
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 16, in pfd
    n_spots = get_spot_count(srr_id)
  File "/ebio/abt3_projects/software/dev/ll_pipelines/llmgqc/.snakemake/conda/af0470e0f4e49441f0a6d5af028c9398/bin/parallel-fastq-dump", line 67, in get_spot_count
    total += int(l.split("|")[2].split(":")[0])
IndexError: list index out of range

Edit: I realized that it's a permissions issue for that particular accession (as described in https://github.com/rvalieris/parallel-fastq-dump/issues/23). Maybe it would be helpful to change the example to an accession that doesn't require special permissions (e.g., SRR2244401)?

rvalieris commented 3 years ago

Hi, yeah, this seems to be a common source of confusion. Changing it to an open dataset seems like a good idea.