rvalieris / parallel-fastq-dump

parallel fastq-dump wrapper
MIT License

how to deal with downloaded sra file #17

Closed zyxplus closed 5 years ago

zyxplus commented 5 years ago

"parallel-fastq-dump --sra-id SRR1219899 --threads 4 --outdir out/ --split-files --gzip": does this command line combine [prefetch] and [fastq-dump]? I have already downloaded the sra file, though. How do I use it?

rvalieris commented 5 years ago

when you use prefetch or fastq-dump, the sra file is downloaded to ~/ncbi/public/sra/. if the file is there, it will be used the next time you call fastq-dump. the same is true for parallel-fastq-dump because it uses fastq-dump internally.

if you downloaded the file somewhere else you need to pass the path to the file as the argument.
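A minimal sketch of that choice, assuming the file was saved as `<id>.sra` directly inside a download directory. The helper name `resolve_sra_input` and the layout are illustrative assumptions, not part of parallel-fastq-dump: it prints the path when the file exists there and falls back to the bare id otherwise, so the default lookup described above still applies.

```shell
# Hypothetical helper: decide what to pass as the --sra-id argument.
# $1 = SRA id, $2 = directory the .sra file may have been downloaded to
#      (defaults to the location mentioned above).
resolve_sra_input() {
    local id="$1"
    local dir="${2:-$HOME/ncbi/public/sra}"
    if [ -f "$dir/$id.sra" ]; then
        # file found: pass the explicit path
        printf '%s\n' "$dir/$id.sra"
    else
        # not found: pass the id and let fastq-dump resolve it
        printf '%s\n' "$id"
    fi
}
```

It could then be used as `parallel-fastq-dump --sra-id "$(resolve_sra_input SRR1219899 /data/sra)" --threads 4 --outdir out/ --split-files --gzip`, where `/data/sra` is a placeholder directory.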

timedreamer commented 5 years ago

The new version of prefetch (prefetch : 2.9.1 ( 2.9.1-1 )) now lets you specify an output directory for SRA files. I think it would be good if parallel-fastq-dump could also take an input directory for downloaded SRA files. Thanks!

rvalieris commented 5 years ago

hello @timedreamer, the arguments you give to parallel-fastq-dump are just re-passed to fastq-dump, so you can give it a path like /path/to/file.sra instead of the SRA id.

if you have many sra files in a directory you can do this: parallel-fastq-dump -s /path/to/*.sra
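One thing to watch with the glob: in most shells an unmatched `*.sra` is passed through literally. A small sketch that guards against that, assuming a hypothetical wrapper `dump_dir` and a `PFD` variable (both introduced here only for illustration, so the real command can be swapped for a stub when trying it out):

```shell
# PFD is an illustrative override; by default the real tool is called.
PFD="${PFD:-parallel-fastq-dump}"

# Hypothetical wrapper: $1 = directory with .sra files, $2 = threads.
dump_dir() {
    local dir="$1"
    local threads="$2"
    # expand the glob into the positional parameters
    set -- "$dir"/*.sra
    # an unmatched glob stays literal (or expands to nothing), so test the first entry
    if [ ! -e "$1" ]; then
        echo "no .sra files found in $dir" >&2
        return 1
    fi
    # same form as above: -s followed by every matched file
    "$PFD" --threads "$threads" --outdir out/ --split-files --gzip -s "$@"
}
```

For example, `dump_dir /path/to 4` would run the command above with four threads on every `.sra` file in `/path/to`.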

timedreamer commented 5 years ago

Awesome!

anwarMZ commented 4 years ago

Hi, I just wanted to ask: in this case, parallel-fastq-dump -s /path/to/*.sra, how does the -t argument work? Are multiple files processed simultaneously on different threads, or is it still one file at a time, with each file processed in parallel?

rvalieris commented 4 years ago

@anwarMZ, it always processes a single SRA file at a time, in parallel according to -t. if you have a bunch of small SRA files it might make sense to process multiple SRA files simultaneously; you can use GNU parallel for that.
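A rough sketch of the GNU parallel approach, not a tested recipe from the author. The function name `dump_many` and the `PFD` override are assumptions made for illustration; `-j` sets how many files are converted at once, `-k` keeps output in input order, and each job still gets its own `--threads`. If GNU parallel is not installed, the sketch falls back to one file at a time.

```shell
# PFD is an illustrative override; by default the real tool is called.
PFD="${PFD:-parallel-fastq-dump}"

# Hypothetical helper: $1 = files converted at once, $2 = threads per file,
# remaining arguments = .sra paths.
dump_many() {
    local jobs="$1"
    local threads="$2"
    shift 2
    if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
        # GNU parallel: up to $jobs conversions at once, one file ({}) per job
        parallel -k -j "$jobs" "$PFD" --sra-id {} --threads "$threads" \
            --outdir out/ --split-files --gzip ::: "$@"
    else
        # fallback: plain loop, one file at a time
        local f
        for f in "$@"; do
            "$PFD" --sra-id "$f" --threads "$threads" \
                --outdir out/ --split-files --gzip
        done
    fi
}
```

For example, `dump_many 2 4 /path/to/*.sra` would convert two files at a time with four threads each, so `jobs × threads` is worth keeping at or below the number of available cores.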