Closed MEFarhadieh closed 2 years ago
Hello,
parallel-fastq-dump uses fastq-dump under the hood, and fastq-dump already dumps technical reads by default, so I think you should be able to get what you want just by using the --split-files argument.
To leave out technical reads, you need to use --skip-technical.
If you have an example SRR you are interested in, I can look into it further.
Thank you so much for your quick reply.
I ran parallel-fastq-dump --sra-id SRR11422712 --threads 10 --split-files --gzip
for SRR11422712, which contains 3 reads per spot. However, I got a single fastq.gz file that includes only the biological reads.
I also used prefetch and fastq-dump with the --split-files argument, and again I got one fastq file.
Finally, I ran fasterq-dump ~/SRR11422712/SRR11422712.sra --include-technical --split-files
and got three fastq files that include the sample index, UMIs, and biological reads.
That's weird. I tested with this same SRR and got 3 files: _1 and _2 with technical reads and _3 with biological reads.
Make sure you are using the latest versions of parallel-fastq-dump and sra-tools:
$ parallel-fastq-dump --version
parallel-fastq-dump : 0.6.7
"fastq-dump" version 2.11.0
Also, check the logs printed on the screen while it runs; there might be a warning or error message.
There is neither a warning nor an error, and my parallel-fastq-dump version is 0.6.7. However, I found 2 fastq-dump versions in my $PATH: version 2.11.0 in the Miniconda bin directory and version 3.0.0 in /usr/local/bin. I removed version 3.0.0 and tried again, but got the same result. I will reinstall and reconfigure the packages and report back.
I'm sorry for the trouble, and I appreciate your support.
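A quick way to spot the duplicate-install situation described above is to list every fastq-dump that the shell can find, in lookup order. This is only a minimal sketch: list_on_path is a hypothetical helper name, not part of sra-tools or parallel-fastq-dump, and it assumes bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of sra-tools): print every executable
# named "$1" found in the PATH directories, in lookup order.
# The first path printed is the one the shell would actually run.
list_on_path() {
    local name="$1" dir
    local IFS=':'                      # split PATH on colons
    for dir in $PATH; do
        [ -n "$dir" ] || continue      # skip empty PATH entries
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
}

list_on_path fastq-dump
```

If more than one path is printed, running each one with --version shows which sra-tools release it belongs to.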
try this command:
fastq-dump --split-files --gzip -N 1 -X 1000 SRR11422712
or
fastq-dump --split-files --gzip -N 1 -X 1000 ~/SRR11422712/SRR11422712.sra
parallel-fastq-dump runs fastq-dump commands like this in the background. If you don't get 3 files from this command, there's something wrong with sra-tools.
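To check the result of the test command above without eyeballing the directory, the per-read files can be counted. This is a rough sketch assuming bash and the ACCESSION_N.fastq.gz naming seen in this thread; count_split_files is a hypothetical helper name, not part of sra-tools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the files named <accession>_<N>.fastq.gz
# in a directory (default: the current directory).
count_split_files() {
    local acc="$1" dir="${2:-.}" n=0 f
    for f in "$dir/${acc}"_[0-9]*.fastq.gz; do
        # If nothing matches, the pattern stays literal, so test existence.
        [ -e "$f" ] && n=$((n + 1))
    done
    printf '%s\n' "$n"
}

# Run this in the directory where fastq-dump wrote its output.
count_split_files SRR11422712
```

For SRR11422712, which has 3 reads per spot, a count other than 3 points to the sra-tools problem discussed here.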
After I uninstalled SRA Toolkit 3.0.0 and reinstalled 2.11.0 with conda, fastq-dump --split-files --gzip -N 1 -X 1000 ~/SRR11422712/SRR11422712.sra
returned 3 files correctly. So did parallel-fastq-dump.
I guess the problem was due to sra-tools 3.0.0.
Thank you so much for all your help.
Nice, I'm glad it's working now, but it's weird that version 3.0.0 gives a different result. You might want to follow up on this with the sra-tools authors.
Thanks for this great tool! Can I use the --include-technical flag in the parallel-fastq-dump command, as with fasterq-dump, to make a separate file for the UMI reads of a single-cell SRA run?