HRGV / phyloFlash

phyloFlash - A pipeline to rapidly reconstruct the SSU rRNAs and explore the phylogenetic composition of an Illumina (meta)genomic dataset.
GNU General Public License v3.0

phyloFlash failed at spades step #144

Closed by songweizhi 3 years ago

songweizhi commented 3 years ago

Hi,

Thanks for developing this great tool.

My jobs aborted at the SPAdes step. Could you please help me work out where the problem is? I have attached the log file: MBARC26_phyloFlash.spades.out.txt

BTW, I noticed that only 20 GB of memory is allocated to SPAdes by default. Is it possible to make this setting customizable?

Thanks, Weizhi

kbseah commented 3 years ago

Thanks for reporting this issue. Could you attach the main phyloFlash log file, and the command line that you used?

From the filenames, it looks like the input files were in Fasta format, and SPAdes failed because it could not determine the quality scores. Do you have the original Fastq data for this sequencing library?
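For anyone unsure which format their reads are in, a quick check of the first character of the file is usually enough: Fasta records start with `>` and Fastq records start with `@`. The sketch below is a minimal helper for uncompressed files, not part of phyloFlash; the function name `guess_read_format` is made up for illustration.

```shell
# Guess the read format of an uncompressed file from its first character.
# Hypothetical helper for illustration, not part of phyloFlash or SPAdes.
guess_read_format() {
    file="$1"
    # Read only the first byte of the file
    first=$(head -c 1 "$file")
    case "$first" in
        '>') echo fasta ;;    # Fasta header line
        '@') echo fastq ;;    # Fastq header line
        *)   echo unknown ;;  # empty file or some other format
    esac
}
```

Calling `guess_read_format ../MBARC26_R1.fasta` on the file from this report would show whether quality scores can be present at all, since Fasta carries none.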

songweizhi commented 3 years ago

Thanks a lot for your quick reply.

Yes, you are right, my input reads are in fasta format. I noticed from the manual that phyloFlash takes both fasta and fastq files as input, right?

In addition to making the memory allocated to SPAdes customizable, could you please also add the "--only-assembler" argument to the SPAdes command when the input reads are in fasta format?

Thanks in advance, Weizhi

My command:

export PHYLOFLASH_DBHOME=/srv/scratch/z5039045/Softwares/phyloFlash-pf3.4/138.1
export PATH=/srv/scratch/z5039045/Softwares/phyloFlash-pf3.4:$PATH
phyloFlash.pl -lib MBARC26_phyloFlash -CPUs 12 -almosteverything -read1 ../MBARC26_R1.fasta -read2 ../MBARC26_R2.fasta

MBARC26_phyloFlash.phyloFlash.log

kbseah commented 3 years ago

Good point. The statement about Fasta and Fastq inputs was added a long time ago, but we've done most of our testing with Fastq input, so we hadn't noticed this issue until now.

Thanks for the suggestion on the --only-assembler option in SPAdes. It might take some time until we implement a fix. In the meantime, you could try running the SPAdes command separately. The extracted rRNA reads should be among the output files.
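A rough sketch of what running SPAdes separately might look like is given here. It is not the command that phyloFlash itself issues. The wrapper name `run_spades_only_assembler` and the file names in the usage line are placeholders, so substitute the extracted-read files that phyloFlash actually wrote for your library. The SPAdes options used are `--only-assembler` (skip read error correction, which needs quality scores), `-1`/`-2` (paired reads), `-o` (output directory), `-t` (threads) and `-m` (memory limit in GB).

```shell
# Run SPAdes in assembly-only mode on paired reads.
# Hypothetical wrapper for illustration, not part of phyloFlash.
# Arguments: forward reads, reverse reads, output dir, threads, memory (GB).
run_spades_only_assembler() {
    fwd="$1"; rev="$2"; outdir="$3"
    threads="${4:-12}"   # assumed default, matching -CPUs 12 in the report
    mem_gb="${5:-20}"    # assumed default, matching the 20 GB noted above
    # Build the command as positional parameters so arguments stay separate
    set -- spades.py --only-assembler -1 "$fwd" -2 "$rev" \
        -o "$outdir" -t "$threads" -m "$mem_gb"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"        # print the command instead of running it
    else
        "$@"
    fi
}
```

For example, `run_spades_only_assembler reads_1.fasta reads_2.fasta spades_out 12 64` would request 64 GB instead of 20. Setting `DRY_RUN=1` first prints the command without running it, which is handy for checking the arguments.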

songweizhi commented 3 years ago

Thanks a lot for your suggestion. I do have fastq files on hand, so I will try with them first.

Cheers, Weizhi