jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

01.run_assembly.pl. Program finished abnormally but Canu is apparently running correctly #493

Closed SergioMG97 closed 2 years ago

SergioMG97 commented 2 years ago

Hi! We are using your tool for metagenomic analysis. We have run several assemblies with Flye and everything works perfectly. We have now decided to run some tests with Canu, to compare its results with the Flye ones using MetaQUAST. I have been reading through several solved Canu/SqueezeMeta issues on clusters to review typical user difficulties. I disabled the grid, so the assembler runs properly. However, SqueezeMeta raises an error:

 Running assembly with canu
mv: cannot stat '/mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/data/canu/fullinoculohiacc.contigs.fasta': No such file or directory
Error running command:    /mnt/netapp1/Store_CSIC/home/csic/nmy/vgc/.conda/envs/SqueezeMeta/SqueezeMeta/bin/canu-2.2/bin/canu  -p fullinoculohiacc -d /mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/data/canu genomeSize=5m corOutCoverage=10000 corMhapSensitivity=high corMinCoverage=0  maxThreads=64 maxMemory=99 -p inoculocanu maxThreads=60 maxMemory=99 corMinCoverage=0 corOutCoverage=all corMhapSensitivity=high correctedErrorRate=0.105 genomeSize=5m corMaxEvidenceCoverageLocal=10 corMaxEvidenceCoverageGlobal=10 oeaMemory=32 redMemory=32 batMemory=95 useGrid=false -nanopore-raw  /mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/data/raw_fastq/par1.fastq > /mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/syslog 2>&1;mv /mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/data/canu/fullinoculohiacc.contigs.fasta /mnt/lustre/scratch/nlsas/home/csic/nmy/vgc/dataANDanalysis/assemblies/inoculo/inoculofullseq/fullinoculohiacc/data/canu/contigs.fasta at /mnt/netapp1/Store_CSIC/home/csic/nmy/vgc/.conda/envs/SqueezeMeta/SqueezeMeta/scripts/01.run_assembly.pl line 119.
Stopping in STEP1 -> 01.run_assembly.pl. Program finished abnormally

We always run test_install.pl when we allocate a job on a node with sbatch, to check that everything is working correctly. I have checked the syslog, and Canu apparently completes the assembly without errors and generates the full output structure, which is why the error confuses me. When I checked the files that Canu writes when run through SqueezeMeta, the final assembly file is called inoculocanu.contigs.fasta; all the files in the project/data/canu folder are generated with the prefix "inoculocanu". However, the SqueezeMeta output shows a "No such file or directory" error when 01.run_assembly.pl crashes, and that line is looking for "fullinoculohiacc.contigs.fasta" as the final assembly, I guess. I don't know whether this is what causes the error, because I don't know the code internally. My SqueezeMeta command is:

SqueezeMeta.pl -m sequential \
               -s ./inoculo.samples \
               -f ./samples \
               --minion \
               --canumem 99 \
               --assembly_options "-p inoculocanu maxThreads=60 corMinCoverage=0 corOutCoverage=all corMhapSensitivity=high correctedErrorRate=0.105 genomeSize=5m corMaxEvidenceCoverageLocal=10 corMaxEvidenceCoverageGlobal=10 useGrid=false" \
               -t 64 \
               -b 5 

The resources I am requesting from the cluster for this job are 64 cores and 128 GB of memory. One possibility is a typical memory conflict, but I usually request several GB more to avoid this type of problem. I am attaching a .zip file with the SqueezeMeta output (slurm-194340.out), the associated syslog, and my sbatch script, in case you want to check the commands I ran and their outputs more closely: SqueezeMeta issue.zip

Thanks in advance. Let me know if you need to know anything else.
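The naming mismatch described in this report can be reproduced in isolation. The following is a minimal sketch, not SqueezeMeta's actual code: it uses a scratch directory and an empty placeholder file instead of a real Canu run, and it assumes (as observed in this report) that Canu names its output after the later `-p` value it receives. The variable names `project`, `canu_prefix`, `canu_dir` and `status` are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the failed rename without running Canu.
project="fullinoculohiacc"   # name the pipeline looks for (from the error message)
canu_prefix="inoculocanu"    # prefix passed via --assembly_options "-p ..."

canu_dir=$(mktemp -d)

# Placeholder for Canu's real output, named after the prefix Canu actually used.
touch "$canu_dir/$canu_prefix.contigs.fasta"

# Same rename pattern as in the error message: <project>.contigs.fasta -> contigs.fasta
if mv "$canu_dir/$project.contigs.fasta" "$canu_dir/contigs.fasta" 2>/dev/null; then
    status="renamed"
else
    status="missing"
fi

echo "rename of $project.contigs.fasta: $status"
ls "$canu_dir"
```

The rename fails even though the assembly file exists, because it exists under a different prefix than the one the rename expects.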

jtamames commented 2 years ago

Hello,

Have you tried removing "-p inoculocanu" from assembly_options? This error is likely a collision between two project names: the one you specify in the samples file (which I guess is "fullinoculohiacc") and the one you are passing to Canu (which is "inoculocanu").

Best, J
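As a hedged illustration of this suggestion, the sketch below strips any `-p <prefix>` from the options string reported above, so that Canu keeps the prefix SqueezeMeta passes for the project. The variable names `assembly_options` and `clean_options` are assumptions made for the example; the commented SqueezeMeta.pl call simply repeats the arguments from the original report.

```shell
#!/usr/bin/env bash
# Sketch: remove a user-supplied "-p <prefix>" from the Canu options.
assembly_options="-p inoculocanu maxThreads=60 corMinCoverage=0 corOutCoverage=all corMhapSensitivity=high correctedErrorRate=0.105 genomeSize=5m corMaxEvidenceCoverageLocal=10 corMaxEvidenceCoverageGlobal=10 useGrid=false"

# Delete "-p <word>" wherever it appears as its own option, then trim leading blanks.
# The space required after "-p" avoids touching other options that merely start with -p.
clean_options=$(printf '%s\n' "$assembly_options" | sed -E 's/(^| )-p +[^ ]+//g; s/^ +//')

echo "$clean_options"

# Rerun with the cleaned options (all other arguments as in the original report):
# SqueezeMeta.pl -m sequential -s ./inoculo.samples -f ./samples --minion \
#     --canumem 99 --assembly_options "$clean_options" -t 64 -b 5
```

Editing the options string by hand works equally well; the only point is that no `-p` should reach Canu through --assembly_options.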

SergioMG97 commented 2 years ago

It makes sense, my bad. Now the whole pipeline ran perfectly. Thank you so much, Javier.

Best, Sergio