Hydro3639 / NanoPhase

Reference-quality genome reconstruction from complex metagenomes (or bacterial isolates) using only Nanopore long reads or both long and short reads (hybrid strategy)
MIT License

Issue with metabat2 #13

Open starkeynight opened 1 month ago

starkeynight commented 1 month ago

Hello, I am attempting to run the test data, and it fails at the metabat2 stage, with the following message:

nanophase meta -l mnt/databases/db/fastq/SRR17913199.fastq -t 24 -o nanophase-out
[2024-10-10 16:13:01] INFO: nanophase (meta) starts
[2024-10-10 16:13:01] INFO: Command line: /miniconda/envs/nanophase/bin/nanophase meta -l mnt/databases/db/fastq/SRR17913199.fastq -t 24 -o nanophase-out
[2024-10-10 16:13:01] INFO: long_read_only model was selected, only Nanopore long reads will be used
[2024-10-10 16:13:01] CHECK: Nanopore long-read (fastq) file has been found
[2024-10-10 16:13:01] CHECK: Check software availability and locations
[2024-10-10 16:13:01] INFO: The following packages have been found
#package             location
nanophase            /miniconda/envs/nanophase/bin/nanophase
flye                 /miniconda/envs/nanophase/bin/flye
metabat2             /miniconda/envs/nanophase/bin/metabat2
maxbin2              /miniconda/envs/nanophase/bin/run_MaxBin.pl
SemiBin              /miniconda/envs/nanophase/bin/SemiBin
metawrap             /miniconda/envs/nanophase/bin/metawrap
checkm               /miniconda/envs/nanophase/bin/checkm
racon                /miniconda/envs/nanophase/bin/racon
medaka               /miniconda/envs/nanophase/bin/medaka
polypolish           /miniconda/envs/nanophase/bin/polypolish
POLCA                /miniconda/envs/nanophase/bin/polca.sh
bwa                  /miniconda/envs/nanophase/bin/bwa
seqtk                /miniconda/envs/nanophase/bin/seqtk
minimap2             /miniconda/envs/nanophase/bin/minimap2
BBMap                /miniconda/envs/nanophase/bin/BBMap
parallel             /miniconda/envs/nanophase/bin/parallel
perl                 /miniconda/envs/nanophase/bin/perl
samtools             /miniconda/envs/nanophase/bin/samtools
gtdbtk               /miniconda/envs/nanophase/bin/gtdbtk
fastANI              /miniconda/envs/nanophase/bin/fastANI
All required packages have been found in the environment. If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :)
[2024-10-10 16:13:01] INFO: long-read assembly has been found in the folder: nanophase-out/01-LongAssemblies/. Now go to the next stage: generating LongBins...
        Note: please ensure flye assembly finished successfully in the previous run, if not, please remove this folder using the command 'rm -rf nanophase-out/nanophase-out/01-LongAssemblies/' and re-run nanophase command
[2024-10-10 16:13:01] INFO: metabat2 binning re-starts
/miniconda/envs/nanophase/bin/nanophase.meta: line 251: 11348 Segmentation fault      metabat2 -t $N_threads -i $OutDIR/01-LongAssemblies/assembly.fasta -o $OutDIR/02-LongBins/INITIAL_BINNING/metabat2/metabat2-bins/bin -a $OutDIR/02-LongBins/INITIAL_BINNING/metabat2/metabat2_abun.txt --cvExt > $OutDIR/02-LongBins/INITIAL_BINNING/metabat2/bin.log
[2024-10-10 16:13:02] ERROR: Something wrong with metabat2 binning, please also check nanophase-out/02-LongBins/INITIAL_BINNING/metabat2/bin.log, terminating...

The assembly outputs seem fine. I have tried several different thread counts, without success. Any advice would be greatly appreciated!
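Before re-running, it can help to confirm that the two files read by the failing metabat2 call exist and are non-empty. The sketch below is only an illustration: `check_metabat2_inputs` is a hypothetical helper, not part of nanophase, and the paths are taken from the log above with `$OutDIR` assumed to be `nanophase-out`. It does not explain the segmentation fault; it only rules out missing or empty inputs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nanophase): sanity-check the two input
# files that the failing metabat2 command in the log reads.
check_metabat2_inputs() {
    local outdir="$1"
    local assembly="$outdir/01-LongAssemblies/assembly.fasta"
    local abun="$outdir/02-LongBins/INITIAL_BINNING/metabat2/metabat2_abun.txt"
    local status=0
    local f
    for f in "$assembly" "$abun"; do
        if [ -s "$f" ]; then
            # -s is true only when the file exists and is larger than 0 bytes
            echo "OK: $f ($(wc -c < "$f") bytes)"
        else
            echo "MISSING_OR_EMPTY: $f"
            status=1
        fi
    done
    if [ -s "$assembly" ]; then
        # Each FASTA record starts with '>', so this counts contigs
        echo "contigs: $(grep -c '^>' "$assembly")"
    fi
    return "$status"
}

# Output directory assumed from the -o argument in the log above
check_metabat2_inputs "${1:-nanophase-out}" \
    || echo "Fix the inputs listed above before re-running metabat2."
```

If both files look fine, the metabat2 command shown in the log can be run by hand with `$N_threads` and `$OutDIR` filled in, which separates a metabat2 problem from a problem in the wrapper script.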

Hydro3639 commented 1 month ago

Thanks for trying nanophase.

This might be related to a memory issue. Could you provide the memory specifications of the server you are currently running it on, as well as the operating system?
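For reference, the details requested here can be gathered with a few standard Linux commands. This is a minimal sketch, assuming it is run inside the same environment where nanophase runs (for example, inside the container); `report_specs` is a name made up for this example, and the cgroup paths are only checked if they happen to exist.

```shell
#!/usr/bin/env bash
# Illustrative helper: print OS, kernel, CPU count, and memory information.
report_specs() {
    echo "kernel: $(uname -srm)"
    # /etc/os-release exists on most modern Linux distributions;
    # source it in a subshell so its variables do not leak out
    if [ -r /etc/os-release ]; then
        ( . /etc/os-release; echo "os: ${PRETTY_NAME:-unknown}" )
    fi
    if command -v nproc >/dev/null 2>&1; then
        echo "cpus: $(nproc)"
    fi
    # /proc/meminfo reports values in kB
    if [ -r /proc/meminfo ]; then
        grep -E '^(MemTotal|MemAvailable|SwapTotal):' /proc/meminfo
    fi
    # Memory limit applied to the container, if visible
    # (the path differs between cgroup v2 and cgroup v1)
    local f
    for f in /sys/fs/cgroup/memory.max /sys/fs/cgroup/memory/memory.limit_in_bytes; do
        if [ -r "$f" ]; then
            echo "cgroup_limit ($f): $(cat "$f")"
        fi
    done
    return 0
}

report_specs
```

The `MemTotal` value seen inside a container can differ from the host's installed RAM, so it is worth reporting the numbers from inside the container itself.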

starkeynight commented 1 month ago

I am attempting to run it in a Docker container on a Windows PC. The container is set up as a Linux system, and it has 32 GB of RAM.

Hydro3639 commented 1 month ago

That might be the issue. You can refer to this link; hopefully it will help! :)

I'm not sure what type of sample you will work with, but generally, 32GB of memory might not be sufficient for handling complex metagenomic sequencing datasets.