Hydro3639 / NanoPhase

Reference-quality genome reconstruction from complex metagenomes (or bacterial isolates) using only Nanopore long reads or both long and short reads (hybrid strategy)
MIT License

semibin/semibin.log, terminating #8

Closed kubradilek closed 1 year ago

kubradilek commented 1 year ago

Hi,

I tried to run the tool on the dataset from the tutorial, but I got the error below. Could you help me?

[2023-08-04 13:06:06] INFO: nanophase (meta) starts
[2023-08-04 13:06:06] INFO: Command line: /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/nanophase meta -l lr.fa.gz -t 40 -o ont-nanophase-out
[2023-08-04 13:06:06] INFO: long_read_only model was selected, only Nanopore long reads will be used
[2023-08-04 13:06:06] CHECK: Nanopore long-read (fa.gz) file has been found
[2023-08-04 13:06:06] CHECK: Check software availability and locations
[2023-08-04 13:06:07] INFO: The following packages have been found

package      location
nanophase    /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/nanophase
flye         /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/flye
metabat2     /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/metabat2
maxbin2      /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/run_MaxBin.pl
SemiBin      /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/SemiBin
metawrap     /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/metawrap
checkm       /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/checkm
racon        /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/racon
medaka       /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/medaka
polypolish   /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/polypolish
POLCA        /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/polca.sh
bwa          /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/bwa
seqtk        /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/seqtk
minimap2     /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/minimap2
BBMap        /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/BBMap
parallel     /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/parallel
perl         /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/perl
samtools     /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/samtools
gtdbtk       /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/gtdbtk
fastANI      /gpfs01/home/alykb3/.conda/envs/nanophase_env/bin/fastANI

All required packages have been found in the environment. If the above certain packages integrated into nanophase were used in your investigation, please give them credit as well :)
[2023-08-04 13:06:07] TASK: Long-read assembly starts (be patient)
[2023-08-04 13:14:58] DONE: long-read assembly finished sucessfully: detailed log file is ont-nanophase-out/01-LongAssemblies/flye.log
[2023-08-04 13:14:58] TASK: Initial binning::metabat2 binning starts
[2023-08-04 13:14:59] DONE: Initial binning::metabat2 binning finished sucessfully
MetaBAT 2 (v2.12.1) using minContig 2500, minCV 1.0, minCVSum 1.0, maxP 95%, minS 60, and maxEdges 200.
1 bins (2028321 bases in total) formed.
[2023-08-04 13:14:59] TASK: Initial binning::maxbin2 binning starts
[2023-08-04 13:15:12] DONE: Initial binning::maxbin2 binning finished sucessfully
Yielded 2 bins for contig (scaffold) file ont-nanophase-out/01-LongAssemblies/assembly.fasta
Here are the output files for this run.
Please refer to the README file for further details.
Summary file: ont-nanophase-out/02-LongBins/INITIAL_BINNING/maxbin2/bin.summary
Marker counts: ont-nanophase-out/02-LongBins/INITIAL_BINNING/maxbin2/bin.marker
Marker genes for each bin: ont-nanophase-out/02-LongBins/INITIAL_BINNING/maxbin2/bin.marker_of_each_gene.tar.gz
[2023-08-04 13:15:12] TASK: Initial binning::SemiBin binning starts
[2023-08-04 13:16:39] ERROR: Something wrong with SemiBin binning, please also check ont-nanophase-out/02-LongBins/INITIAL_BINNING/semibin/semibin.log, terminating...

kubradilek commented 1 year ago

I got past this problem by reinstalling v0.2.2, but now I get this error: 'ERROR: Something wrong with GTDB::Taxa process, terminating'

This happens even though I followed these steps:

download database: May skip if you have done before or GTDB and PLSDB have been downloaded in the server

wget https://data.gtdb.ecogenomic.org/releases/latest/auxillary_files/gtdbtk_v2_data.tar.gz && tar xvzf gtdbtk_v2_data.tar.gz
wget https://ccb-microbe.cs.uni-saarland.de/plsdb/plasmids/download/plsdb.fna.bz2 && bunzip2 plsdb.fna.bz2
conda activate nanophase

setting location

echo "export GTDBTK_DATA_PATH=/path/to/release/package/" > $(dirname $(dirname $(which nanophase)))/etc/conda/activate.d/np_db.sh

Change /path/to/release/package/ to the real location where you stored the GTDB

echo "export PLSDB_PATH=/path/to/plsdb.fna" >> $(dirname $(dirname $(which nanophase)))/etc/conda/activate.d/np_db.sh

Change /path/to/plsdb.fna to the real location where you stored the PLSDB

conda deactivate && conda activate nanophase ## require re-activate nanophase
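
One way to confirm that the two exports took effect after re-activating the environment is sketched below. It relies only on the variable names from the steps above (GTDBTK_DATA_PATH and PLSDB_PATH); the function name check_db_paths is made up for illustration and is not part of nanophase. It checks only that the paths exist, not that the databases are complete or match the installed GTDB-Tk version.

```shell
# Hypothetical helper (not part of nanophase): check that the database
# variables are set and point at existing paths.
check_db_paths() {
    rc=0
    # GTDBTK_DATA_PATH should be the directory holding the unpacked GTDB-Tk data.
    if [ -z "${GTDBTK_DATA_PATH:-}" ]; then
        echo "GTDBTK_DATA_PATH is not set"
        rc=1
    elif [ ! -d "$GTDBTK_DATA_PATH" ]; then
        echo "GTDBTK_DATA_PATH is not a directory: $GTDBTK_DATA_PATH"
        rc=1
    fi
    # PLSDB_PATH should be the decompressed plsdb.fna file.
    if [ -z "${PLSDB_PATH:-}" ]; then
        echo "PLSDB_PATH is not set"
        rc=1
    elif [ ! -f "$PLSDB_PATH" ]; then
        echo "PLSDB_PATH is not a file: $PLSDB_PATH"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "database paths look OK"
    fi
    return "$rc"
}
```

Running check_db_paths in the freshly activated environment prints one line per problem and returns a non-zero status if either variable is missing or points nowhere.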

Hydro3639 commented 1 year ago

Hello,

I would strongly suggest using v0.2.3; please see here.

The example dataset was too small and triggered the SemiBin error; sorry about this. Could you try nanophase v0.2.3 with a larger dataset, e.g., SRR17913199 (you can download it via: fastq-dump SRR17913199; more details about this dataset can be found in our manuscript)? Let me know if you run into any unexpected errors :)

kubradilek commented 1 year ago

Hi,

Thank you so much for replying. I sorted that out; however, I now get a new error, 'ERROR: Something wrong with GTDB::Taxa process, terminating', although I followed the instructions for exporting the GTDB-Tk database path.

Hydro3639 commented 1 year ago

May I know your nanophase and gtdbtk versions?

kubradilek commented 1 year ago

The nanophase version is v0.2.2 and the GTDB-Tk version is v2.3.2.

kubradilek commented 1 year ago

[2023-08-07 00:23:48] TASK: Genome taxa classification starts
cat: ont-nanophase-out/03-Polishing/Final-bins/tmp/gtdbtk.log: No such file or directory
[2023-08-07 00:23:50] DONE: genome classification done
cat: ont-nanophase-out/03-Polishing/Final-bins/tmp/classify/gtdbtk.*summary.tsv: No such file or directory
[2023-08-07 00:23:50] ERROR: Something wrong with GTDB::Taxa process, terminating...
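
Both cat failures in this log refer to files under the tmp directory that were never written, so a quick first check is whether any summary table exists there at all. The sketch below is only an illustration: the function name has_gtdbtk_summary is made up, and the directory layout (classify/gtdbtk.*summary.tsv under the tmp directory) is taken from the paths in the log rather than from GTDB-Tk documentation.

```shell
# Hypothetical helper: succeed if at least one GTDB-Tk summary table exists
# under DIR/classify. The file pattern is copied from the log above.
has_gtdbtk_summary() {
    for f in "$1"/classify/gtdbtk.*summary.tsv; do
        # If nothing matches, the pattern stays literal and -e fails.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}
```

For the run above this would be called as has_gtdbtk_summary ont-nanophase-out/03-Polishing/Final-bins/tmp; a failure means the classification step produced no results, whatever the DONE line says.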

Hydro3639 commented 1 year ago

Could you try v0.2.3? Newer gtdbtk releases added features that require a different default command line. (I will consider adding it in the next release)
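
For anyone scripting this check, a small version comparison might look like the sketch below. The 0.2.3 threshold comes from this thread; the function name version_ge is made up, and it relies on sort -V (version sort), which is a GNU coreutils feature. How the installed version string is obtained (for example from conda list output) is left to the reader, and any leading "v" should be stripped first.

```shell
# version_ge A B: succeed when version A is greater than or equal to version B.
# Uses `sort -V` (GNU coreutils) to order the two version strings.
version_ge() {
    # After a version sort, the smaller version comes first; A >= B exactly
    # when that first line is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

With the version reported earlier in this thread, version_ge 0.2.2 0.2.3 fails, which is the case where upgrading nanophase is advised.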

kubradilek commented 1 year ago

Hi, Thank you so much for helping me, I sorted it out and got the result :)

Finally, DONE: nanophase finished and have a nice day!