jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Stopping in STEP1 Died at /home/user/miniconda3/envs/squeezemeta/bin/SqueezeMeta.pl line 941 #887

Closed MSJennyLee closed 1 month ago

MSJennyLee commented 1 month ago

Hello, I'm running SqueezeMeta to analyze 36 metagenomes (6G each) in coassembly mode, and it failed at STEP1 after about 13 hours. The platform information is as follows:

Ubuntu 22.04.2 LTS
$ conda --version
conda 23.3.1
$ SqueezeMeta.pl -v
1.6.2, March 2023

The syslog of the failed task is:

Run started Thu Aug 29 13:07:04 2024 in coassembly mode

SqueezeMeta v1.6.2, March 2023 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN

Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349

Run started for hnmeta1, Thu Aug 29 13:07:04 2024
Project: hnmeta1
Map file: sampleID
Fastq directory: /home/bio508/NewDisk/gs/hn_meta_raw
Command: /home/bio508/miniconda3/envs/squeezemeta/bin/SqueezeMeta.pl -m coassembly -p hnmeta1 -s sampleID -f hn_meta_raw -miniden 50 -t 60 -taxbinmode c
[0 seconds]: STEP0 -> SqueezeMeta.pl
COGS; KEGG; PFAM;
[0 seconds]: STEP1 -> 01.run_all_assemblies.pl (megahit)
Stopping in STEP1 -> 01.run_all_assemblies.pl. Program finished abnormally


System information:


Tree for the project: 08/NewDisk/gs/hn_meta_raw/DK1_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK3_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK6_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK7_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK8_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX1_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX3_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/HNA2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK1_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK5_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK8_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/NGW3_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SL8_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST3_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST5_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SZY2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SZY3_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/WG1_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/WG4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD2_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD4_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD5_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD6_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD7_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD8_R1.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XYW3_R1.fastq.gz > /home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par1.fastq.gz Preparing files for pair2: cat /home/bio508/NewDisk/gs/hn_meta_raw/CZ3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/CZ4_R2.fastq.gz 
/home/bio508/NewDisk/gs/hn_meta_raw/DK1_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK6_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK7_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DK8_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX1_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/DX4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/HNA2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK1_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK5_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/LK8_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/NGW3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SL8_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SST5_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SZY2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/SZY3_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/WG1_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/WG4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD2_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD4_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD5_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD6_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD7_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XJD8_R2.fastq.gz /home/bio508/NewDisk/gs/hn_meta_raw/XYW3_R2.fastq.gz > /home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par2.fastq.gz Running assembly with megahit: perl /home/bio508/miniconda3/envs/squeezemeta/SqueezeMeta/lib/SqueezeMeta/assembly_megahit.pl 
/home/bio508/NewDisk/gs/hnmeta1 hnmeta1 /home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par1.fastq.gz /home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par2.fastq.gz
2024-08-29 13:43:58 - MEGAHIT v1.2.9
2024-08-29 13:43:58 - Using megahit_core with POPCNT and BMI2 support
2024-08-29 13:43:58 - Convert reads to binary library
2024-08-29 14:27:20 - b'INFO sequence/io/sequence_lib.cpp : 77 - Lib 0 (/home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par1.fastq.gz,/home/bio508/NewDisk/gs/hnmeta1/data/raw_fastq/par2.fastq.gz): pe, 1715377916 reads, 151 max length'
2024-08-29 14:27:20 - b'INFO utils/utils.h : 152 - Real: 2601.6937\tuser: 1868.8544\tsys: 516.4870\tmaxrss: 242500'
2024-08-29 14:27:20 - k-max reset to: 141
2024-08-29 14:27:20 - Start assembly. Number of CPU threads 60
2024-08-29 14:27:20 - k list: 21,29,39,59,79,99,119,141
2024-08-29 14:27:20 - Memory used: 364830783897
2024-08-29 14:27:20 - Extract solid (k+1)-mers for k = 21
2024-08-29 16:09:31 - Build graph for k = 21
2024-08-29 21:57:46 - Assemble contigs from SdBG for k = 21
2024-08-30 02:08:17 - Error occurs, please refer to /home/bio508/NewDisk/gs/hnmeta1/data/megahit/log for detail
2024-08-30 02:08:17 - Command: /home/bio508/miniconda3/envs/squeezemeta/SqueezeMeta/bin/megahit/megahit_core assemble -s /home/bio508/NewDisk/gs/hnmeta1/data/megahit/tmp/k21/21 -o /home/bio508/NewDisk/gs/hnmeta1/data/megahit/intermediate_contigs/k21 -t 60 --min_standalone 300 --prune_level 2 --merge_len 20 --merge_similar 0.95 --cleaning_rounds 5 --disconnect_ratio 0.1 --low_local_ratio 0.2 --cleaning_rounds 5 --min_depth 2 --bubble_level 2 --max_tip_len -1 --careful_bubble; Exit code -9
Linux sky 6.2.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Jul 13 16:27:29 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
[4.0K Aug 29 13:07] /home/bio508/NewDisk/gs/hnmeta1 ├── [ 31 Aug 29 13:07] creator.txt ├── [ 0 Aug 29 13:43] data │   ├── [1.9K Aug 29 13:07] 00.hnmeta1.samples │   ├── [4.0K Aug 29 13:43] megahit │   │   ├── [ 28 Aug 29 
21:57] checkpoints.txt │   │   ├── [ 0 Aug 29 13:43] intermediate_contigs │   │   ├── [1.4M Aug 30 02:08] log │   │   ├── [ 949 Aug 29 13:43] options.json │   │   └── [4.0K Aug 29 14:27] tmp │   │   ├── [ 28K Aug 29 21:57] k21 │   │   │   ├── [508K Aug 29 16:09] 21.counting │   │   │   ├── [1.6M Aug 29 16:09] 21.edges.info │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.0 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.1 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.10 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.11 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.12 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.13 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.14 │   │   │   ├── [3.4G Aug 29 21:56] 21.sdbg.15 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.16 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.17 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.18 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.19 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.2 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.20 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.21 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.22 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.23 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.24 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.25 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.26 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.27 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.28 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.29 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.3 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.30 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.31 │   │   │   ├── [3.4G Aug 29 21:56] 21.sdbg.32 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.33 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.34 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.35 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.36 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.37 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.38 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.39 │   │   │   ├── [3.4G Aug 29 21:56] 21.sdbg.4 │   │   │   ├── [3.3G Aug 29 
21:56] 21.sdbg.40 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.41 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.42 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.43 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.44 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.45 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.46 │   │   │   ├── [3.0G Aug 29 21:56] 21.sdbg.47 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.48 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.49 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.5 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.50 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.51 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.52 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.53 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.54 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.55 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.56 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.57 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.58 │   │   │   ├── [3.1G Aug 29 21:56] 21.sdbg.59 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.6 │   │   │   ├── [3.3G Aug 29 21:56] 21.sdbg.7 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.8 │   │   │   ├── [3.2G Aug 29 21:56] 21.sdbg.9 │   │   │   └── [2.2M Aug 29 21:56] 21.sdbg_info │   │   ├── [ 249 Aug 29 13:43] reads.lib │   │   ├── [ 69G Aug 29 14:27] reads.lib.bin │   │   └── [ 165 Aug 29 14:27] reads.lib.lib_info │   └── [ 0 Aug 29 13:25] raw_fastq │   ├── [ 52G Aug 29 13:25] par1.fastq.gz │   └── [ 54G Aug 29 13:43] par2.fastq.gz ├── [ 0 Aug 29 13:07] ext_tables ├── [ 0 Aug 29 13:07] intermediate │   └── [ 0 Aug 29 13:07] binners ├── [ 117 Aug 29 13:07] methods.txt ├── [3.1K Aug 29 13:07] parameters.pl ├── [ 37 Aug 29 13:07] progress ├── [ 0 Aug 29 13:07] results ├── [8.3K Aug 29 13:07] SqueezeMeta_conf.pl ├── [6.4K Aug 30 02:08] syslog └── [ 0 Aug 29 13:07] temp

12 directories, 78 files
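To put the figures from the MEGAHIT log above into more familiar units, here is a minimal Python sketch. It assumes the "Memory used" value is a byte count (the log does not state the unit) and uses the reported maximum read length, so the base count is only an upper bound.

```python
# Figures copied from the MEGAHIT log in this post.
memory_value = 364_830_783_897   # "Memory used" (assumed to be bytes)
reads = 1_715_377_916            # reads in the pooled library
max_read_len = 151               # "151 max length"


def to_gb(n_bytes: int) -> float:
    """Decimal gigabytes (10**9 bytes)."""
    return n_bytes / 1e9


def to_gib(n_bytes: int) -> float:
    """Binary gibibytes (2**30 bytes)."""
    return n_bytes / 1024**3


# Upper bound on sequenced bases: every read at the maximum length.
max_bases = reads * max_read_len

print(f"Memory figure: {to_gb(memory_value):.1f} GB = {to_gib(memory_value):.1f} GiB")
print(f"Pooled input: at most {max_bases / 1e9:.0f} Gbases")
```

Under these assumptions the pooled library is on the order of a few hundred gigabases, which gives a sense of scale for the coassembly.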

The sample list is provided in a file named sampleID, without a suffix. I prepared it by entering the necessary information in three separate columns of an Excel sheet, copying and pasting them into a Visual Studio Code window, and then copying and pasting them into a txt document, which I saved. The sampleID file looks like this:

CZ3 CZ3_R1.fastq.gz pair1
CZ3 CZ3_R2.fastq.gz pair2
CZ4 CZ4_R1.fastq.gz pair1
CZ4 CZ4_R2.fastq.gz pair2
DK1 DK1_R1.fastq.gz pair1
DK1 DK1_R2.fastq.gz pair2
DK2 DK2_R1.fastq.gz pair1
DK2 DK2_R2.fastq.gz pair2
DK3 DK3_R1.fastq.gz pair1
DK3 DK3_R2.fastq.gz pair2
DK4 DK4_R1.fastq.gz pair1
DK4 DK4_R2.fastq.gz pair2
DK6 DK6_R1.fastq.gz pair1
DK6 DK6_R2.fastq.gz pair2
DK7 DK7_R1.fastq.gz pair1
DK7 DK7_R2.fastq.gz pair2
LK1 LK1_R1.fastq.gz pair1
LK1 LK1_R2.fastq.gz pair2
LK2 LK2_R1.fastq.gz pair1
LK2 LK2_R2.fastq.gz pair2
LK5 LK5_R1.fastq.gz pair1
LK5 LK5_R2.fastq.gz pair2
SL8 SL8_R1.fastq.gz pair1
SL8 SL8_R2.fastq.gz pair2
WG1 WG1_R1.fastq.gz pair1
WG1 WG1_R2.fastq.gz pair2
WG4 WG4_R1.fastq.gz pair1
WG4 WG4_R2.fastq.gz pair2
DK8 DK8_R1.fastq.gz pair1
DK8 DK8_R2.fastq.gz pair2
DX1 DX1_R1.fastq.gz pair1
DX1 DX1_R2.fastq.gz pair2
DX2 DX2_R1.fastq.gz pair1
DX2 DX2_R2.fastq.gz pair2
DX3 DX3_R1.fastq.gz pair1
DX3 DX3_R2.fastq.gz pair2
DX4 DX4_R1.fastq.gz pair1
DX4 DX4_R2.fastq.gz pair2
LK4 LK4_R1.fastq.gz pair1
LK4 LK4_R2.fastq.gz pair2
LK8 LK8_R1.fastq.gz pair1
LK8 LK8_R2.fastq.gz pair2
XJD2 XJD2_R1.fastq.gz pair1
XJD2 XJD2_R2.fastq.gz pair2
XJD4 XJD4_R1.fastq.gz pair1
XJD4 XJD4_R2.fastq.gz pair2
XJD5 XJD5_R1.fastq.gz pair1
XJD5 XJD5_R2.fastq.gz pair2
XJD6 XJD6_R1.fastq.gz pair1
XJD6 XJD6_R2.fastq.gz pair2
XJD7 XJD7_R1.fastq.gz pair1
XJD7 XJD7_R2.fastq.gz pair2
XJD8 XJD8_R1.fastq.gz pair1
XJD8 XJD8_R2.fastq.gz pair2
HNA2 HNA2_R1.fastq.gz pair1
HNA2 HNA2_R2.fastq.gz pair2
NGW3 NGW3_R1.fastq.gz pair1
NGW3 NGW3_R2.fastq.gz pair2
SST2 SST2_R1.fastq.gz pair1
SST2 SST2_R2.fastq.gz pair2
SST3 SST3_R1.fastq.gz pair1
SST3 SST3_R2.fastq.gz pair2
SST4 SST4_R1.fastq.gz pair1
SST4 SST4_R2.fastq.gz pair2
SST5 SST5_R1.fastq.gz pair1
SST5 SST5_R2.fastq.gz pair2
SZY2 SZY2_R1.fastq.gz pair1
SZY2 SZY2_R2.fastq.gz pair2
SZY3 SZY3_R1.fastq.gz pair1
SZY3 SZY3_R2.fastq.gz pair2
XYW3 XYW3_R1.fastq.gz pair1
XYW3 XYW3_R2.fastq.gz pair2
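Because the file went through several copy-and-paste steps, a quick sanity check of its layout may be worthwhile. The following is a minimal Python sketch, assuming the samples file is tab-separated with the three columns described above (sample name, file name, pair1/pair2) and that every sample is paired-end; `check_samples` is a hypothetical helper, not part of SqueezeMeta.

```python
from collections import defaultdict


def check_samples(text: str) -> list[str]:
    """Return a list of problems found in a samples listing (hypothetical helper)."""
    problems = []
    seen = defaultdict(set)
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) < 3:
            # Typical symptom of tabs having been turned into spaces by an editor.
            problems.append(
                f"line {lineno}: expected 3 tab-separated fields, got {len(fields)}"
            )
            continue
        sample, _filename, pair = (f.strip() for f in fields[:3])
        if pair not in ("pair1", "pair2"):
            problems.append(f"line {lineno}: unexpected pair label {pair!r}")
            continue
        seen[sample].add(pair)
    # Assumption: paired-end data, so each sample needs both pair1 and pair2.
    for sample, pairs in seen.items():
        missing = {"pair1", "pair2"} - pairs
        if missing:
            problems.append(f"sample {sample}: missing {', '.join(sorted(missing))}")
    return problems


example = "CZ3\tCZ3_R1.fastq.gz\tpair1\nCZ3\tCZ3_R2.fastq.gz\tpair2\n"
print(check_samples(example))
```

In this thread the pipeline did read the file and concatenate the inputs, so the samples file was evidently not the cause of the failure; the check is only a convenience for similar setups.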

The test_install.pl test showed that all checks were successful:
$ /home/bio508/miniconda3/envs/squeezemeta/SqueezeMeta/utils/install_utils/test_install.pl

Checking the OS
linux OK

Checking that tree is installed
tree --help OK

Checking that ruby is installed
ruby -h OK

Checking that java is installed
java -h OK

Checking that all the required perl libraries are available in this environment
perl -e 'use Term::ANSIColor' OK
perl -e 'use DBI' OK
perl -e 'use DBD::SQLite::Constants' OK
perl -e 'use Time::Seconds' OK
perl -e 'use Tie::IxHash' OK
perl -e 'use Linux::MemInfo' OK
perl -e 'use Getopt::Long' OK
perl -e 'use File::Basename' OK
perl -e 'use DBD::SQLite' OK
perl -e 'use Data::Dumper' OK
perl -e 'use Cwd' OK
perl -e 'use XML::LibXML' OK
perl -e 'use XML::Parser' OK
perl -e 'use Term::ANSIColor' OK

Checking that all the required python libraries are available in this environment
python3 -h OK
python3 -c 'import numpy' OK
python3 -c 'import scipy' OK
python3 -c 'import matplotlib' OK
python3 -c 'import dendropy' OK
python3 -c 'import pysam' OK
python3 -c 'import Bio.Seq' OK
python3 -c 'import pandas' OK
python3 -c 'import sklearn' OK
python3 -c 'import nose' OK
python3 -c 'import cython' OK
python3 -c 'import future' OK

Checking that all the required R libraries are available in this environment
R -h OK
R -e 'library(doMC)' OK
R -e 'library(ggplot2)' OK
R -e 'library(data.table)' OK
R -e 'library(reshape2)' OK
R -e 'library(pathview)' OK
R -e 'library(DASTool)' OK
R -e 'library(SQMtools)' OK

Checking that SqueezeMeta is properly configured...
checking database in /home/bio508/NewDisk/database/Squeezemeta/db
nr.db OK
CheckM manifest OK
LCA_tax DB OK

All checks successful

After the failure with my data, I am now running the test data provided with the SqueezeMeta pipeline, and everything seems fine. It has now reached Step 4. The syslog of the test run:
(base) bio508@sky:~/NewDisk/database/Squeezemeta/test$ conda activate squeezemeta
(squeezemeta) bio508@sky:~/NewDisk/database/Squeezemeta/test$ SqueezeMeta.pl -m coassembly -p test -s test.samples -f mydir --nopfam -miniden 50

Can't find sample file /home/bio508/NewDisk/database/Squeezemeta/test/mydir/SRR1927149_1.fastq.gz for sample SRR1927149 in the samples file. Please check
(squeezemeta) bio508@sky:~/NewDisk/database/Squeezemeta/test$ SqueezeMeta.pl -m coassembly -p test -s test.samples -f raw --nopfam -miniden 50


SqueezeMeta v1.6.2, March 2023 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN

Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349

Run started Fri Aug 30 09:28:48 2024 in coassembly mode
2 metagenomes found: SRR1927149 SRR1929485

Now creating directories
Reading configuration from /home/bio508/NewDisk/database/Squeezemeta/test/test/SqueezeMeta_conf.pl
[0 seconds]: STEP1 -> RUNNING ASSEMBLY: 01.run_all_assemblies.pl (megahit)
Concatenating all samples: SRR1927149_1.fastq.gz SRR1927149_2.fastq.gz SRR1929485_1.fastq.gz SRR1929485_2.fastq.gz
pair1: cat /home/bio508/NewDisk/database/Squeezemeta/test/raw/SRR1927149_1.fastq.gz /home/bio508/NewDisk/database/Squeezemeta/test/raw/SRR1929485_1.fastq.gz > /home/bio508/NewDisk/database/Squeezemeta/test/test/data/raw_fastq/par1.fastq.gz
pair2: cat /home/bio508/NewDisk/database/Squeezemeta/test/raw/SRR1927149_2.fastq.gz /home/bio508/NewDisk/database/Squeezemeta/test/raw/SRR1929485_2.fastq.gz > /home/bio508/NewDisk/database/Squeezemeta/test/test/data/raw_fastq/par2.fastq.gz
Running assembly with megahit: perl /home/bio508/miniconda3/envs/squeezemeta/SqueezeMeta/lib/SqueezeMeta/assembly_megahit.pl /home/bio508/NewDisk/database/Squeezemeta/test/test test /home/bio508/NewDisk/database/Squeezemeta/test/test/data/raw_fastq/par1.fastq.gz /home/bio508/NewDisk/database/Squeezemeta/test/test/data/raw_fastq/par2.fastq.gz
Running prinseq (Schmieder et al 2011, Bioinformatics 27(6):863-4) for selecting contigs longer than 200
Input and filter stats:
Input sequences: 241,008
Input bases: 212,125,857
Input mean length: 880.16
Good sequences: 241,008 (100.00%)
Good bases: 212,125,857
Good mean length: 880.16
Bad sequences: 0 (0.00%)
Sequences filtered by specified parameters: none
Renaming contigs
Running prinseq for contig statistics: /home/bio508/miniconda3/envs/squeezemeta/SqueezeMeta/bin/prinseq-lite.pl -fasta /home/bio508/NewDisk/database/Squeezemeta/test/test/results/01.test.fasta -stats_len -stats_info -stats_assembly > /home/bio508/NewDisk/database/Squeezemeta/test/test/intermediate/01.test.stats
Counting length of contigs
Contigs stored in /home/bio508/NewDisk/database/Squeezemeta/test/test/results/01.test.fasta
Number of contigs: 241008
[54 minutes, 7 seconds]: STEP2 -> RNA PREDICTION: 02.rnas.pl
Running barrnap (Seeman 2014, Bioinformatics 30, 2068-9) for predicting RNAs: Bacteria Archaea Eukaryote Mitochondrial
Running RDP classifier (Wang et al 2007, Appl Environ Microbiol 73, 5261-7)
Running Aragorn (Laslett & Canback 2004, Nucleic Acids Res 31, 11-16) for tRNA/tmRNA prediction
[56 minutes, 47 seconds]: STEP3 -> ORF PREDICTION: 03.run_prodigal.pl
Running prodigal (Hyatt et al 2010, BMC Bioinformatics 11: 119) for predicting ORFs
ORFs predicted: 382149
[1 hours, 12 minutes, 1 seconds]: STEP4 -> HOMOLOGY SEARCHES: 04.rundiamond.pl
Setting block size for Diamond
AVAILABLE (free) RAM memory: 364.91 Gb
We will set Diamond block size to 16 (Gb RAM/8, Max 16). You can override this setting using the -b option when starting the project, or changing the $blocksize variable in SqueezeMeta_conf.pl
Working with taxonomy database in /home/bio508/NewDisk/database/Squeezemeta/db/nr.dmnd taxa
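The block-size rule quoted in the log ("Gb RAM/8, Max 16") can be reproduced with a few lines of Python. This is only a sketch of the arithmetic as stated in the log message; the function name and the choice to truncate to an integer are assumptions, not SqueezeMeta's actual code.

```python
def diamond_block_size(free_ram_gb: float, cap: int = 16) -> int:
    """Free RAM in Gb divided by 8, capped at `cap` (rounding down is an assumption)."""
    return min(int(free_ram_gb / 8), cap)


# Value reported in the log: 364.91 Gb free, so 364.91 / 8 is about 45.6, capped to 16.
print(diamond_block_size(364.91))
```

With the 364.91 Gb reported here the cap applies, which matches the block size of 16 shown in the log.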

What could be the cause of the failure? The server has 512G of RAM in total and about 11T of usable storage (24T in total). The Ubuntu system is installed on an SSD rather than on the 24T mechanical hard drive; miniconda3 is installed in the "/home/bio508" directory of the Ubuntu system, and the SqueezeMeta pipeline is installed in the miniconda3/envs directory.

The "df -h" command gives the following output:
(base) bio508@sky:~/NewDisk/database/Squeezemeta/test$ df -h
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            38G  3.8G   34G  11% /run
/dev/nvme0n1p4   33G   33G     0 100% /
tmpfs           189G  5.3M  189G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/nvme0n1p3   15G  272M   14G   2% /boot
/dev/nvme0n1p5  413G  223G  169G  57% /home
/dev/nvme0n1p1  975M   32M  943M   4% /boot/efi
/dev/sda3        22T   11T   11T  50% /home/bio508/NewDisk
tmpfs            38G  372K   38G   1% /run/user/1000
/dev/sdb1       932G  301G  631G  33% /media/bio508/Seagate Basic
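A listing like this can be scanned for nearly full filesystems with a short script. The sketch below parses a few rows copied from the output above; the 90% threshold and the function name are arbitrary choices for illustration. Note that it flags the root filesystem at 100%, while the project directory sits on /home/bio508/NewDisk; whether the full root partition played any role in the failure is not established in this thread.

```python
DF_OUTPUT = """\
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme0n1p4   33G   33G     0 100% /
/dev/nvme0n1p5  413G  223G  169G  57% /home
/dev/sda3        22T   11T   11T  50% /home/bio508/NewDisk
/dev/sdb1       932G  301G  631G  33% /media/bio508/Seagate Basic
"""


def nearly_full(df_text: str, threshold: int = 90) -> list[tuple[str, int]]:
    """Return (mount point, use %) for filesystems at or above the threshold."""
    flagged = []
    for line in df_text.splitlines()[1:]:      # skip the header row
        parts = line.split(None, 5)            # keep spaces inside the mount point
        if len(parts) < 6:
            continue
        use_pct = int(parts[4].rstrip("%"))
        if use_pct >= threshold:
            flagged.append((parts[5], use_pct))
    return flagged


print(nearly_full(DF_OUTPUT))
```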

Could you please identify possible causes of the failure and advise how to solve the problem?

Thanks in advance!

Best, Li

fpusan commented 1 month ago

Exit code -9 in MEGAHIT means that you ran out of memory during assembly. 512G RAM is large but still not enough for large coassemblies. So if the test run finishes correctly then I would not worry. You will need to assemble the different samples independently or do smaller coassemblies by grouping your samples based on some common characteristic.
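For readers wondering where the negative number comes from: the b'...' strings in the log suggest that the MEGAHIT wrapper is a Python script (an inference, not confirmed in this thread), and Python's subprocess module reports a child terminated by signal N as return code -N. Signal 9 is SIGKILL, which is what the Linux kernel's out-of-memory killer sends. A minimal sketch, POSIX only, with a hypothetical helper name:

```python
import signal
import subprocess
import sys


def describe_returncode(code: int) -> str:
    """Translate a subprocess return code into a readable description."""
    if code < 0:
        # Negative values mean the child was terminated by signal number -code.
        return f"killed by {signal.Signals(-code).name}"
    return f"exited with status {code}"


# Demonstration: a child process that sends SIGKILL to itself.
child = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
print(child.returncode, describe_returncode(child.returncode))
```

SIGKILL alone does not prove an out-of-memory condition, since anything can send it; the kernel log usually records OOM kills and can confirm the diagnosis.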

MSJennyLee commented 1 month ago

Thank you very much for your comments and solutions. The test run finished correctly, and I am now running in "merged" mode; it has assembled 12 metagenomes so far and is still running.
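For anyone who prefers the other suggestion, smaller coassemblies of grouped samples, the grouping itself can be scripted. The sketch below groups sample names by their leading letters; treating that prefix as a meaningful "common characteristic" (for example a sampling site) is purely an assumption for illustration, and `group_by_prefix` is a hypothetical helper.

```python
import re
from collections import defaultdict


def group_by_prefix(sample_names: list[str]) -> dict[str, list[str]]:
    """Group sample names by their leading alphabetic prefix (e.g. 'XJD2' -> 'XJD')."""
    groups = defaultdict(list)
    for name in sample_names:
        match = re.match(r"[A-Za-z]+", name)
        prefix = match.group(0) if match else name  # fall back to the full name
        groups[prefix].append(name)
    return dict(groups)


# A few of the sample names from the original post.
samples = ["CZ3", "CZ4", "DK1", "DK2", "XJD2", "XJD4", "SST2"]
for prefix, members in group_by_prefix(samples).items():
    print(prefix, members)
```

Each group could then get its own samples file and its own coassembly run, keeping the memory demand of any single assembly lower than that of the full 36-sample pool.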