jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Run stops at diamond process #559

Closed sakcham closed 1 year ago

sakcham commented 2 years ago

Hi, my run just stops at step 4 without any errors: the terminal freezes, and 2 days later the program crashes. I have updated to the latest version of SqueezeMeta, and test_install.pl works fine. I am also using merged mode to save memory. Please help!

jtamames commented 2 years ago

If the terminal freezes, likely you have run out of RAM. How much does your computer have?
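A quick way to answer this on Linux is to read MemAvailable from /proc/meminfo. The helper below is only a sketch: the function name `mem_available_gb` is made up for illustration, and it assumes a kernel that reports MemAvailable in kB.

```shell
# Print available memory in GB, read from a meminfo-style file.
# Defaults to /proc/meminfo (Linux only); MemAvailable is reported in kB.
# The function name is hypothetical, not part of SqueezeMeta.
mem_available_gb() {
    meminfo="${1:-/proc/meminfo}"
    awk '/^MemAvailable:/ { printf "%.1f\n", $2 / 1024 / 1024 }' "$meminfo"
}
```

Calling `mem_available_gb` with no argument just before launching the pipeline gives a figure that can be compared with the "Free Mem" value SqueezeMeta prints at step 4.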

sakcham commented 2 years ago

I have 68 GB of RAM! Is that not enough? Will the low-memory settings help? Can they be applied at restart, or do I need to run the whole thing again?

sakcham commented 2 years ago

I suspect it is some other bug. This is what my methods file looks like:

Analysis done with SqueezeMeta v1.6.0, September 2022 (Tamames & Puente-Sanchez 2019, Frontiers in Microbiology 9, 3349)
Contig statistics were done using prinseq (Schmieder et al 2011, Bioinformatics 27(6):863-4)
Contig statistics were done using prinseq (Schmieder et al 2011, Bioinformatics 27(6):863-4)
RNAs were predicted using Barrnap (Seeman 2014, Bioinformatics 30, 2068-9)
16S rRNA sequences were taxonomically classified using the RDP classifier (Wang et al 2007, Appl Environ Microbiol 73, 5261-7)
tRNA/tmRNA sequences were predicted using Aragorn (Laslett & Canback 2004, Nucleic Acids Res 31, 11-16)
ORFs were predicted using Prodigal (Hyatt et al 2010, BMC Bioinformatics 11: 119)
Similarity searches for
Similarity searches for
Similarity searches for
Similarity searches for

This is what the system log looks like:

[3 hours, 25 minutes, 37 seconds]: STEP4 -> 04.rundiamond.pl
Diamond block size set to 7.4 (Free Mem 58.90 Gb)
Running Diamond for taxa: /home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond blastp -q /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.faa -p 12 -d /home/poorna/sakcham/database/db/nr.dmnd -e 0.001 --id 40 -f tab -b 7.4 --quiet -o /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.nr.diamond

This ran for almost 48 hours; for 24 of them the system was not responding.

fpusan commented 2 years ago

Did you have any other processes running apart from SqueezeMeta? Can you run the following command?

/home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond blastp -q /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.faa -p 12 -d /home/poorna/sakcham/database/db/nr.dmnd -e 0.001 --id 40 -f tab -b 2 -o /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.nr.diamond

This will hopefully use less RAM, and it will also report DIAMOND's progress.
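For a rough idea of why lowering `-b` helps: the DIAMOND manual gives a rule of thumb that memory use is about six times the block size, in GB. The sketch below only applies that rule of thumb; the function name is hypothetical, and real usage also depends on the database, the thread count, and other options.

```shell
# Rough peak-memory estimate for DIAMOND in GB: about 6x the block size (-b).
# This is the manual's rule of thumb, not a measurement.
diamond_mem_estimate_gb() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 6 }'
}
```

By this estimate `-b 7.4` would need around 44 GB and `-b 2` around 12 GB, which leaves much more headroom on a 68 GB machine.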

sakcham commented 2 years ago

Running now! Will update the progress!

sakcham commented 2 years ago

Works! The run finished! How do I get the rest of the pipeline to run? Do I just restart from step 5, or do I run diamond for KEGG and COGs first and then restart from step 5?

jtamames commented 2 years ago

Yes, run diamond for cogs and kegg and restart in step 05. Best, J

sakcham commented 2 years ago

Thanks!

fpusan commented 2 years ago

Calling 04.rundiamond.pl Plastic_trial_merged 1 should run the script while skipping the search against nr (which you did manually). Restarting from step 5 should then work.
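Before restarting from step 5 it may be worth confirming that all three step-4 result files are in place. A minimal sketch, assuming the file naming seen in this project's intermediate directory (04.PROJECT.nr.diamond, 04.PROJECT.eggnog.diamond, 04.PROJECT.kegg.diamond); the function name is made up:

```shell
# Report whether the three step-4 DIAMOND outputs exist and are non-empty.
# $1 = the project's intermediate directory, $2 = project name.
# Returns non-zero if any file is missing or empty.
check_step4_outputs() {
    dir="$1"; project="$2"; status=0
    for db in nr eggnog kegg; do
        f="$dir/04.$project.$db.diamond"
        if [ -s "$f" ]; then
            echo "OK      $f"
        else
            echo "MISSING $f"
            status=1
        fi
    done
    return "$status"
}
```

For example: `check_step4_outputs /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate Plastic_trial_merged`. A non-empty file only shows that DIAMOND wrote something; it does not validate the contents.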

sakcham commented 2 years ago

Hi, the program stopped at step 7 with the following error:

[2 days, 23 hours, 46 minutes, 20 seconds]: STEP7 -> 07.fun3assign.pl
Reading COGs hits from /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond
Output in /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/07.Plastic_trial_merged.fun3.cog
Reading COGs hits from /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond
Stopping in STEP7 -> 07.fun3assign.pl. Program finished abnormally

I cannot figure out the problem, please help!

sakcham commented 2 years ago

Error shown on the command line:

Functional assignment for COGS KEGG
Illegal division by zero at /home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/scripts/07.fun3assign.pl line 155, line 0.000000.
Stopping in STEP7 -> 07.fun3assign.pl. Program finished abnormally

sakcham commented 2 years ago

Some more insight: step 7 fails for the COG and KEGG assignments (whose diamond searches I ran manually, as discussed above). The 07.Plastic_trial_merged.fun3.cog and 07.Plastic_trial_merged.fun3.kegg files are empty. It works for the PFAM annotations, which were produced with HMMER in step 5 (after restarting following the manual run of step 4). Maybe it has something to do with running the diamond step manually?

I ran them using the command

/home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond blastp -q /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.faa -p 12 -d /home/poorna/sakcham/database/db/eggnog -e 0.001 --id 40 -f tab -b 2 -o /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond

and

/home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond blastp -q /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.faa -p 12 -d /home/poorna/sakcham/database/db/keggdb -e 0.001 --id 40 -f tab -b 2 -o /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.kegg.diamond

and then ran

SqueezeMeta.pl -p Plastic_trial_merged --restart -step 5

jtamames commented 2 years ago

Hello. Please check that the diamond runs finished correctly. What is the output of the following command?

ls -l /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/

fpusan commented 2 years ago

Reading COGs hits from /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond
Output in /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/07.Plastic_trial_merged.fun3.cog
Reading COGs hits from /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond

I just noticed that this actually worked for the COGs but failed for the KEGG annotations (it says that it is reading the COG hits twice, but this is a typo on our part, which I just corrected in 33815c1). So running things manually apparently worked, at least for the COGs? Just to confirm this, can you check whether there is valid output in /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/07.Plastic_trial_merged.fun3.cog?

sakcham commented 2 years ago

Here is the output of ls -l on the intermediate directory:

total 24142792
-rw-rw-r-- 1 poorna poorna 82900323 Oct 10 14:26 01.Plastic_trial_merged.lon
-rw-rw-r-- 1 poorna poorna 318 Oct 10 14:25 01.Plastic_trial_merged.stats
-rw-rw-r-- 1 poorna poorna 2880872067 Oct 10 15:06 02.Plastic_trial_merged.maskedrna.fasta
-rw-rw-r-- 1 poorna poorna 7246606227 Oct 18 16:40 04.Plastic_trial_merged.eggnog.diamond
-rw-rw-r-- 1 poorna poorna 5427433247 Oct 18 18:37 04.Plastic_trial_merged.kegg.diamond
-rw-rw-r-- 1 poorna poorna 8228320883 Oct 15 12:21 04.Plastic_trial_merged.nr.diamond
-rw-rw-r-- 1 poorna poorna 856046820 Oct 20 08:52 05.Plastic_trial_merged.pfam.hmm
drwxrwxr-x 2 poorna poorna 4096 Oct 10 14:22 binners
-rw-rw-r-- 1 poorna poorna 56 Oct 12 12:18 DB_BUILD_DATE

I think the runs finished correctly!

sakcham commented 2 years ago

Regarding the question about whether 07.Plastic_trial_merged.fun3.cog contains valid output: no. The files were created but are empty, with only headers, for both COG and KEGG. It worked for PFAM (which I did not run manually).
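One way to tell a header-only file from one with real assignments is to count the lines that are not headers. This sketch assumes that header lines in the fun3 files start with `#`, which is not confirmed anywhere in this thread; the function name is made up.

```shell
# Count lines that are neither blank nor header/comment lines starting with '#'.
# Assumption: headers in the file begin with '#'.
count_data_lines() {
    awk '!/^#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

A result of 0 for 07.Plastic_trial_merged.fun3.cog or 07.Plastic_trial_merged.fun3.kegg would match the "only headers" situation described here.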

sakcham commented 2 years ago

I tried to restart the run from step 7 after deleting the created files, but I still get the same error.

SqueezeMeta.pl -p Plastic_trial_merged --restart -step 7

SqueezeMeta v1.6.0, September 2022 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN

Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349

Run started Mon Nov 7 13:49:51 2022 in merged mode
14 metagenomes found: LCK_D3_GLASS LCK_D3_HDPE LCK_D3_OXO LCK_D3_PET LCK_D3_PP LCK_D3_S LCK_D3_W LCK_D14_GLASS LCK_D14_HDPE LCK_D14_OXO LCK_D14_PET LCK_D14_PP LCK_D14_S LCK_D14_W

[0 seconds]: STEP7 -> FUNCTIONAL ASSIGNMENT: 07.fun3assign.pl
Functional assignment for COGS KEGG
Illegal division by zero at /home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/scripts/07.fun3assign.pl line 155, line 0.000000.
Stopping in STEP7 -> 07.fun3assign.pl. Program finished abnormally

jtamames commented 2 years ago

OK, now I see. The way of running diamond is different for COGs and KEGG. Run it this way for COGs:

/home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond blastp -q /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.faa -p 12 -d /home/poorna/sakcham/database/db/eggnog -e 1e-03 --id 40 --quiet -b 2 -f 6 qseqid qlen sseqid slen pident length evalue bitscore qstart qend sstart send -o /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/04.Plastic_trial_merged.eggnog.diamond
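Both layouts have 12 tab-separated columns, so counting columns will not tell them apart. With the field list given in the command above, columns 2 and 4 hold qlen and slen, whereas DIAMOND's default tabular output has the subject ID in column 2. The check below is a heuristic sketch built on that difference (the function name is made up, and a purely numeric subject ID would fool it):

```shell
# Heuristic check of a DIAMOND result file, based on its first line only.
# Prints "custom" if the line has 12 columns with integers in columns 2
# (qlen) and 4 (slen), otherwise "default-like".
diamond_layout() {
    awk -F '\t' 'NR == 1 {
        if (NF == 12 && $2 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/) print "custom"
        else print "default-like"
        exit
    }' "$1"
}
```

Running it on 04.Plastic_trial_merged.eggnog.diamond before restarting step 7 shows whether the file still comes from the earlier `-f tab` run.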

And the same for KEGGs, changing the database. Best, J
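To avoid copy-paste slips between the two runs, the command can be generated from one template. This is a dry-run sketch that only prints the commands; the paths and the database names (eggnog, keggdb) are taken from the commands posted earlier in this thread, and the function and variable names are made up.

```shell
# Print (do not run) the DIAMOND command for one functional database.
# $1 = database name under DB_DIR, $2 = tag used in the output file name.
DIAMOND=/home/poorna/anaconda3/envs/SqueezeMeta_Oct2022/SqueezeMeta/bin/diamond
DB_DIR=/home/poorna/sakcham/database/db
PROJ_DIR=/home/poorna/meta_plastic_trial/Plastic_trial_merged
PROJECT=Plastic_trial_merged

print_diamond_cmd() {
    echo "$DIAMOND blastp -q $PROJ_DIR/results/03.$PROJECT.faa -p 12" \
         "-d $DB_DIR/$1 -e 1e-03 --id 40 --quiet -b 2" \
         "-f 6 qseqid qlen sseqid slen pident length evalue bitscore qstart qend sstart send" \
         "-o $PROJ_DIR/intermediate/04.$PROJECT.$2.diamond"
}

print_diamond_cmd eggnog eggnog   # COGs
print_diamond_cmd keggdb kegg     # KEGG
```

Once the printed lines look right, they can be pasted into the terminal one at a time.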

sakcham commented 2 years ago

OK, that worked, but now there is a new error!

Reading samples from /home/poorna/meta_plastic_trial/Plastic_trial_merged/data/00.Plastic_trial_merged.samples
Metagenomes found: 14
Mapping with Bowtie2 (Langmead and Salzberg 2012, Nat Methods 9(4), 357-9)
Creating reference from contigs
Working with sample 1: LCK_D3_GLASS
Getting raw reads
Aligning to reference with bowtie
Calculating contig coverage
Reading contig length from /home/poorna/meta_plastic_trial/Plastic_trial_merged/intermediate/01.Plastic_trial_merged.lon
Counting with sqm_counter: Opening 12 threads
3152011 reads counted
6304021 reads counted
9456031 reads counted
Stopping in STEP10 -> 10.mapsamples.pl. Program finished abnormally

In the system log:

Calling sqm_counter: Sample LCK_D3_GLASS, SAM /home/poorna/meta_plastic_trial/Plastic_trial_merged/data/sam/Plastic_trial_merged.LCK_D3_GLASS.sam, Number of reads 37824124, GFF /home/poorna/meta_plastic_trial/Plastic_trial_merged/results/03.Plastic_trial_merged.gff
Stopping in STEP10 -> 10.mapsamples.pl. Program finished abnormally

jtamames commented 1 year ago

Sorry, somehow I lost track of this issue. If you are still dealing with this, please tell me what you can see in the temp directory. Look for files with names like count.1, count.2... and check their sizes. Best, J
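Something like the following would list those files with their sizes in bytes. It is only a sketch: the function name is made up, and the directory is passed as an argument because the exact location of the temp directory is not given in this thread.

```shell
# List count.N files in a directory with their sizes in bytes,
# one "name<TAB>size" pair per line. Prints nothing if there are none.
list_count_files() {
    for f in "$1"/count.*; do
        [ -e "$f" ] || continue      # the glob matched nothing
        printf '%s\t%s\n' "$(basename "$f")" "$(wc -c < "$f" | tr -d ' ')"
    done
}
```

Files of size 0, or one file much smaller than the rest, would be the first thing to report back.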

fpusan commented 1 year ago

Closing due to lack of activity, feel free to reopen!