Closed gracegyho closed 8 months ago
Hello
You are right: sqm_annot.pl calls the main SqueezeMeta program but forgets to pass the specified number of threads, silly script!
To fix this, edit the sqm_annot.pl script as follows:
Change line 71 to:
my $result = GetOptions ("t=i" => \($numthreads=12),
Change line 115 to:
my $command="perl $scriptdir/SqueezeMeta.pl -s $tempsample -f $aadir -m sequential $edb $blockoption -t $numthreads --nopfam -c 0 --empty";
Best,
J
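Because line numbers can differ between installed versions, it may be easier to locate the two statements by content than by number. The following is a minimal shell sketch, not part of SqueezeMeta: the function name `locate_edits` is hypothetical, and it assumes the option-parsing statement contains `GetOptions` and the command statement starts with `my $command=`, as in the snippets above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "line_number:text" for the two statements
# to edit, so they can be found even if the line numbers differ.
# Assumption: the statements contain "GetOptions" and 'my $command='.
locate_edits() {
    local script="$1"
    grep -n -e 'GetOptions' -e 'my \$command=' "$script"
}
```

For example, `locate_edits "$(command -v sqm_annot.pl)"` would list the candidate lines. Making a backup copy of the script before editing is a sensible precaution.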
Nevertheless, issue #629 happened because there was just one sequence to annotate. Multithreading here works by dividing the input into blocks of sequences, so it failed because a single sequence cannot be divided. That's why it worked with -t 1. In your case, it seems to be a matter of RAM instead. Best, J
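If the failure mode described above for #629 is a concern (more threads than sequences), one possible workaround is to cap the thread count at the number of sequences before calling the script. This is only a sketch under that assumption: `safe_threads` is a hypothetical helper, not something SqueezeMeta provides, and it simply counts FASTA header lines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: cap a requested thread count at the number of
# sequences in a FASTA file, so that no thread is left without input.
# This is a workaround sketch, not something SqueezeMeta does itself.
safe_threads() {
    local fasta="$1" requested="$2" n
    # FASTA headers start with ">"; grep -c prints 0 (and exits non-zero)
    # when there are none, hence the "|| true".
    n=$(grep -c '^>' "$fasta" || true)
    n=${n:-0}
    if [ "$n" -lt 1 ]; then
        echo 1
    elif [ "$requested" -gt "$n" ]; then
        echo "$n"
    else
        echo "$requested"
    fi
}
```

The result could then be passed to the existing `-t` option, for example `-t "$(safe_threads proteins.faa 24)"`, where `proteins.faa` is a placeholder file name.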
So for whatever reason this was on line 111, and I ran it with your recommended changes (the line numbers didn't match, though).
Running `sqm_annot.pl -h`, I get the error:
Global symbol "$edb" requires explicit package name (did you forget to declare "my $edb"?) at /home/gho/miniconda3/envs/SqueezeMeta/bin/sqm_annot_backup.pl line 111.
Execution of /home/gho/miniconda3/envs/SqueezeMeta/bin/sqm_annot_backup.pl aborted due to compilation errors.
So I removed `$edb` from this line, and running `sqm_annot.pl -h` now gives the help message. I'll run it with my data again and see if it works. I also changed line 71 (which was also at a different line number). So far the config file in the project directory says numthreads=24, so there's one issue (sorta) fixed :)
Thanks so far! I'm amazed by your and @fpusan's responsiveness!
EDIT: Same error but now with 24 threads terminating abnormally.
I have 1.9 TB of RAM (at least in this interactive slurm session, it seems I forgot to specify memory, oops). So it's not a matter of me running out, right?
Hi again. Got caught in a crazy November and lost track of this. Are you still experiencing memory issues?
Closing due to lack of activity, hope you managed to fix this, otherwise feel free to reopen!
I am currently running sqm_annot.pl on a database with 494 amino acid sequences. Actually, I ran into problems running it first on about 240k sequences, but after crashing the first time I tried with a specific subset.
I seem to be getting an error similar to #629 with sqm_annot.pl
At first I tried to annotate all predicted proteins in a single metagenome (237394 sequences). Then it crashed with a similar message as below.
Then I subset the metagenome to 494 sequences of interest (basically, proteins which I detected from a different analysis) and ran it again.
Output
Output of `sqm_annot.pl -s samplefile_Day20200326_subset.txt -f Dummy_Dir/ -t 24 -b 16`:
```
SQM_annot v1.6.3, September 2023 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN
This is part of the SqueezeMeta distribution (https://github.com/jtamames/SqueezeMeta)
Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 10.3389 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349
Now I will call SqueezeMeta to do my stuff. Please hold on.
***
SqueezeMeta v1.6.3, September 2023 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN
Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349
Run started Fri Oct 27 11:01:06 2023 in sequential mode
1 metagenomes found: Day_20200326_mapped
--- SAMPLE Day_20200326_mapped ---
Now creating directories
Reading configuration from /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/SqueezeMeta_conf.pl
Running trimmomatic (Bolger et al 2014, Bioinformatics 30(15):2114-20) for quality filtering
Parameters:
Directory structure and conf files created. Exiting
Working with Day_20200326_mapped
Working with taxonomy database in /bioinf/home/gho/databases/SqueezeMeta/db/nr.dmnd
taxa COGS
Running Diamond (Buchfink et al 2015, Nat Methods 12, 59-60) for KEGG
Splitting Diamond file
Starting multithread LCA in 12 threads
DBD::SQLite::db prepare failed: Expression tree is too large (maximum depth 1000) at /home/gho/miniconda3/envs/SqueezeMeta/SqueezeMeta/scripts/06.lca.pl line 254.
Thread 1 terminated abnormally: DBD::SQLite::db prepare failed: Expression tree is too large (maximum depth 1000) at /home/gho/miniconda3/envs/SqueezeMeta/SqueezeMeta/scripts/06.lca.pl line 254.
Thread 2 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.2.m8
Thread 3 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.3.m8
Thread 4 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.4.m8
Thread 5 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.5.m8
Thread 6 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.6.m8
Thread 7 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.7.m8
Thread 8 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.8.m8
Thread 9 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.9.m8
Thread 10 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.10.m8
Thread 11 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.11.m8
Thread 12 terminated abnormally: Cannot open /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/temp/diamond_lca.12.m8
Creating /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/results/06.Day_20200326_mapped.fun3.tax.wranks file
Creating /scratch/gho/sqm_annot_Day-20200326_subset/Day_20200326_mapped/results/06.Day_20200326_mapped.fun3.tax.noidfilter.wranks file
Functional assignment for COGS KEGG
Taxonomic assignment stored in Day_20200326_mapped/results/06.Day_20200326_mapped.fun3.tax.wranks
COG functional assignment stored in Day_20200326_mapped/results/07.Day_20200326_mapped.fun3.cog
KEGG functional assignment stored in Day_20200326_mapped/results/07.Day_20200326_mapped.fun3.kegg
COG summary created in Day_20200326_mapped/results/COG.summary
KEGG summary created in Day_20200326_mapped/results/KEGG.summary
Have a nice day!
```
And at this point I came across #629 and thought to run it with the option -t 1, on one thread. I got the same errors, threads crashing etc. Furthermore, when I go into the project folder and check `SqueezeMeta_conf.pl`, `$numthreads` is set to 12, despite specifying -t 1. And actually, the config file of the run where t=24 also says `$numthreads` is 12.
sqm_annot.pl worked without problems on another faa file with 8214 sequences with parameters -t 24 -b 16. I checked the conf.pl file again here, and $numthreads is again =12 instead of 24.
Is there something wrong with the thread specification line in sqm_annot.pl?
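The check described above (comparing the requested `-t` value with what ends up in the project's config) can be scripted. The following is a minimal sketch: `conf_threads` is a hypothetical helper, and it assumes the config file contains a line resembling `$numthreads=12;`, which has not been verified against every SqueezeMeta version.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the numthreads value recorded in a
# SqueezeMeta project config file.
# Assumption: the file has a non-comment line resembling `$numthreads=12;`.
conf_threads() {
    local conf="$1"
    # Keep only the digits that follow "numthreads" and an "=" sign,
    # and report the first match.
    sed -n 's/^[^#]*numthreads[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf" | head -n 1
}
```

For example, `conf_threads Day_20200326_mapped/SqueezeMeta_conf.pl` would print the stored value, which can then be compared with the number given to `-t`.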