Calculation of alignments step died

AlexSongh commented 2 years ago

Hi, a screenshot of the error log is below.

And my HPC configuration is:

#!/bin/bash

#SBATCH --partition=largemem
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --tasks-per-node=1
#SBATCH --mem=800G
#SBATCH --time=60:00:00

source ~/.bashrc
conda activate metaeuk
cd /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification
mkdir -p MetaEuk_results
mkdir -p tmp

EUK_PROFILES="/nfs/turbo/cee-raskin/hangsong/metaeuk_db/MMETSP_uniclust50_MERC_profiles"
metaeuk easy-predict contigs_includingBioFilterEffluent.EUK_KmerMajority_min3000bp_CAT_min1000bp.fasta "${EUK_PROFILES}" MetaEuk_results/predRedRedund_prot.fasta tmp --metaeuk-eval 0.0001 -e 100 --exhaustive-search 1 --min-ungapped-score 35 --min-length 40 --threads 16 --split-memory-limit 300G --min-exon-aa 20 --metaeuk-tcov 0.6 --local-tmp tmp --disk-space-limit 800G;

Do you know what might cause this, and is there a solution? Is there a problem with any of the flags? Thanks!

milot-mirdita commented 2 years ago

Can you please post the full log output of the job?

AlexSongh commented 2 years ago

Yes, below is my full log. Thank you @milot-mirdita !

easy-predict contigs_includingBioFilterEffluent.EUK_KmerMajority_min3000bp_CAT_min1000bp.fasta /nfs/turbo/cee-raskin/hangsong/metaeuk_db/MMETSP_uniclust50_MERC_profiles MetaEuk_results/predRedRedund_prot.fasta tmp --metaeuk-eval 0.0001 -e 100 --exhaustive-search 1 --min-ungapped-score 35 --min-length 40 --threads 16 --split-memory-limit 300G --min-exon-aa 20 --metaeuk-tcov 0.6 --local-tmp tmp --disk-space-limit 800G

MMseqs Version: 6.a5d39d9 Substitution matrix aa:blosum62.out,nucl:nucleotide.out Add backtrace false Alignment mode 2 Alignment mode 0 Allow wrapped scoring false E-value threshold 100 Seq. id. threshold 0 Min alignment length 0 Seq. id. mode 0 Alternative alignments 0 Coverage threshold 0 Coverage mode 0 Max sequence length 65535 Compositional bias 1 Compositional bias 1 Max reject 2147483647 Max accept 2147483647 Include identical seq. id. false Preload mode 0 Pseudo count a substitution:1.100,context:1.400 Pseudo count b substitution:4.100,context:5.800 Score bias 0 Realign hits false Realign score bias -0.2 Realign max seqs 2147483647 Correlation score weight 0 Gap open cost aa:11,nucl:5 Gap extension cost aa:1,nucl:2 Zdrop 40 Threads 16 Compressed 0 Verbosity 3 Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out Sensitivity 4 k-mer length 0 k-score seq:2147483647,prof:2147483647 Alphabet size aa:21,nucl:5 Max results per query 300 Split database 0 Split mode 2 Split memory limit 300G Diagonal scoring true Exact k-mer matching 0 Mask residues 1 Mask residues probability 0.9 Mask lower case residues 0 Minimum diagonal score 35 Spaced k-mers 1 Spaced k-mer pattern
Local temporary path tmp Rescore mode 0 Remove hits by seq. id. and coverage false Sort results 0 Mask profile 1 Profile E-value threshold 0.001 Global sequence weighting false Allow deletions false Filter MSA 1 Use filter only at N seqs 0 Maximum seq. id. threshold 0.9 Minimum seq. id. 0.0 Minimum score per column -20 Minimum coverage 0 Select N most diverse seqs 1000 Pseudo count mode 0 Gap pseudo count 10 Min codons in orf 40 Max codons in length 32734 Max orf gaps 2147483647 Contig start mode 2 Contig end mode 2 Orf start mode 1 Forward frames 1,2,3 Reverse frames 1,2,3 Translation table 1 Translate orf 0 Use all table starts false Offset of numeric ids 0 Create lookup 0 Add orf stop false Overlap between sequences 0 Sequence split mode 1 Header split mode 0 Chain overlapping alignments 0 Merge query 1 Search type 0 Search iterations 1 Start sensitivity 4 Search steps 1 Exhaustive search mode true Filter results during exhaustive search 0 Strand selection 1 LCA search mode false Disk space limit 800G MPI runner
Force restart with latest tmp false Remove temporary files false maximal combined evalue of an optimal set 0.0001 minimal length ratio between combined optimal set and target 0.6 Maximal intron length 10000 Minimal intron length 15 Minimal exon length aa 20 Maximal overlap of exons 10 Maximal number of exon sets 1 Gap open penalty -1 Gap extend penalty -1 allow same-strand overlaps 0 translate codons to AAs 0 write target key instead of accession 0 write fragment contig coords 0 Reverse AA Fragments 0

createdb contigs_includingBioFilterEffluent.EUK_KmerMajority_min3000bp_CAT_min1000bp.fasta tmp/589055901896411739/contigs --dbtype 2 --compressed 0 -v 3

Converting sequences [= Time for merging to contigs_h: 0h 0m 0s 16ms Time for merging to contigs: 0h 0m 0s 53ms Database type: Nucleotide Time for processing: 0h 0m 0s 542ms Create directory tmp/589055901896411739/tmp_predict Enforcing exhaustive profile search mode due to profile target database predictexons tmp/589055901896411739/contigs /nfs/turbo/cee-raskin/hangsong/metaeuk_db/MMETSP_uniclust50_MERC_profiles tmp/589055901896411739/MetaEuk_calls tmp/589055901896411739/tmp_predict --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 100 --min-seq-id 0 --min-aln-len 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 16 --compressed 0 -v 3 --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 4 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 300G --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 35 --spaced-kmer-mode 1 --local-tmp tmp --rescore-mode 0 --filter-hits 0 --sort-results 0 --mask-profile 1 --e-profile 0.001 --wg 0 --allow-deletion 0 --filter-msa 1 --filter-min-enable 0 --max-seq-id 0.9 --qid '0.0' --qsc -20 --cov 0 --diff 1000 --pseudo-cnt-mode 0 --gap-pc 10 --min-length 40 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 0 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --add-orf-stop 0 --sequence-overlap 0 
--sequence-split-mode 1 --headers-split-mode 0 --chain-alignments 0 --merge-query 1 --search-type 0 --num-iterations 1 --start-sens 4 --sens-steps 1 --exhaustive-search 1 --exhaustive-search-filter 0 --strand 1 --lca-search 0 --disk-space-limit 800G --force-reuse 0 --remove-tmp-files 0 --metaeuk-eval 0.0001 --metaeuk-tcov 0.6 --max-intron 10000 --min-intron 15 --min-exon-aa 20 --max-overlap 10 --max-exon-sets 1 --set-gap-open -1 --set-gap-extend -1 --reverse-fragments 0

extractorfs /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/contigs /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/nucl_6f --min-length 40 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 0 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --threads 16 --compressed 0 -v 3

[=================================================================] 15.20K 0s 406ms Time for merging to nucl_6f_h: 0h 0m 0s 630ms Time for merging to nucl_6f: 0h 0m 0s 858ms Time for processing: 0h 0m 2s 372ms translatenucs /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/nucl_6f /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/aa_6f --translation-table 1 --add-orf-stop 0 -v 3 --compressed 0 --threads 16

[=================================================================] 470.66K 0s 958ms Time for merging to aa_6f: 0h 0m 0s 503ms Time for processing: 0h 0m 1s 536ms Create directory /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search search /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/aa_6f /nfs/turbo/cee-raskin/hangsong/metaeuk_db/MMETSP_uniclust50_MERC_profiles /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/search_res /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 2 --alignment-output-mode 0 --wrapped-scoring 0 -e 100 --min-seq-id 0 --min-aln-len 20 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 16 --compressed 0 -v 3 --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 4 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 300G --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 35 --spaced-kmer-mode 1 --local-tmp tmp --rescore-mode 0 --filter-hits 0 --sort-results 0 --mask-profile 1 --e-profile 
0.001 --wg 0 --allow-deletion 0 --filter-msa 1 --filter-min-enable 0 --max-seq-id 0.9 --qid '0.0' --qsc -20 --cov 0 --diff 1000 --pseudo-cnt-mode 0 --gap-pc 10 --min-length 40 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 0 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --add-orf-stop 0 --sequence-overlap 0 --sequence-split-mode 1 --headers-split-mode 0 --chain-alignments 0 --merge-query 1 --search-type 0 --num-iterations 1 --start-sens 4 --sens-steps 1 --exhaustive-search 1 --exhaustive-search-filter 0 --strand 1 --lca-search 0 --disk-space-limit 800G --force-reuse 0 --remove-tmp-files 0

prefilter /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/profileDB /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/aa_6f /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/pref --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -s 4 -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 470660 --split 0 --split-mode 2 --split-memory-limit 300G -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 35 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --local-tmp tmp --threads 16 --compressed 0 -v 3

Query database size: 73003 type: Profile
Estimated memory consumption: 1G
Target database size: 470660 type: Aminoacid
Index table k-mer threshold: 0 at k-mer size 6
Index table: counting k-mers
[=================================================================] 470.66K 0s 931ms
Index table: Masked residues: 7546309
Index table: fill
[=================================================================] 470.66K 0s 608ms
Index statistics
Entries: 40256141
DB size: 718 MB
Avg k-mer size: 0.629002
Top 10 k-mers
RRRRRR 2160
RRGRRR 1328
GRRRRR 1311
PRRRRR 1108
RPRRRR 1050
ARRRRR 1038
RGGRRR 949
GRGRRR 860
GGRRRR 780
ARGRRR 691
Time for index table init: 0h 0m 1s 993ms
Process prefiltering step 1 of 1

k-mer similarity threshold: 109
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 73003
Target db start 1 to 470660
[=================================================================] 73.00K 6h 51m 49s 207ms

573653.309786 k-mers per position 63152310 DB matches per sequence 25122 overflows 0 queries produce too many hits (truncated result) 36273 sequences passed prefiltering per query sequence 0 median result list length 45110 sequences with 0 size result lists Time for merging to pref: 0h 0m 0s 128ms Time for processing: 6h 55m 11s 449ms result2stats /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/profileDB /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/aa_6f /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/pref /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/pref_count.tsv --stat linecount --tsv --threads 16 --compressed 0 -v 3

[=================================================================] 73.00K 5m 25s 242ms Time for merging to pref_count.tsv: 0h 0m 0s 160ms Time for processing: 0h 5m 27s 14ms Score of forward/backward SW differ: 509 508. Q: 6796 T: 391823. Start: Q: 3, T: 8. End: Q: 164, T 165 align /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/profileDB /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/aa_6f /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/pref /scratch/raskin_root/raskin0/hangsong/Ann_Arbor.UMich/contigs_includingBioFilterEffluent_EUK_classification/tmp/589055901896411739/tmp_predict/9835964812102503863/tmp_search/13289217149420800099/aln --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' -a 0 --alignment-mode 2 --alignment-output-mode 1 --wrapped-scoring 0 -e 0.534933 --min-seq-id 0 --min-aln-len 20 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --comp-bias-corr 1 --comp-bias-corr-scale 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --score-bias 0 --realign 0 --realign-score-bias -0.2 --realign-max-seqs 2147483647 --corr-score-weight 0 --gap-open aa:11,nucl:5 --gap-extend aa:1,nucl:2 --zdrop 40 --threads 16 --compressed 0 -v 3

Compute score and coverage
Query database size: 73003 type: Profile
Target database size: 470660 type: Aminoacid
Calculation of alignments
[Score of forward/backward SW differ: 258 256. Q: 28448 T: 243727. Start: Q: 3, T: 0. End: Q: 90, T 102
Score of forward/backward SW differ: 2522 2530. Q: 6797 T: 61775. Start: Q: 15, T: 0. End: Q: 214, T 191
Error: align died
Error: search step died
Error: predictexons step died
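For anyone triaging a log as long as the one above, the failure chain can be pulled out with a couple of grep calls. This is a minimal sketch, assuming the job output was saved to a file (the name `metaeuk.log` is hypothetical) and that failures are reported as `Error: <step> died`, as they are in this log. The function names `failed_steps` and `sw_warnings` are illustrative and not part of MetaEuk or MMseqs2.

```bash
#!/bin/bash
# Illustrative helpers (not part of MetaEuk/MMseqs2) for summarizing a saved job log.

# failed_steps LOGFILE
# Print every "Error: <step> died" message, one per line, in the order it appears.
# grep -o is used because the message can follow a progress bar on the same line,
# e.g. "[=======Error: align died".
failed_steps() {
    grep -oE 'Error: [A-Za-z0-9_]+( step)? died' "$1"
}

# sw_warnings LOGFILE
# Count the "Score of forward/backward SW differ" warnings in the log.
sw_warnings() {
    grep -o 'Score of forward/backward SW differ' "$1" | wc -l | tr -d ' '
}

# Example usage (hypothetical file name):
#   failed_steps metaeuk.log
#   sw_warnings metaeuk.log
```

The first message printed by `failed_steps` is the innermost module that failed (here `align`); the following lines are the workflows that wrapped it.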

Tamtatatam commented 2 years ago

...I'll join this thread, since I came across exactly this problem today :). I fear it might be about the db size. A smaller db worked without problems for me. Would love to hear if it is possible to get around this problem.

AlexSongh commented 2 years ago

Thank you so much for your reply @Tamtatatam ! May I ask which protein db you used? Or did you modify the original protein profile db in any way? Thank you!

Tamtatatam commented 2 years ago

I used EggNOG for the run that failed. The EggNOG db was created with MMseqs2, using `mmseqs databases eggNOG /my/db/path/EggNOG tmp` followed by `mmseqs createdb /my/db/path/EggNOG targetDB`.

For the test runs that worked without any issues, I used smaller custom-made reference datasets (subsets of NR) and ran MetaEuk either directly on the fasta or after first creating a database. That's why I assumed it is about the size, but maybe it is about how the db is created?

AlexSongh commented 2 years ago

@Tamtatatam Thank you for your reply! How big are your subsets of NR? I used the protein profile database from this link (https://wwwuser.gwdg.de/~compbiol/metaeuk/2019_11/), and it is 295 G.

How did you make the smaller NR subset? If it's a smaller db, will some of the results be compromised since they won't be found and aligned? Thank you!

milot-mirdita commented 2 years ago

This is not supposed to happen. Can you downgrade to the previous metaeuk version while we investigate what's going on? The previous one should work correctly with the profile database we provide.
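For readers who want to try the suggested downgrade, the steps might look like the sketch below. It assumes MetaEuk was installed through conda from the bioconda channel (an assumption; the thread only says "conda"). The helper name `print_downgrade_cmds` and the environment name `metaeuk_prev` are hypothetical. No version number is assumed: the version is left as an argument, to be filled in from the `conda search` output. The helper only prints the commands so they can be reviewed before running.

```bash
#!/bin/bash
# Hypothetical dry-run helper: prints (does not execute) conda commands for
# installing a specific older MetaEuk build into a separate environment.
# Assumption: the package is named "metaeuk" on the bioconda channel.

# print_downgrade_cmds VERSION [ENV_NAME]
print_downgrade_cmds() {
    local version="$1"
    local env_name="${2:-metaeuk_prev}"   # hypothetical default environment name

    if [ -z "$version" ]; then
        echo "usage: print_downgrade_cmds VERSION [ENV_NAME]" >&2
        return 1
    fi

    # Step 1: list the builds that are actually available, to choose VERSION from.
    echo "conda search -c bioconda metaeuk"
    # Step 2: create a fresh environment pinned to the chosen version.
    echo "conda create -y -n ${env_name} -c conda-forge -c bioconda metaeuk=${version}"
    # Step 3: switch to it before rerunning the job.
    echo "conda activate ${env_name}"
}

# Example usage: print_downgrade_cmds <version-from-conda-search>
```

Using a separate environment keeps the current install intact, so the two versions can be compared on the same input.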

AlexSongh commented 2 years ago

I'll try that and let you know how it goes @milot-mirdita, thank you!

AlexSongh commented 2 years ago

@milot-mirdita The previous version works! Thanks for the help!

gjordaopiedade commented 1 year ago

Hi, I see this issue is still open. I seem to be having the exact same issue (using MMETSP_uniclust50_MERC_profiles). I have tried both the conda install and the AVX2 build. When it gets to the alignment step, it runs out of memory (I have 1T available).

Compute score and coverage
Query database size: 1829 type: Profile
Target database size: 92457309 type: Aminoacid
Calculation of alignments
[=======Error: align died
Error: search step died
Error: predictexons step died

Do you think I should try an older version? Where can I find it?

Thank you!!

soedinglab / metaeuk

Calculation of alignments step died #51