soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

Error: Prefilter died #564

Closed Guillaume-Marseille closed 2 years ago

Guillaume-Marseille commented 2 years ago

Expected Behavior

A blastp-like search against TrEMBL runs to completion.

Current Behavior

The search crashes at the prefilter stage.

Steps to Reproduce (for bugs)

mmseqs search tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep/QUERY.mms /shared/projects/phycovir/FORMATED_DB/TrEMBL/TrEMBL tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep/RESULT.mms tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep

MMseqs Output (for bugs)

MMseqs Version: 113e3212c137d026e297c7540e1fcd039f6812b1
Substitution matrix nucl:nucleotide.out,aa:blosum62.out
Add backtrace false
Alignment mode 2
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Realign hits false
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a 1
Pseudo count b 1.5
Score bias 0
Gap open cost nucl:5,aa:11
Gap extension cost nucl:2,aa:1
Zdrop 40
Threads 32
Compressed 0
Verbosity 3
Seed substitution matrix nucl:nucleotide.out,aa:VTML80.out
Sensitivity 5.7
k-mer length 0
k-score 2147483647
Alphabet size nucl:5,aa:21
Max results per query 300
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask lower case residues 0
Minimum diagonal score 15
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.1
Global sequence weighting false
Allow deletions false
Filter MSA 1
Maximum seq. id. threshold 0.9
Minimum seq. id. 0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Omit consensus false
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Slice search mode false
Strand selection 1
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files false

prefilter tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep/QUERY.mms /shared/projects/phycovir/FORMATED_DB/TrEMBL/TrEMBL tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep/9777472437024274047/pref_0 --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 32 --compressed 0 -v 3 -s 5.7

Query database size: 446 type: Aminoacid
Target split mode. Searching through 12 splits
Estimated memory consumption: 91G
Target database size: 230328648 type: Aminoacid
Process prefiltering step 1 of 12

Index table k-mer threshold: 122 at k-mer size 7
Index table: counting k-mers
tmpDir/tmp_Juil.D465_1000nt.fasta.2000.pep/9777472437024274047/blastp.sh : ligne 99 : 10291 Instruction non permise $RUNNER "$MMSEQS" prefilter "$INPUT" "$TARGET" "$TMPPATH/pref$STEP" $PREFILTER_PAR -s "$SENS"
Error: Prefilter died

Context

TrEMBL was installed using the mmseqs databases command.

Your Environment

Operating system and version: Linux version 3.10.0-1160.6.1.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) )

milot-mirdita commented 2 years ago

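This was a while ago, but "Instruction non permise" (French for "Illegal instruction") indicates that the wrong MMseqs2 binary was used: most likely the binary was built for CPU instructions that this machine does not support. Please try the sse41 or sse2 binaries, or compile MMseqs2 yourself on the same machine where you run it.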
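One way to choose between the precompiled binaries is to look at the instruction sets the CPU reports. The following is a minimal sketch, assuming a Linux machine that exposes /proc/cpuinfo; the helper name pick_mmseqs_build is hypothetical and not part of MMseqs2, and the build names it prints (avx2, sse41, sse2) are only labels for the corresponding binaries.

```shell
#!/bin/sh
# Hypothetical helper (not part of MMseqs2): map a space-separated list of
# CPU flags to the most capable matching build label.
pick_mmseqs_build() {
    flags=" $1 "                  # pad so every flag is space-delimited
    case "$flags" in
        *" avx2 "*)   echo "avx2" ;;
        *" sse4_1 "*) echo "sse41" ;;
        *" sse2 "*)   echo "sse2" ;;
        *)            echo "unsupported" ;;
    esac
}

# On Linux, read the flags of the first CPU from /proc/cpuinfo.
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2 || true)
    echo "Suggested MMseqs2 build: $(pick_mmseqs_build "$cpu_flags")"
fi
```

If the search runs on a cluster, the check is only meaningful on the node that actually executes mmseqs, since login and compute nodes may have different CPUs.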