Expected Behavior

I ran the script below, which includes an MMseqs2 step, and got an "Error: Prefilter died" error. What should I do?

#!/bin/bash

## specify allocation - we want normal since we don't want to use the whole node for nothing
#SBATCH -A grp-org-sc
#SBATCH -q normal
## specify number of nodes
#SBATCH -N 2
## specify number of procs/CPUS
#SBATCH -c 8
## specify runtime
#SBATCH -t 72:00:00
## specify job name
#SBATCH -J seqdetect
##Memory per cpu
#SBATCH --mem-per-cpu=512G

export PATH=$PATH:/groups/science/homes/username/anaconda3/bin/mmseqs

[Initial part of the script for pre-processing abbreviated here]

### MMseqs2

conda activate /groups/science/homes/username/.micromamba/envs/mmseqs2

export PATH=$PATH:/groups/science/homes/username/anaconda3/bin/mmseqs
mkdir mmseqs_target_seq/
mkdir mmseqs_target_seq/${sample}
mkdir phrog_output/
cp previousstep_output/${sample}/${sample}_summary/${sample}_targetofinterest_proteins.faa mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.faa
mmseqs createdb mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.faa mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.target_seq

### MMseqs2/Phrogs
mmseqs search phrogs_mmseqs_db/phrogs_profile_db \
    mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.target_seq \
    mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins_mmseqs \
    mmseqs_target_seq/${sample}/tmp -s 7

mmseqs createtsv phrogs_mmseqs_db/phrogs_profile_db \
    mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.target_seq \
    mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins_mmseqs \
    mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins_mmseqs.tsv --full-header

cp mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins_mmseqs.tsv mmseqs_target_seq
echo "file: mmseqs_target_seq/${sample}_targetofinterest_proteins_mmseqs.tsv"
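Two things in the script above may be worth checking. It keeps going after `mmseqs search` fails, so `createtsv` and `cp` run against a result that does not exist, and it never passes `--threads`, so MMseqs2 chooses its own thread count (the log below reports Threads 64 although the header asks for `-c 8`). The following is a minimal sketch, assuming the job runs under Slurm so that `SLURM_CPUS_PER_TASK` is set. The helper name `build_search_cmd`, the variable `MMSEQS_MEM_LIMIT`, and the `16G` default are illustrative assumptions, not values recommended by the MMseqs2 authors, and it is not certain that they resolve the error.

```shell
#!/bin/bash
# Sketch only: exit at the first failing command so later steps do not
# run after a failed search (bash users may also add `-o pipefail`).
set -eu

# Hypothetical helper: prints the search command with the thread count
# taken from the Slurm allocation (falls back to 1 outside Slurm).
build_search_cmd() {
    local sample="$1"
    local threads="${SLURM_CPUS_PER_TASK:-1}"
    # Assumed default; tune it to the memory actually granted to the job.
    local mem_limit="${MMSEQS_MEM_LIMIT:-16G}"
    printf 'mmseqs search %s %s %s %s -s 7 --threads %s --split-memory-limit %s\n' \
        "phrogs_mmseqs_db/phrogs_profile_db" \
        "mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins.target_seq" \
        "mmseqs_target_seq/${sample}/${sample}_targetofinterest_proteins_mmseqs" \
        "mmseqs_target_seq/${sample}/tmp" \
        "$threads" "$mem_limit"
}

# Print the command for inspection before running it for real.
build_search_cmd "${sample:-example_sample}"
```

Printing the command first makes it easy to confirm that the thread count matches the allocation before submitting the job.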

Current Behavior

[Previous output omitted here]
Create directory mmseqs_target_seq/[bacteria_of_interest]/tmp
search phrogs_mmseqs_db/phrogs_profile_db mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins.target_seq mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins_mmseqs mmseqs_target_seq/[bacteria_of_interest]/tmp -s 7

MMseqs Version: 14.7e284
Substitution matrix aa:blosum62.out,nucl:nucleotide.out
Add backtrace false
Alignment mode 2
Alignment mode 0
Allow wrapped scoring false
E-value threshold 0.001
Seq. id. threshold 0
Min alignment length 0
Seq. id. mode 0
Alternative alignments 0
Coverage threshold 0
Coverage mode 0
Max sequence length 65535
Compositional bias 1
Compositional bias 1
Max reject 2147483647
Max accept 2147483647
Include identical seq. id. false
Preload mode 0
Pseudo count a substitution:1.100,context:1.400
Pseudo count b substitution:4.100,context:5.800
Score bias 0
Realign hits false
Realign score bias -0.2
Realign max seqs 2147483647
Correlation score weight 0
Gap open cost aa:11,nucl:5
Gap extension cost aa:1,nucl:2
Zdrop 40
Threads 64
Compressed 0
Verbosity 3
Seed substitution matrix aa:VTML80.out,nucl:nucleotide.out
Sensitivity 7
k-mer length 0
k-score seq:2147483647,prof:2147483647
Alphabet size aa:21,nucl:5
Max results per query 300
Split database 0
Split mode 2
Split memory limit 0
Diagonal scoring true
Exact k-mer matching 0
Mask residues 1
Mask residues probability 0.9
Mask lower case residues 0
Minimum diagonal score 15
Selected taxa
Spaced k-mers 1
Spaced k-mer pattern
Local temporary path
Rescore mode 0
Remove hits by seq. id. and coverage false
Sort results 0
Mask profile 1
Profile E-value threshold 0.1
Global sequence weighting false
Allow deletions false
Filter MSA 1
Use filter only at N seqs 0
Maximum seq. id. threshold 0.9
Minimum seq. id. 0.0
Minimum score per column -20
Minimum coverage 0
Select N most diverse seqs 1000
Pseudo count mode 0
Gap pseudo count 10
Min codons in orf 30
Max codons in length 32734
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 1
Forward frames 1,2,3
Reverse frames 1,2,3
Translation table 1
Translate orf 0
Use all table starts false
Offset of numeric ids 0
Create lookup 0
Add orf stop false
Overlap between sequences 0
Sequence split mode 1
Header split mode 0
Chain overlapping alignments 0
Merge query 1
Search type 0
Search iterations 1
Start sensitivity 4
Search steps 1
Exhaustive search mode false
Filter results during exhaustive search 0
Strand selection 1
LCA search mode false
Disk space limit 0
MPI runner
Force restart with latest tmp false
Remove temporary files false

prefilter phrogs_mmseqs_db/phrogs_profile_db mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins.target_seq mmseqs_target_seq/[bacteria_of_interest]/tmp/15822818178659183495/pref_0 --sub-mat 'aa:blosum62.out,nucl:nucleotide.out' --seed-sub-mat 'aa:VTML80.out,nucl:nucleotide.out' -k 0 --k-score seq:2147483647,prof:2147483647 --alph-size aa:21,nucl:5 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --comp-bias-corr-scale 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-prob 0.9 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca substitution:1.100,context:1.400 --pcb substitution:4.100,context:5.800 --threads 64 --compressed 0 -v 3 -s 7.0

Query database size: 38880 type: Profile
Estimated memory consumption: 488M
Target database size: 125 type: Aminoacid
Index table k-mer threshold: 0 at k-mer size 6
Index table: counting k-mers
[=================================================================] 125 0s 5ms
Index table: Masked residues: 124
Index table: fill
[=================================================================] 125 0s 6ms
Index statistics
Entries: 25103
DB size: 488 MB
Avg k-mer size: 0.000392
Top 10 k-mers
ALGLAA 2
TTGTAA 2
AAARKA 2
KASRKA 2
TEEALA 2
EDLLRA 2
INGNED 2
ASARED 2
GKHHRD 2
AELKAE 2
Time for index table init: 0h 0m 0s 511ms
Process prefiltering step 1 of 1

k-mer similarity threshold: 91
Starting prefiltering scores calculation (step 1 of 1)
Query db start 1 to 38880
Target db start 1 to 125
[=mmseqs_target_seq/[bacteria_of_interest]/tmp/15822818178659183495/blastp.sh: line 99: 1649148 Killed $RUNNER "$MMSEQS" prefilter "$INPUT" "$TARGET" "$TMPPATH/pref$STEP" $PREFILTER_PAR -s "$SENS"
Error: Prefilter died
createtsv phrogs_mmseqs_db/phrogs_profile_db mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins.target_seq mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins_mmseqs mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins_mmseqs.tsv --full-header

MMseqs Version: 14.7e284
First sequence as representative false
Target column 1
Add full header true
Sequence source 0
Database output false
Threads 64
Compressed 0
Verbosity 3

No datafile could be found for mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins_mmseqs!
cp: cannot stat 'mmseqs_target_seq/[bacteria_of_interest]/[bacteria_of_interest]_targetofinterest_proteins_mmseqs.tsv': No such file or directory
file: mmseqs_target_seq/[bacteria_of_interest]_targetofinterest_proteins_mmseqs.tsv
sample: [bacteria_of_interest] [bacteria_of_interest]
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=4226926.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
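The final `slurmstepd` line suggests that the prefilter was killed by the cgroup out-of-memory handler, not by a crash inside MMseqs2. A small sketch of how a job log could be scanned for that message is given here. The helper name `log_has_oom_kill` and the temporary demo log are hypothetical, and the `sacct` call in the comment assumes job accounting is enabled on the cluster.

```shell
# Hypothetical helper: succeeds if the given log mentions a cgroup OOM kill.
log_has_oom_kill() {
    grep -q 'oom-kill event' "$1"
}

# Demo on a temporary file holding the message from the log above.
demo_log="$(mktemp)"
printf '%s\n' 'slurmstepd: error: Detected 1 oom-kill event(s) in StepId=4226926.batch.' > "$demo_log"

if log_has_oom_kill "$demo_log"; then
    echo "OOM kill detected: compare requested and used memory"
    # On a cluster with accounting enabled, something like this may help:
    #   sacct -j 4226926 --format=JobID,ReqMem,MaxRSS,State
fi
rm -f "$demo_log"
```

If the reported `MaxRSS` is close to `ReqMem`, the job most likely hit its memory limit.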

Your Environment

Include as many relevant details about the environment you experienced the bug in.

Git commit used (The string after "MMseqs Version:" when you execute MMseqs without any parameters):
Which MMseqs version was used (Statically-compiled, self-compiled, Homebrew, etc.):
For self-compiled and Homebrew: Compiler and Cmake versions used and their invocation:
Server specifications (especially CPU support for AVX2/SSE and amount of system memory):
Operating system and version:
MMseq version: 13.45111
CPU: 2x AMD 7543 (64 cores total)
RAM: 512 GB
Local Disk: 7 TB SSD
Network: 100 GBit Infiniband
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 1
Core(s) per socket: 32
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD EPYC 7502P 32-Core Processor
Stepping: 0
CPU MHz: 2500.000
CPU max MHz: 2500.0000
CPU min MHz: 1500.0000
BogoMIPS: 5000.22
Virtualization: AMD-V
L1d cache: 32K
L1i cache: 32K
L2 cache: 512K
L3 cache: 16384K
NUMA node0 CPU(s): 0-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es
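The figures above can be set against the job header with simple arithmetic: `--mem-per-cpu=512G` combined with `-c 8` implies 8 x 512 GB per task, while the node is listed with 512 GB of RAM. A minimal sketch of that check follows. The helper name `requested_mem_gb` is hypothetical, and how Slurm treats such a request (rejecting, capping, or reinterpreting it) depends on site configuration that is not known here.

```shell
# Hypothetical helper: total memory in GB implied by --mem-per-cpu (in GB)
# multiplied by the number of CPUs per task (-c).
requested_mem_gb() {
    local mem_per_cpu_gb="$1"
    local cpus="$2"
    echo $(( mem_per_cpu_gb * cpus ))
}

node_ram_gb=512                      # RAM reported in the environment section
total="$(requested_mem_gb 512 8)"    # --mem-per-cpu=512G and -c 8 from the header

if [ "$total" -gt "$node_ram_gb" ]; then
    echo "request of ${total} GB per task exceeds node RAM of ${node_ram_gb} GB"
fi
```

Requesting memory per node with `--mem` instead, at a value the node can actually provide, might be an easier setting to reason about.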

soedinglab / MMseqs2

Error: Prefilter died #826