soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

Resuming crashed job #160

Open adimil opened 5 years ago

adimil commented 5 years ago

I'm running a 3-iteration search job and it crashed towards the end of the prefiltering step of the last iteration. I reran the same command to resume from where it stopped, but it appears to start again from the beginning.

Expected Behavior

Start from:

Program call:
prefilter tmpC/10139724895635470572/profile_1 genes.db tmpC/10139724895635470572/pref_2 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 48 -v 3 

Current Behavior

It creates a new folder inside the tmp folder and starts a new run.

Program call:
prefilter geneC.db genes.db tmpC/13630618462368123119/pref_0 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 56 -v 3 
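
To see which run folders already exist under the tmp directory and roughly how far each one got, something like the following could help. It is only a sketch: `list_runs` is a hypothetical helper, not part of MMseqs2, and it assumes that each prefilter step leaves a result file named `pref_<n>` in its run folder, as the paths in the program calls above suggest.

```sh
#!/bin/sh
# list_runs is a hypothetical helper, not an MMseqs2 command.
# Assumption: every run gets its own subfolder in the tmp directory and each
# prefilter step writes a result file named pref_<n> into that subfolder.
list_runs() {
    for d in "$1"/*/; do
        # Skip the unexpanded glob when the tmp directory has no subfolders.
        [ -d "$d" ] || continue
        # Count files named exactly pref_<n> (ignores pref_0.index and similar).
        n=$(ls "$d" | grep -c '^pref_[0-9][0-9]*$' || true)
        printf '%s %s\n' "$(basename "$d")" "$n"
    done
}

# Usage (prints one "<folder> <number of pref_<n> files>" line per run folder):
#   list_runs tmpC
```

Two folders in the listing, one of them with only a few `pref_<n>` files, would match the behavior described here: the second call did not pick up the folder of the first.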

Steps to Reproduce (for bugs)

#BSUB -R "rusage[mem=200000]"
#BSUB -a openmpi
#BSUB -J "GeneC"

module load MMseqs2/6-f5a1c

runner="mpirun" mmseqs search geneC.db genes.db geneC-v-all_3itr.db tmpC --no-preload --max-seqs 1000 --num-iterations 3

milot-mirdita commented 5 years ago

Are you sure you called mmseqs search with exactly the same mmseqs version and parameters (same filenames also)?

If anything changed between the two calls, MMseqs2 will not be able to reuse the previous run.
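
One way to check whether anything changed is to compare the "Program call" lines of both runs option by option. The sketch below does that with a hypothetical `diff_flags` helper (not an MMseqs2 command). It assumes options come as `--name value` or `-n value` pairs; positional arguments are ignored, and options that take no value are not handled specially. The two example strings are shortened versions of the `pref_0` prefilter calls quoted in this thread, where the `--threads` value is 48 in one and 56 in the other. Whether that difference is what prevents the reuse is an assumption and is not confirmed here.

```sh
#!/bin/sh
# diff_flags is a hypothetical helper, not an MMseqs2 command.
# It prints every option whose value differs between two program-call lines.
diff_flags() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, ta, " ")
        nb = split(b, tb, " ")
        # Record the token following each option as that option value.
        for (i = 1; i < na; i++) if (ta[i] ~ /^-/) va[ta[i]] = ta[i + 1]
        for (i = 1; i < nb; i++) if (tb[i] ~ /^-/) vb[tb[i]] = tb[i + 1]
        # Options only present in the second call.
        for (k in vb) if (!(k in va)) print k ": (absent) -> " vb[k]
        # Options whose value changed (compared as strings, not numbers).
        for (k in va) if ((va[k] "") != (vb[k] "")) print k ": " va[k] " -> " vb[k]
    }'
}

first="prefilter geneC.db genes.db pref_0 -s 5.7 --max-seqs 1000 --threads 48 -v 3"
second="prefilter geneC.db genes.db pref_0 -s 5.7 --max-seqs 1000 --threads 56 -v 3"
diff_flags "$first" "$second"   # prints: --threads: 48 -> 56
```

If the helper prints nothing, the two calls use the same option values, and the difference has to be elsewhere (for example the MMseqs2 version or the file names).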

Btw what did the prefilter say when it crashed?

adimil commented 5 years ago

Yes, it is exactly the same command. It crashed because the job exceeded its run time limit (I'm running on an LSF platform).

milot-mirdita commented 5 years ago

Can you post the output of the second run? If it's still available, the output of the first run would also be very useful for tracking down what might be going wrong.

adimil commented 5 years ago

Sure:

first run

Program call:
search geneC.db genes.db geneC-v-all_3itr.db tmpC --no-preload --max-seqs 1000 --num-iterations 3 

MMseqs Version:                                                             GITDIR-NOTFOUND
Sub Matrix                                                                  blosum62.out
Add backtrace                                                               true
Alignment mode                                                              2
E-value threshold                                                           0.001
Seq. Id Threshold                                                           0
Seq. Id. Mode                                                               0
Alternative alignments                                                      0
Coverage threshold                                                          0
Coverage Mode                                                               0
Max. sequence length                                                        65535
Max. results per query                                                      1000
Compositional bias                                                          1
Realign hit                                                                 false
Max Reject                                                                  2147483647
Max Accept                                                                  2147483647
Include identical Seq. Id.                                                  false
No preload                                                                  true
Pseudo count a                                                              1
Pseudo count b                                                              1.5
Score bias                                                                  0
Gap open cost                                                               11
Gap extension cost                                                          1
Threads                                                                     48
Verbosity                                                                   3
Sensitivity                                                                 5.7
K-mer size                                                                  0
K-score                                                                     2147483647
Alphabet size                                                               21
Offset result                                                               0
Split DB                                                                    0
Split mode                                                                  2
Split Memory Limit                                                          0
Diagonal Scoring                                                            1
Exact k-mer matching                                                        0
Mask Residues                                                               1
Minimum Diagonal score                                                      15
Spaced Kmer                                                                 1
Spaced k-mer pattern                                                        
Rescore mode                                                                0
Remove hits by seq.id. and coverage                                         false
Sort results                                                                0
In substitution scoring mode, performs global alignment along the diagonal  false
Mask profile                                                                1
Profile e-value threshold                                                   0.1
Use global sequence weighting                                               false
Filter MSA                                                                  1
Maximum sequence identity threshold                                         0.9
Minimum seq. id.                                                            0
Minimum score per column                                                    -20
Minimum coverage                                                            0
Select n most diverse seqs                                                  1000
Omit Consensus                                                              false
Min codons in orf                                                           30
Max codons in length                                                        32734
Max orf gaps                                                                2147483647
Contig start mode                                                           2
Contig end mode                                                             2
Orf start mode                                                              0
Forward Frames                                                              1,2,3
Reverse Frames                                                              1,2,3
Translation Table                                                           1
Use all table starts                                                        false
Offset of numeric ids                                                       0
Add Orf Stop                                                                false
Number search iterations                                                    3
Start sensitivity                                                           4
Search steps                                                                1
Run a seq-profile search in slice mode                                      0
Sets the MPI runner                                                         
Remove Temporary Files                                                      false

Tmp tmpC folder does not exist or is not a directory.
Created dir tmpC
Program call:
prefilter geneC.db genes.db tmpC/10139724895635470572/pref_0 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 48 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Sensitivity                 5.7
K-mer size                  0
K-score                     2147483647
Alphabet size               21
Max. sequence length        65535
Max. results per query      1000
Offset result               0
Split DB                    0
Split mode                  2
Split Memory Limit          0
Coverage threshold          0
Coverage Mode               0
Compositional bias          1
Diagonal Scoring            1
Exact k-mer matching        0
Mask Residues               1
Minimum Diagonal score      15
Include identical Seq. Id.  false
Spaced Kmer                 1
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Spaced k-mer pattern        
Threads                     48
Verbosity                   3

Initialising data structures...
Using 48 threads.
Could not find precomputed index. Compute index.
Substitution matrices...
Use kmer size 7 and split 3 using Target split mode.
Needed memory (213955223732 byte) of total memory (243154317312 byte)
Target database: genes.db(Size: 135880714)
Query database type: Aminoacid
Target database type: Aminoacid
Time for init: 0h 0m 22s 525ms
Query database: geneC.db(size=1)
Process prefiltering step 1 of 3

Index table k-mer threshold: 99
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: Masked residues: 178364514
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13212108470
DB Size:         89512650820 (byte)
Avg Kmer Size:   10.322
Top 10 Kmers
    SGQQRIA     185585
    GPGGKLL     145938
    GGQRVAR     97591
    YTGTGKG     82504
    LSGQQAI     66273
    GRFVVEV     62617
    PHLGGQR     52589
    RAEGRAV     52331
    ALGSGKS     51616
    LLGPGKT     41610
Min Kmer Size:   0
Empty list: 514072627

Time for index table init: 2h 42m 34s 457ms
k-mer similarity threshold: 99
k-mer match probability: 0

Starting prefiltering scores calculation (step 1 of 3)
Query db start  1 to 1
Target db start  1 to 44537750

4188 k-mers per position.
16623976 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 385ms
Time for merging files: 0h 0m 0s 51ms
Sorting the results...  tmpC/10139724895635470572/pref_0_tmp_0_tmp .. Done
Time for merging files: 0h 0m 0s 11ms
Process prefiltering step 2 of 3

Index table k-mer threshold: 99
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: Masked residues: 139540524
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13252671170
DB Size:         89756027020 (byte)
Avg Kmer Size:   10.3536
Top 10 Kmers
    SGQQRIA     205594
    GPGGKLL     164986
    GGQRVAR     107667
    LNAEAAG     81286
    GKTLRAG     78341
    GRFVVEV     76675
    LSGQQAI     71301
    RGAVAVR     70548
    RAEGRAV     65028
    ALGSGKS     57558
Min Kmer Size:   0
Empty list: 646421803

Time for index table init: 2h 59m 43s 648ms
k-mer similarity threshold: 99
k-mer match probability: 0

Starting prefiltering scores calculation (step 2 of 3)
Query db start  1 to 1
Target db start  44537751 to 89725981

4188 k-mers per position.
16627606 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 160ms
Time for merging files: 0h 0m 0s 731ms
Sorting the results...  tmpC/10139724895635470572/pref_0_tmp_1_tmp .. Done
Time for merging files: 0h 0m 0s 8ms
Process prefiltering step 3 of 3

Index table k-mer threshold: 99
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
................................................................................................... 46 Mio. sequences processed
...............
Index table: Masked residues: 131143401
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
................................................................................................... 46 Mio. sequences processed
...............
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13249970660
DB Size:         89739823960 (byte)
Avg Kmer Size:   10.3515
Top 10 Kmers
    SGQQRIA     191356
    GPGGKLL     159663
    GGQRVAR     102329
    GKTLRAG     75720
    LSGQQAI     67148
    GRFVVEV     58653
    ALGSGKS     52357
    RAEGRAV     49975
    EPSLDLR     44445
    GLGNGKS     44006
Min Kmer Size:   0
Empty list: 595147531

Time for index table init: 3h 5m 31s 6ms
k-mer similarity threshold: 99
k-mer match probability: 0

Starting prefiltering scores calculation (step 3 of 3)
Query db start  1 to 1
Target db start  89725982 to 135880714

4188 k-mers per position.
16598492 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 147ms
Time for merging files: 0h 0m 0s 359ms
Sorting the results...  tmpC/10139724895635470572/pref_0_tmp_2_tmp .. Done
Time for merging files: 0h 0m 0s 13ms
Merge file tmpC/10139724895635470572/pref_0_tmp_0 and tmpC/10139724895635470572/pref_0.index_tmp_0
Time for merging files: 0h 0m 0s 8ms
tmpC/10139724895635470572/pref_0_merged tmpC/10139724895635470572/pref_0.index_merged
Time for merging files: 0h 0m 0s 295ms

Time for merging results: 0h 0m 0s 908ms
Time for processing: 8h 48m 29s 581ms
Program call:
align geneC.db genes.db tmpC/10139724895635470572/pref_0 tmpC/10139724895635470572/aln_0 --sub-mat blosum62.out -a 1 --alignment-mode 2 -e 0.1 --min-seq-id 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --max-seqs 1000 --comp-bias-corr 1 --realign 1 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --no-preload 1 --pca 1 --pcb 1.5 --score-bias 0 --gap-open 11 --gap-extend 1 --threads 48 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Add backtrace               true
Alignment mode              2
E-value threshold           0.1
Seq. Id Threshold           0
Seq. Id. Mode               0
Alternative alignments      0
Coverage threshold          0
Coverage Mode               0
Max. sequence length        65535
Max. results per query      1000
Compositional bias          1
Realign hit                 true
Max Reject                  2147483647
Max Accept                  2147483647
Include identical Seq. Id.  false
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Score bias                  0
Gap open cost               11
Gap extension cost          1
Threads                     48
Verbosity                   3

Init data structures...
Compute score only.
Using 1 threads.
Query database type: Aminoacid
Target database type: Aminoacid
Calculation of Smith-Waterman alignments.
Time for merging files: 0h 0m 0s 14ms

All sequences processed.

1218 alignments calculated.
266 sequence pairs passed the thresholds (0.218391 of overall calculated).
266 hits per query sequence.
Time for processing: 0h 0m 15s 821ms
Program call:
result2profile geneC.db genes.db tmpC/10139724895635470572/aln_0 tmpC/10139724895635470572/profile_0 --sub-mat blosum62.out --mask-profile 1 --e-profile 0.1 --comp-bias-corr 1 --wg 0 --filter-msa 1 --max-seq-id 0.9 --qid 0 --qsc -20 --cov 0 --diff 1000 --pca 0 --pcb 1.5 --omit-consensus 0 --no-preload 1 --gap-open 11 --gap-extend 1 --threads 48 -v 3 

MMseqs Version:                     GITDIR-NOTFOUND
Sub Matrix                          blosum62.out
Mask profile                        1
Profile e-value threshold           0.1
Compositional bias                  1
Use global sequence weighting       false
Filter MSA                          1
Maximum sequence identity threshold 0.9
Minimum seq. id.                    0
Minimum score per column            -20
Minimum coverage                    0
Select n most diverse seqs          1000
Pseudo count a                      0
Pseudo count b                      1.5
Omit Consensus                      false
No preload                          true
Gap open cost                       11
Gap extension cost                  1
Threads                             48
Verbosity                           3

Start computing profiles.
Query database type: Aminoacid
Target database type: Aminoacid
Time for merging files: 0h 0m 0s 8ms
Time for merging files: 0h 0m 0s 7ms

Done.
Time for processing: 0h 0m 15s 770ms
Program call:
prefilter tmpC/10139724895635470572/profile_0 genes.db tmpC/10139724895635470572/pref_1 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 48 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Sensitivity                 5.7
K-mer size                  0
K-score                     2147483647
Alphabet size               21
Max. sequence length        65535
Max. results per query      1000
Offset result               0
Split DB                    0
Split mode                  2
Split Memory Limit          0
Coverage threshold          0
Coverage Mode               0
Compositional bias          1
Diagonal Scoring            1
Exact k-mer matching        0
Mask Residues               1
Minimum Diagonal score      15
Include identical Seq. Id.  false
Spaced Kmer                 1
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Spaced k-mer pattern        
Threads                     48
Verbosity                   3

Initialising data structures...
Using 48 threads.
Could not find precomputed index. Compute index.
Substitution matrices...
Use kmer size 7 and split 3 using Target split mode.
Needed memory (213441943732 byte) of total memory (243154317312 byte)
Target database: genes.db(Size: 135880714)
Query database type: Profile
Target database type: Aminoacid
Time for init: 0h 0m 15s 203ms
Query database: tmpC/10139724895635470572/profile_0(size=1)
Process prefiltering step 1 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: Masked residues: 178364514
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13212108470
DB Size:         89512650820 (byte)
Avg Kmer Size:   10.322
Top 10 Kmers
    SGQQRIA     185585
    GPGGKLL     145938
    GGQRVAR     97591
    YTGTGKG     82504
    LSGQQAI     66273
    GRFVVEV     62617
    PHLGGQR     52589
    RAEGRAV     52331
    ALGSGKS     51616
    LLGPGKT     41610
Min Kmer Size:   0
Empty list: 514072627

Time for index table init: 2h 31m 51s 474ms
k-mer similarity threshold: 116
k-mer match probability: 0

Starting prefiltering scores calculation (step 1 of 3)
Query db start  1 to 1
Target db start  1 to 44537750

4991 k-mers per position.
17840740 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 493ms
Time for merging files: 0h 0m 0s 33ms
Sorting the results...  tmpC/10139724895635470572/pref_1_tmp_0_tmp .. Done
Time for merging files: 0h 0m 0s 9ms
Process prefiltering step 2 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: Masked residues: 139540524
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13252671170
DB Size:         89756027020 (byte)
Avg Kmer Size:   10.3536
Top 10 Kmers
    SGQQRIA     205594
    GPGGKLL     164986
    GGQRVAR     107667
    LNAEAAG     81286
    GKTLRAG     78341
    GRFVVEV     76675
    LSGQQAI     71301
    RGAVAVR     70548
    RAEGRAV     65028
    ALGSGKS     57558
Min Kmer Size:   0
Empty list: 646421803

Time for index table init: 2h 34m 14s 845ms
k-mer similarity threshold: 116
k-mer match probability: 0

Starting prefiltering scores calculation (step 2 of 3)
Query db start  1 to 1
Target db start  44537751 to 89725981

4991 k-mers per position.
17800547 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 608ms
Time for merging files: 0h 0m 0s 28ms
Sorting the results...  tmpC/10139724895635470572/pref_1_tmp_1_tmp .. Done
Time for merging files: 0h 0m 0s 24ms
Process prefiltering step 3 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
................................................................................................... 46 Mio. sequences processed
...............
Index table: Masked residues: 131143401
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
................................................................................................... 46 Mio. sequences processed
...............
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13249970660
DB Size:         89739823960 (byte)
Avg Kmer Size:   10.3515
Top 10 Kmers
    SGQQRIA     191356
    GPGGKLL     159663
    GGQRVAR     102329
    GKTLRAG     75720
    LSGQQAI     67148
    GRFVVEV     58653
    ALGSGKS     52357
    RAEGRAV     49975
    EPSLDLR     44445
    GLGNGKS     44006
Min Kmer Size:   0
Empty list: 595147531

Time for index table init: 2h 31m 35s 140ms
k-mer similarity threshold: 116
k-mer match probability: 0

Starting prefiltering scores calculation (step 3 of 3)
Query db start  1 to 1
Target db start  89725982 to 135880714

4991 k-mers per position.
17774316 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 268ms
Time for merging files: 0h 0m 0s 19ms
Sorting the results...  tmpC/10139724895635470572/pref_1_tmp_2_tmp .. Done
Time for merging files: 0h 0m 0s 10ms
Merge file tmpC/10139724895635470572/pref_1_tmp_0 and tmpC/10139724895635470572/pref_1.index_tmp_0
Time for merging files: 0h 0m 0s 8ms
tmpC/10139724895635470572/pref_1_merged tmpC/10139724895635470572/pref_1.index_merged
Time for merging files: 0h 0m 0s 237ms

Time for merging results: 0h 0m 1s 496ms
Time for processing: 7h 38m 5s 42ms
Program call:
subtractdbs tmpC/10139724895635470572/pref_1 tmpC/10139724895635470572/aln_0 tmpC/10139724895635470572/pref_next_1 --threads 48 --e-profile 0.1 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Threads                     48
Profile e-value threshold   0.1
Verbosity                   3

Remove tmpC/10139724895635470572/aln_0 ids from tmpC/10139724895635470572/pref_1
Output databse: tmpC/10139724895635470572/pref_next_1
Time for merging files: 0h 0m 0s 250ms
Time for processing: 0h 0m 0s 861ms
Program call:
align tmpC/10139724895635470572/profile_0 genes.db tmpC/10139724895635470572/pref_1 tmpC/10139724895635470572/aln_1 --sub-mat blosum62.out -a 1 --alignment-mode 2 -e 0.1 --min-seq-id 0 --seq-id-mode 0 --alt-ali 0 -c 0 --cov-mode 0 --max-seq-len 65535 --max-seqs 1000 --comp-bias-corr 1 --realign 0 --max-rejected 2147483647 --max-accept 2147483647 --add-self-matches 0 --no-preload 1 --pca 1 --pcb 1.5 --score-bias 0 --gap-open 11 --gap-extend 1 --threads 48 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Add backtrace               true
Alignment mode              2
E-value threshold           0.1
Seq. Id Threshold           0
Seq. Id. Mode               0
Alternative alignments      0
Coverage threshold          0
Coverage Mode               0
Max. sequence length        65535
Max. results per query      1000
Compositional bias          1
Realign hit                 false
Max Reject                  2147483647
Max Accept                  2147483647
Include identical Seq. Id.  false
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Score bias                  0
Gap open cost               11
Gap extension cost          1
Threads                     48
Verbosity                   3

Init data structures...
Compute score, coverage and sequence id.
Using 1 threads.
Query database type: Profile
Target database type: Aminoacid
Calculation of Smith-Waterman alignments.
Time for merging files: 0h 0m 0s 8ms

All sequences processed.

952 alignments calculated.
258 sequence pairs passed the thresholds (0.271008 of overall calculated).
258 hits per query sequence.
Time for processing: 0h 0m 15s 798ms
Program call:
mergedbs tmpC/10139724895635470572/profile_0 tmpC/10139724895635470572/aln_new tmpC/10139724895635470572/aln_0 tmpC/10139724895635470572/aln_1 

MMseqs Version: GITDIR-NOTFOUND
Merge prefixes  
Verbosity       3

Merging the results to tmpC/10139724895635470572/aln_new
Done
Time for merging files: 0h 0m 0s 7ms
Time for processing: 0h 0m 0s 23ms
Program call:
result2profile tmpC/10139724895635470572/profile_0 genes.db tmpC/10139724895635470572/aln_0 tmpC/10139724895635470572/profile_1 --sub-mat blosum62.out --mask-profile 1 --e-profile 0.1 --comp-bias-corr 1 --wg 0 --filter-msa 1 --max-seq-id 0.9 --qid 0 --qsc -20 --cov 0 --diff 1000 --pca 0 --pcb 1.5 --omit-consensus 0 --no-preload 1 --gap-open 11 --gap-extend 1 --threads 48 -v 3 

MMseqs Version:                     GITDIR-NOTFOUND
Sub Matrix                          blosum62.out
Mask profile                        1
Profile e-value threshold           0.1
Compositional bias                  1
Use global sequence weighting       false
Filter MSA                          1
Maximum sequence identity threshold 0.9
Minimum seq. id.                    0
Minimum score per column            -20
Minimum coverage                    0
Select n most diverse seqs          1000
Pseudo count a                      0
Pseudo count b                      1.5
Omit Consensus                      false
No preload                          true
Gap open cost                       11
Gap extension cost                  1
Threads                             48
Verbosity                           3

Start computing profiles.
Query database type: Profile
Target database type: Aminoacid
Time for merging files: 0h 0m 0s 112ms
Time for merging files: 0h 0m 0s 7ms

Done.
Time for processing: 0h 0m 16s 74ms
Program call:
prefilter tmpC/10139724895635470572/profile_1 genes.db tmpC/10139724895635470572/pref_2 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 48 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Sensitivity                 5.7
K-mer size                  0
K-score                     2147483647
Alphabet size               21
Max. sequence length        65535
Max. results per query      1000
Offset result               0
Split DB                    0
Split mode                  2
Split Memory Limit          0
Coverage threshold          0
Coverage Mode               0
Compositional bias          1
Diagonal Scoring            1
Exact k-mer matching        0
Mask Residues               1
Minimum Diagonal score      15
Include identical Seq. Id.  false
Spaced Kmer                 1
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Spaced k-mer pattern        
Threads                     48
Verbosity                   3

Initialising data structures...
Using 48 threads.
Could not find precomputed index. Compute index.
Substitution matrices...
Use kmer size 7 and split 3 using Target split mode.
Needed memory (213441943732 byte) of total memory (243154317312 byte)
Target database: genes.db(Size: 135880714)
Query database type: Profile
Target database type: Aminoacid
Time for init: 0h 0m 15s 188ms
Query database: tmpC/10139724895635470572/profile_1(size=1)
Process prefiltering step 1 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: Masked residues: 178364514
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
.....................................................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13212108470
DB Size:         89512650820 (byte)
Avg Kmer Size:   10.322
Top 10 Kmers
    SGQQRIA     185585
    GPGGKLL     145938
    GGQRVAR     97591
    YTGTGKG     82504
    LSGQQAI     66273
    GRFVVEV     62617
    PHLGGQR     52589
    RAEGRAV     52331
    ALGSGKS     51616
    LLGPGKT     41610
Min Kmer Size:   0
Empty list: 514072627

Time for index table init: 2h 32m 52s 237ms
k-mer similarity threshold: 116
k-mer match probability: 0

Starting prefiltering scores calculation (step 1 of 3)
Query db start  1 to 1
Target db start  1 to 44537750

4286 k-mers per position.
16240974 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 225ms
Time for merging files: 0h 0m 0s 13ms
Sorting the results...  tmpC/10139724895635470572/pref_2_tmp_0_tmp .. Done
Time for merging files: 0h 0m 0s 7ms
Process prefiltering step 2 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: Masked residues: 139540524
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
..................
Index table: removing duplicate entries...
Index table init done.

DB statistic
Entries:         13252671170
DB Size:         89756027020 (byte)
Avg Kmer Size:   10.3536
Top 10 Kmers
    SGQQRIA     205594
    GPGGKLL     164986
    GGQRVAR     107667
    LNAEAAG     81286
    GKTLRAG     78341
    GRFVVEV     76675
    LSGQQAI     71301
    RGAVAVR     70548
    RAEGRAV     65028
    ALGSGKS     57558
Min Kmer Size:   0
Empty list: 646421803

Time for index table init: 3h 5m 17s 508ms
k-mer similarity threshold: 116
k-mer match probability: 0

Starting prefiltering scores calculation (step 2 of 3)
Query db start  1 to 1
Target db start  44537751 to 89725981

4286 k-mers per position.
16023100 DB matches per sequence.
0 Overflows.
406 sequences passed prefiltering per query sequence.
Median result list size: 406
0 sequences with 0 size result lists.

Time for prefiltering scores calculation: 0h 0m 1s 154ms
Time for merging files: 0h 0m 0s 360ms
Sorting the results...  tmpC/10139724895635470572/pref_2_tmp_1_tmp .. Done
Time for merging files: 0h 0m 0s 9ms
Process prefiltering step 3 of 3

Index table k-mer threshold: 0
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
................................................................................................... 23 Mio. sequences processed
................................................................................................... 24 Mio. sequences processed
................................................................................................... 25 Mio. sequences processed
................................................................................................... 26 Mio. sequences processed
................................................................................................... 27 Mio. sequences processed
................................................................................................... 28 Mio. sequences processed
................................................................................................... 29 Mio. sequences processed
................................................................................................... 30 Mio. sequences processed
................................................................................................... 31 Mio. sequences processed
................................................................................................... 32 Mio. sequences processed
................................................................................................... 33 Mio. sequences processed
................................................................................................... 34 Mio. sequences processed
................................................................................................... 35 Mio. sequences processed
................................................................................................... 36 Mio. sequences processed
................................................................................................... 37 Mio. sequences processed
................................................................................................... 38 Mio. sequences processed
................................................................................................... 39 Mio. sequences processed
................................................................................................... 40 Mio. sequences processed
................................................................................................... 41 Mio. sequences processed
................................................................................................... 42 Mio. sequences processed
................................................................................................... 43 Mio. sequences processed
................................................................................................... 44 Mio. sequences processed
................................................................................................... 45 Mio. sequences processed
................................................................................................... 46 Mio. sequences processed
...............
Index table: Masked residues: 131143401
Index table: fill...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
................................................................................................... 17 Mio. sequences processed
................................................................................................... 18 Mio. sequences processed
................................................................................................... 19 Mio. sequences processed
................................................................................................... 20 Mio. sequences processed
................................................................................................... 21 Mio. sequences processed
................................................................................................... 22 Mio. sequences processed
.....................User defined signal 2

Second run:

call:
search geneC.db genes.db geneC-v-all_3itr.db tmpC --no-preload --max-seqs 1000 --num-iterations 3 

MMseqs Version:                                                             GITDIR-NOTFOUND
Sub Matrix                                                                  blosum62.out
Add backtrace                                                               true
Alignment mode                                                              2
E-value threshold                                                           0.001
Seq. Id Threshold                                                           0
Seq. Id. Mode                                                               0
Alternative alignments                                                      0
Coverage threshold                                                          0
Coverage Mode                                                               0
Max. sequence length                                                        65535
Max. results per query                                                      1000
Compositional bias                                                          1
Realign hit                                                                 false
Max Reject                                                                  2147483647
Max Accept                                                                  2147483647
Include identical Seq. Id.                                                  false
No preload                                                                  true
Pseudo count a                                                              1
Pseudo count b                                                              1.5
Score bias                                                                  0
Gap open cost                                                               11
Gap extension cost                                                          1
Threads                                                                     56
Verbosity                                                                   3
Sensitivity                                                                 5.7
K-mer size                                                                  0
K-score                                                                     2147483647
Alphabet size                                                               21
Offset result                                                               0
Split DB                                                                    0
Split mode                                                                  2
Split Memory Limit                                                          0
Diagonal Scoring                                                            1
Exact k-mer matching                                                        0
Mask Residues                                                               1
Minimum Diagonal score                                                      15
Spaced Kmer                                                                 1
Spaced k-mer pattern                                                        
Rescore mode                                                                0
Remove hits by seq.id. and coverage                                         false
Sort results                                                                0
In substitution scoring mode, performs global alignment along the diagonal  false
Mask profile                                                                1
Profile e-value threshold                                                   0.1
Use global sequence weighting                                               false
Filter MSA                                                                  1
Maximum sequence identity threshold                                         0.9
Minimum seq. id.                                                            0
Minimum score per column                                                    -20
Minimum coverage                                                            0
Select n most diverse seqs                                                  1000
Omit Consensus                                                              false
Min codons in orf                                                           30
Max codons in length                                                        32734
Max orf gaps                                                                2147483647
Contig start mode                                                           2
Contig end mode                                                             2
Orf start mode                                                              0
Forward Frames                                                              1,2,3
Reverse Frames                                                              1,2,3
Translation Table                                                           1
Use all table starts                                                        false
Offset of numeric ids                                                       0
Add Orf Stop                                                                false
Number search iterations                                                    3
Start sensitivity                                                           4
Search steps                                                                1
Run a seq-profile search in slice mode                                      0
Sets the MPI runner                                                         
Remove Temporary Files                                                      false

Program call:
prefilter geneC.db genes.db tmpC/13630618462368123119/pref_0 --sub-mat blosum62.out -s 5.7 -k 0 --k-score 2147483647 --alph-size 21 --max-seq-len 65535 --max-seqs 1000 --offset-result 0 --split 0 --split-mode 2 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --no-preload 1 --pca 1 --pcb 1.5 --threads 56 -v 3 

MMseqs Version:             GITDIR-NOTFOUND
Sub Matrix                  blosum62.out
Sensitivity                 5.7
K-mer size                  0
K-score                     2147483647
Alphabet size               21
Max. sequence length        65535
Max. results per query      1000
Offset result               0
Split DB                    0
Split mode                  2
Split Memory Limit          0
Coverage threshold          0
Coverage Mode               0
Compositional bias          1
Diagonal Scoring            1
Exact k-mer matching        0
Mask Residues               1
Minimum Diagonal score      15
Include identical Seq. Id.  false
Spaced Kmer                 1
No preload                  true
Pseudo count a              1
Pseudo count b              1.5
Spaced k-mer pattern        
Threads                     56
Verbosity                   3

Initialising data structures...
Using 56 threads.
Could not find precomputed index. Compute index.
Substitution matrices...
Use kmer size 7 and split 3 using Target split mode.
Needed memory (230985702428 byte) of total memory (364787254886 byte)
Target database: genes.db(Size: 135880714)
Query database type: Aminoacid
Target database type: Aminoacid
Time for init: 0h 0m 21s 978ms
Query database: geneC.db(size=1)
Process prefiltering step 1 of 3

Index table k-mer threshold: 99
Index table: counting k-mers...
................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
.........

milot-mirdita commented 5 years ago

So I figured out the puzzle. All parameters have to match, even ones that MMseqs2 resolves automatically, such as the thread count. The first run used 48 threads, the second 56. If you resubmit the job so that it runs on the 48-thread machine again, it should correctly continue from the point where it stopped.

@martin-steinegger: should we make a list of parameters to always exclude from hashing (threads, verbosity, ...)?
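
To make the idea concrete, here is a minimal sketch of hashing a parameter set while skipping an exclusion list. It is not MMseqs2's actual implementation: the function `run_hash`, the `NO_HASH` set, the parameter names, and the use of SHA-256 are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical exclusion list: parameters assumed not to change the results.
NO_HASH = {"threads", "verbosity"}


def run_hash(params, exclude=NO_HASH):
    """Return a short, stable digest over the parameters that are not excluded."""
    # Sort by key so the digest does not depend on insertion order.
    relevant = sorted((k, str(v)) for k, v in params.items() if k not in exclude)
    payload = "\n".join(f"{k}={v}" for k, v in relevant)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


# Illustrative parameter sets mirroring the two runs in this issue.
first = {"sensitivity": 5.7, "max_seqs": 1000, "num_iterations": 3,
         "threads": 48, "verbosity": 3}
second = dict(first, threads=56)

# With the exclusion list both runs map to the same digest;
# hashing everything makes them differ.
same_with_exclusion = run_hash(first) == run_hash(second)
same_without_exclusion = run_hash(first, exclude=set()) == run_hash(second, exclude=set())
```

With such a scheme, a resubmitted job landing on a machine with a different core count would still resolve to the same temporary subfolder, while a change to a result-relevant parameter would still trigger a fresh run.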

martin-steinegger commented 5 years ago

Maybe we should add a "do not hash" flag to the parameters to avoid this.

milot-mirdita commented 5 years ago

That’s what I was thinking about, but what can go wrong? If the first run uses 56 threads and the second run 48, the split files from .47 to .55 would not be deleted. Is there anything else that can go bad?
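
One way to guard against leftover split files would be to look for them before resuming. The sketch below is only an illustration under stated assumptions: it assumes split files are named `<prefix>.<index>` with a zero-based index and one file per thread, which may not match what MMseqs2 actually writes. The helper name `stale_splits` is hypothetical.

```python
import re
from pathlib import Path


def stale_splits(tmp_dir, prefix, n_threads):
    """List files named <prefix>.<index> whose index is >= n_threads.

    Assumes zero-based, per-thread split files; this naming is an assumption.
    """
    pattern = re.compile(re.escape(prefix) + r"\.(\d+)$")
    found = []
    for path in Path(tmp_dir).iterdir():
        match = pattern.match(path.name)
        # Non-numeric suffixes (for example ".index") do not match and are skipped.
        if match and int(match.group(1)) >= n_threads:
            found.append((int(match.group(1)), path))
    # Sort numerically so that .9 comes before .10.
    return [path for _, path in sorted(found)]
```

A resuming run could call `stale_splits(tmp_dir, "pref_2", 48)` and remove or ignore whatever it returns before merging results.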