soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
MIT License

How to speed up large query db #422

Open LuukvDamme opened 3 years ago

LuukvDamme commented 3 years ago

Hello,

First of all, thank you for making such an amazing program. I was wondering if you could provide some advice on how to handle a very large query database. I have several terabytes of data that I would like to search against nr. I am currently using the easy-taxonomy workflow and have loaded about 1/15th of my data as a proof of concept. However, as you will see in the log below, this will take quite some time. My main questions are: is this expected behaviour, and how can I speed it up?

Current Behavior

LSBATCH: User input mmseqs easy-taxonomy ./sample.fastq.gz ./nr ./result ./tmp -s 0.5

MMseqs Version:                         13.45111
ORF filter                              0
ORF filter e-value                      100
ORF filter sensitivity                  2
LCA mode                                3
Majority threshold                      0.5
Vote mode                               1
LCA ranks
Column with taxonomic lineage           0
Compressed                              0
Threads                                 26
Verbosity                               3
Taxon blacklist                         12908:unclassified sequences,28384:other sequences
Substitution matrix                     nucl:nucleotide.out,aa:blosum62.out
Add backtrace                           false
Alignment mode                          0
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          1
Pseudo count b                          1.5
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Gap open cost                           nucl:5,aa:11
Gap extension cost                      nucl:2,aa:1
Zdrop                                   40
Seed substitution matrix                nucl:nucleotide.out,aa:VTML80.out
Sensitivity                             0.5
k-mer length                            0
k-score                                 2147483647
Alphabet size                           nucl:5,aa:21
Max results per query                   300
Split database                          0
Split mode                              0
Split memory limit                      0
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask lower case residues                0
Minimum diagonal score                  15
Spaced k-mers                           1
Spaced k-mer pattern
Local temporary path
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Mask profile                            1
Profile E-value threshold               0.001
Global sequence weighting               false
Allow deletions                         false
Filter MSA                              1
Maximum seq. id. threshold              0.9
Minimum seq. id.                        0
Minimum score per column                -20
Minimum coverage                        0
Select N most diverse seqs              1000
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Add orf stop                            false
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Chain overlapping alignments            0
Merge query                             1
Search type                             0
Search iterations                       1
Start sensitivity                       4
Search steps                            1
Exhaustive search mode                  false
Filter results during exhaustive search 0
Strand selection                        1
LCA search mode                         false
Disk space limit                        0
MPI runner
Force restart with latest tmp           false
Remove temporary files                  true
Report mode                             0
Alignment format                        0
Format alignment output                 query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output                         false
First sequence as representative        false
Target column                           1
Add full header                         false
Sequence source                         0
Database type                           0
Shuffle input database                  true
Createdb mode                           1
Write lookup file                       0

(I skipped some parts of the log that took very little time.)

Query database size: 695256546 type: Aminoacid
Target split mode. Searching through 6 splits
Estimated memory consumption: 232G
Target database size: 353572663 type: Aminoacid
Process prefiltering step 1 of 6

Index table k-mer threshold: 180 at k-mer size 7
Index table: counting k-mers
[=================================================================] 58.92M 1h 27m 43s 365ms
Index table: Masked residues: 338212106
Index table: fill
[=================================================================] 58.92M 2h 48m 44s 23ms
Index statistics
Entries:          10047647313
DB size:          67258 MB
Avg k-mer size:   7.849724
Top 10 k-mers
    FSHAGSI 169128
    AFRNNFW 161115
    APMFPNN 145858
    GGGWLLM 137963
    NNSWLPS 137460
    AHFMIMV 126820
    MPMGGNW 126274
    TMLDRNT 108816
    TGTYPSS 94201
    GDQYNVT 84229
Time for index table init: 4h 18m 41s 415ms
k-mer similarity threshold: 180
Starting prefiltering scores calculation (step 1 of 6)
Query db start 1 to 695256546
Target db start 1 to 58919300
[=================================================================] 695.26M 61h 14m 42s 623ms

2.307739 k-mers per position
1254 DB matches per sequence
0 overflows
0 queries produce too many hits (truncated result)
11 sequences passed prefiltering per query sequence
1 median result list length
275899073 sequences with 0 size result lists
Time for merging to pref_0_tmp_0: 0h 16m 3s 814ms
Time for merging to pref_0_tmp_0_tmp: 0h 26m 19s 322ms
Process prefiltering step 2 of 6

Index table k-mer threshold: 180 at k-mer size 7
Index table: counting k-mers
[=================================================================] 58.92M 1h 18m 46s 598ms
Index table: Masked residues: 338371908
Index table: fill
[===========================Terminated
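To put the log above in perspective, here is a rough back-of-the-envelope sketch that adds up the step 1 timings and extrapolates to all 6 target splits. It assumes (the log does not confirm this) that every split takes about as long as the first one, and it covers prefiltering only, not the alignment or taxonomy stages that follow. The helper `parse_duration` and the variable names are made up for illustration and are not part of MMseqs2.

```python
import re

# Seconds per unit used in the "61h 14m 42s 623ms" style durations from the log.
UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(text):
    """Convert a duration string such as '61h 14m 42s 623ms' to seconds."""
    total = 0.0
    # 'ms' is listed first so that '623ms' is not read as minutes.
    for value, unit in re.findall(r"(\d+)\s*(ms|h|m|s)\b", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


# Timings copied from prefiltering step 1 of 6 in the log.
STEP1_TIMINGS = {
    "index table init": "4h 18m 41s 415ms",
    "prefiltering scores": "61h 14m 42s 623ms",
    "merge pref_0_tmp_0": "0h 16m 3s 814ms",
    "merge pref_0_tmp_0_tmp": "0h 26m 19s 322ms",
}
N_SPLITS = 6  # "Searching through 6 splits"

# Assumption: each split costs about the same as step 1.
per_split_s = sum(parse_duration(t) for t in STEP1_TIMINGS.values())
total_s = per_split_s * N_SPLITS

per_split_hours = per_split_s / 3600
total_days = total_s / 86400

# Share of queries that found no prefilter hit in split 1.
empty_fraction = 275899073 / 695256546

print(f"per split: {per_split_hours:.1f} h")
print(f"all {N_SPLITS} splits: {total_days:.1f} days")
print(f"queries without prefilter hits in split 1: {empty_fraction:.1%}")
```

Under these assumptions, prefiltering comes to roughly 66 hours per split, or about 16 to 17 days for all six splits, for a sample that is only about 1/15th of the full data. Roughly 40% of the queries produced no prefilter hit against the first split.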

milot-mirdita commented 3 years ago

Could you please include the parts that you cut too? They are important to understand what exactly is going on.

LuukvDamme commented 3 years ago

Certainly. I just had to edit some paths because some of the data is private.

easy-taxonomy /sample.fastq.gz /nr /result /tmp -s 0.5

MMseqs Version:                         13.45111
ORF filter                              0
ORF filter e-value                      100
ORF filter sensitivity                  2
LCA mode                                3
Majority threshold                      0.5
Vote mode                               1
LCA ranks                               
Column with taxonomic lineage           0
Compressed                              0
Threads                                 26
Verbosity                               3
Taxon blacklist                         12908:unclassified sequences,28384:other sequences
Substitution matrix                     nucl:nucleotide.out,aa:blosum62.out
Add backtrace                           false
Alignment mode                          0
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      0
Coverage mode                           0
Max sequence length                     65535
Compositional bias                      1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          1
Pseudo count b                          1.5
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Gap open cost                           nucl:5,aa:11
Gap extension cost                      nucl:2,aa:1
Zdrop                                   40
Seed substitution matrix                nucl:nucleotide.out,aa:VTML80.out
Sensitivity                             0.5
k-mer length                            0
k-score                                 2147483647
Alphabet size                           nucl:5,aa:21
Max results per query                   300
Split database                          0
Split mode                              0
Split memory limit                      0
Diagonal scoring                        true
Exact k-mer matching                    0
Mask residues                           1
Mask lower case residues                0
Minimum diagonal score                  15
Spaced k-mers                           1
Spaced k-mer pattern                    
Local temporary path                    
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Mask profile                            1
Profile E-value threshold               0.001
Global sequence weighting               false
Allow deletions                         false
Filter MSA                              1
Maximum seq. id. threshold              0.9
Minimum seq. id.                        0
Minimum score per column                -20
Minimum coverage                        0
Select N most diverse seqs              1000
Min codons in orf                       30
Max codons in length                    32734
Max orf gaps                            2147483647
Contig start mode                       2
Contig end mode                         2
Orf start mode                          1
Forward frames                          1,2,3
Reverse frames                          1,2,3
Translation table                       1
Translate orf                           0
Use all table starts                    false
Offset of numeric ids                   0
Create lookup                           0
Add orf stop                            false
Overlap between sequences               0
Sequence split mode                     1
Header split mode                       0
Chain overlapping alignments            0
Merge query                             1
Search type                             0
Search iterations                       1
Start sensitivity                       4
Search steps                            1
Exhaustive search mode                  false
Filter results during exhaustive search 0
Strand selection                        1
LCA search mode                         false
Disk space limit                        0
MPI runner                              
Force restart with latest tmp           false
Remove temporary files                  true
Report mode                             0
Alignment format                        0
Format alignment output                 query,target,fident,alnlen,mismatch,gapopen,qstart,qend,tstart,tend,evalue,bits
Database output                         false
First sequence as representative        false
Target column                           1
Add full header                         false
Sequence source                         0
Database type                           0
Shuffle input database                  true
Createdb mode                           1
Write lookup file                       0

createdb /sample.fastq.gz /tmp/7059426268546109220/query --dbtype 0 --shuffle 1 --createdb-mode 1 --write-lookup 0 --id-offset 0 --compressed 0 -v 3 

Shuffle database cannot be combined with --createdb-mode 0
We recompute with --shuffle 0
Converting sequences
Only uncompressed fasta files can be used with --createdb-mode 0.
We recompute with --createdb-mode 1.
Time for merging to query_h: 0h 0m 0s 26ms
Time for merging to query: 0h 0m 0s 24ms
[===================================================================================================    1 Mio. sequences processed
=================================================================================================== 2 Mio. sequences processed
=================================================================================================== 3 Mio. sequences processed
=================================================================================================== 4 Mio. sequences processed
=================================================================================================== 5 Mio. sequences processed
=================================================================================================== 6 Mio. sequences processed
=================================================================================================== 7 Mio. sequences processed
=================================================================================================== 8 Mio. sequences processed
=================================================================================================== 9 Mio. sequences processed
=================================================================================================== 10 Mio. sequences processed
=================================================================================================== 11 Mio. sequences processed
=================================================================================================== 12 Mio. sequences processed
=================================================================================================== 13 Mio. sequences processed
=================================================================================================== 14 Mio. sequences processed
=================================================================================================== 15 Mio. sequences processed
=================================================================================================== 16 Mio. sequences processed
=================================================================================================== 17 Mio. sequences processed
=================================================================================================== 18 Mio. sequences processed
=================================================================================================== 19 Mio. sequences processed
=================================================================================================== 20 Mio. sequences processed
=================================================================================================== 21 Mio. sequences processed
=================================================================================================== 22 Mio. sequences processed
=================================================================================================== 23 Mio. sequences processed
=================================================================================================== 24 Mio. sequences processed
=================================================================================================== 25 Mio. sequences processed
=================================================================================================== 26 Mio. sequences processed
=================================================================================================== 27 Mio. sequences processed
=================================================================================================== 28 Mio. sequences processed
=================================================================================================== 29 Mio. sequences processed
=================================================================================================== 30 Mio. sequences processed
=================================================================================================== 31 Mio. sequences processed
=================================================================================================== 32 Mio. sequences processed
=================================================================================================== 33 Mio. sequences processed
=================================================================================================== 34 Mio. sequences processed
=================================================================================================== 35 Mio. sequences processed
=================================================================================================== 36 Mio. sequences processed
=================================================================================================== 37 Mio. sequences processed
=================================================================================================== 38 Mio. sequences processed
=================================================================================================== 39 Mio. sequences processed
=================================================================================================== 40 Mio. sequences processed
=================================================================================================== 41 Mio. sequences processed
=================================================================================================== 42 Mio. sequences processed
=================================================================================================== 43 Mio. sequences processed
=================================================================================================== 44 Mio. sequences processed
=================================================================================================== 45 Mio. sequences processed
=================================================================================================== 46 Mio. sequences processed
=================================================================================================== 47 Mio. sequences processed
=================================================================================================== 48 Mio. sequences processed
=================================================================================================== 49 Mio. sequences processed
=================================================================================================== 50 Mio. sequences processed
=================================================================================================== 51 Mio. sequences processed
=================================================================================================== 52 Mio. sequences processed
=================================================================================================== 53 Mio. sequences processed
=================================================================================================== 54 Mio. sequences processed
=================================================================================================== 55 Mio. sequences processed
=================================================================================================== 56 Mio. sequences processed
=================================================================================================== 57 Mio. sequences processed
=================================================================================================== 58 Mio. sequences processed
=================================================================================================== 59 Mio. sequences processed
=================================================================================================== 60 Mio. sequences processed
=================================================================================================== 61 Mio. sequences processed
=================================================================================================== 62 Mio. sequences processed
=================================================================================================== 63 Mio. sequences processed
=================================================================================================== 64 Mio. sequences processed
=================================================================================================== 65 Mio. sequences processed
=================================================================================================== 66 Mio. sequences processed
=================================================================================================== 67 Mio. sequences processed
=================================================================================================== 68 Mio. sequences processed
=================================================================================================== 69 Mio. sequences processed
=================================================================================================== 70 Mio. sequences processed
=================================================================================================== 71 Mio. sequences processed
=================================================================================================== 72 Mio. sequences processed
=================================================================================================== 73 Mio. sequences processed
=================================================================================================== 74 Mio. sequences processed
=================================================================================================== 75 Mio. sequences processed
=================================================================================================== 76 Mio. sequences processed
=================================================================================================== 77 Mio. sequences processed
=================================================================================================== 78 Mio. sequences processed
=================================================================================================== 79 Mio. sequences processed
=================================================================================================== 80 Mio. sequences processed
=================================================================================================== 81 Mio. sequences processed
=================================================================================================== 82 Mio. sequences processed
=================================================================================================== 83 Mio. sequences processed
=================================================================================================== 84 Mio. sequences processed
=================================================================================================== 85 Mio. sequences processed
=================================================================================================== 86 Mio. sequences processed
=================================================================================================== 87 Mio. sequences processed
=================================================================================================== 88 Mio. sequences processed
=================================================================================================== 89 Mio. sequences processed
=================================================================================================== 90 Mio. sequences processed
=================================================================================================== 91 Mio. sequences processed
=================================================================================================== 92 Mio. sequences processed
=================================================================================================== 93 Mio. sequences processed
=================================================================================================== 94 Mio. sequences processed
=================================================================================================== 95 Mio. sequences processed
=================================================================================================== 96 Mio. sequences processed
=================================================================================================== 97 Mio. sequences processed
=================================================================================================== 98 Mio. sequences processed
=================================================================================================== 99 Mio. sequences processed
=================================================================================================== 100 Mio. sequences processed
=================================================================================================== 101 Mio. sequences processed
=================================================================================================== 102 Mio. sequences processed
=================================================================================================== 103 Mio. sequences processed
=================================================================================================== 104 Mio. sequences processed
=================================================================================================== 105 Mio. sequences processed
=================================================================================================== 106 Mio. sequences processed
=================================================================================================== 107 Mio. sequences processed
=================================================================================================== 108 Mio. sequences processed
=================================================================================================== 109 Mio. sequences processed
=================================================================================================== 110 Mio. sequences processed
=================================================================================================== 111 Mio. sequences processed
=================================================================================================== 112 Mio. sequences processed
=================================================================================================== 113 Mio. sequences processed
=================================================================================================== 114 Mio. sequences processed
=================================================================================================== 115 Mio. sequences processed
=================================================================================================== 116 Mio. sequences processed
=================================================================================================== 117 Mio. sequences processed
=================================================================================================== 118 Mio. sequences processed
=================================================================================================== 119 Mio. sequences processed
=================================================================================================== 120 Mio. sequences processed
=================================================================================================== 121 Mio. sequences processed
=================================================================================================== 122 Mio. sequences processed
=================================================================================================== 123 Mio. sequences processed
=================================================================================================== 124 Mio. sequences processed
=================================================================================================== 125 Mio. sequences processed
=================================================================================================== 126 Mio. sequences processed
=================================================================================================== 127 Mio. sequences processed
=================================================================================================== 128 Mio. sequences processed
=================================================================================================== 129 Mio. sequences processed
=================================================================================================== 130 Mio. sequences processed
=================================================================================================== 131 Mio. sequences processed
=================================================================================================== 132 Mio. sequences processed
=================================================================================================== 133 Mio. sequences processed
=================================================================================================== 134 Mio. sequences processed
=================================================================================================== 135 Mio. sequences processed
=================================================================================================== 136 Mio. sequences processed
=================================================================================================== 137 Mio. sequences processed
=================================================================================================== 138 Mio. sequences processed
=================================================================================================== 139 Mio. sequences processed
=================================================================================================== 140 Mio. sequences processed
=================================================================================================== 141 Mio. sequences processed
=================================================================================================== 142 Mio. sequences processed
=================================================================================================== 143 Mio. sequences processed
=================================================================================================== 144 Mio. sequences processed
=================================================================================================== 145 Mio. sequences processed
=================================================================================================== 146 Mio. sequences processed
=================================================================================================== 147 Mio. sequences processed
=================================================================================================== 148 Mio. sequences processed
=================================================================================================== 149 Mio. sequences processed
=================================================================================================== 150 Mio. sequences processed
=================================================================================================== 151 Mio. sequences processed
=================================================================================================== 152 Mio. sequences processed
=================================================================================================== 153 Mio. sequences processed
=================================================================================================== 154 Mio. sequences processed
=================================================================================================== 155 Mio. sequences processed
=================================================================================================== 156 Mio. sequences processed
=================================================================================================== 157 Mio. sequences processed
=================================================================================================== 158 Mio. sequences processed
=================================================================================================== 159 Mio. sequences processed
=================================================================================================== 160 Mio. sequences processed
=================================================================================================== 161 Mio. sequences processed
=================================================================================================== 162 Mio. sequences processed
=================================================================================================== 163 Mio. sequences processed
=================================================================================================== 164 Mio. sequences processed
=================================================================================================== 165 Mio. sequences processed
=================================================================================================== 166 Mio. sequences processed
=================================================================================================== 167 Mio. sequences processed
=================================================================================================== 168 Mio. sequences processed
=================================================================================================== 169 Mio. sequences processed
=================================================================================================== 170 Mio. sequences processed
=================================================================================================== 171 Mio. sequences processed
=================================================================================================== 172 Mio. sequences processed
=================================================================================================== 173 Mio. sequences processed
=================================================================================================== 174 Mio. sequences processed
=================================================================================================== 175 Mio. sequences processed
=================================================================================================== 176 Mio. sequences processed
=================================================================================================== 177 Mio. sequences processed
=================================================================================================== 178 Mio. sequences processed
=================================================================================================== 179 Mio. sequences processed
=================================================================================================== 180 Mio. sequences processed
=================================================================================================== 181 Mio. sequences processed
=================================================================================================== 182 Mio. sequences processed
=================================================================================================== 183 Mio. sequences processed
=================================================================================================== 184 Mio. sequences processed
=================================================================================================== 185 Mio. sequences processed
=================================================================================================== 186 Mio. sequences processed
=================================================================================================== 187 Mio. sequences processed
=================================================================================================== 188 Mio. sequences processed
=================================================================================================== 189 Mio. sequences processed
=================================================================================================== 190 Mio. sequences processed
=================================================================================================== 191 Mio. sequences processed
=================================================================================================== 192 Mio. sequences processed
====================================================
Time for merging to query_h: 0h 0m 0s 212ms
Time for merging to query: 0h 0m 0s 27ms
Database type: Nucleotide
Time for processing: 0h 8m 12s 710ms
Create directory /tmp/7059426268546109220/taxonomy_tmp
taxonomy /tmp/7059426268546109220/query /nr /tmp/7059426268546109220/result /tmp/7059426268546109220/taxonomy_tmp --tax-output-mode 2 -s 0.5 --split-mode 0 --remove-tmp-files 1 

extractorfs /tmp/7059426268546109220/query /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/orfs_aa --min-length 30 --max-length 32734 --max-gaps 2147483647 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 1 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --translate 1 --use-all-table-starts 0 --id-offset 0 --create-lookup 0 --threads 26 --compressed 0 -v 3 

[=================================================================] 192.52M 22m 33s 393ms
Time for merging to orfs_aa_h: 0h 7m 19s 213ms
Time for merging to orfs_aa: 0h 8m 4s 740ms
Time for processing: 0h 47m 10s 767ms
Create directory /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy
taxonomy /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/orfs_aa /nr /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/orfs_tax /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy --tax-output-mode 2 --tax-lineage 0 --alignment-mode 1 -e 1 --max-rejected 5 --max-accept 30 -s 0.5 --split-mode 0 --spaced-kmer-mode 1 --min-length 30 --max-length 32734 --orf-start-mode 1 --remove-tmp-files 1 

Create directory /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy/8588819485854123580/tmp_hsp1
search /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/orfs_aa /nr /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy/8588819485854123580/first /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy/8588819485854123580/tmp_hsp1 --alignment-mode 1 -e 1 --max-rejected 5 --max-accept 30 -s 0.5 --split-mode 0 --spaced-kmer-mode 1 --min-length 30 --max-length 32734 --orf-start-mode 1 --lca-search 1 --remove-tmp-files 1 

prefilter /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/orfs_aa /nr /tmp/7059426268546109220/taxonomy_tmp/13812531703396435525/tmp_taxonomy/8588819485854123580/tmp_hsp1/1723886274502240713/pref_0 --sub-mat nucl:nucleotide.out,aa:blosum62.out --seed-sub-mat nucl:nucleotide.out,aa:VTML80.out -k 0 --k-score 2147483647 --alph-size nucl:5,aa:21 --max-seq-len 65535 --max-seqs 300 --split 0 --split-mode 0 --split-memory-limit 0 -c 0 --cov-mode 0 --comp-bias-corr 1 --diag-score 1 --exact-kmer-matching 0 --mask 1 --mask-lower-case 0 --min-ungapped-score 15 --add-self-matches 0 --spaced-kmer-mode 1 --db-load-mode 0 --pca 1 --pcb 1.5 --threads 26 --compressed 0 -v 3 -s 0.5 

Query database size: 695256546 type: Aminoacid
Target split mode. Searching through 6 splits
Estimated memory consumption: 232G
Target database size: 353572663 type: Aminoacid
Process prefiltering step 1 of 6

Index table k-mer threshold: 180 at k-mer size 7 
Index table: counting k-mers
[=================================================================] 58.92M 1h 27m 43s 365ms
Index table: Masked residues: 338212106
Index table: fill
[=================================================================] 58.92M 2h 48m 44s 23ms
Index statistics
Entries:          10047647313
DB size:          67258 MB
Avg k-mer size:   7.849724
Top 10 k-mers
    FSHAGSI 169128
    AFRNNFW 161115
    APMFPNN 145858
    GGGWLLM 137963
    NNSWLPS 137460
    AHFMIMV 126820
    MPMGGNW 126274
    TMLDRNT 108816
    TGTYPSS 94201
    GDQYNVT 84229
Time for index table init: 4h 18m 41s 415ms
k-mer similarity threshold: 180
Starting prefiltering scores calculation (step 1 of 6)
Query db start 1 to 695256546
Target db start 1 to 58919300
[=================================================================] 695.26M 61h 14m 42s 623ms

2.307739 k-mers per position
1254 DB matches per sequence
0 overflows
0 queries produce too many hits (truncated result)
11 sequences passed prefiltering per query sequence
1 median result list length
275899073 sequences with 0 size result lists
Time for merging to pref_0_tmp_0: 0h 16m 3s 814ms
Time for merging to pref_0_tmp_0_tmp: 0h 26m 19s 322ms
Process prefiltering step 2 of 6

Index table k-mer threshold: 180 at k-mer size 7 
Index table: counting k-mers
[=================================================================] 58.92M 1h 18m 46s 598ms
Index table: Masked residues: 338371908
Index table: fill
[===========================Terminated
milot-mirdita commented 3 years ago

It seems you accidentally defeated a speed-up mechanism by setting -s 0.5: setting -s <= --orf-filter-s deactivates this optimization. When the optimization is active, we first run a very low sensitivity search to check whether each extracted ORF finds anything at all in the target database. That lets us reject early the many fragments that could not contribute later.

You can try setting --orf-filter-s 1 instead and leave the default sensitivity.
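
As a rough sketch of the rule above (the helper `orf_filter_active` is hypothetical and not part of MMseqs2; it only encodes the `-s <= --orf-filter-s` condition as stated), the check and the suggested rerun could look like this. The paths are the ones from the original command, and the log above reports an ORF filter sensitivity of 2:

```shell
# Hypothetical helper, not an MMseqs2 command: succeeds only when the
# ORF pre-filter would stay active, i.e. when -s is greater than --orf-filter-s.
orf_filter_active() {
    # $1 = search sensitivity (-s), $2 = ORF filter sensitivity (--orf-filter-s)
    awk -v s="$1" -v f="$2" 'BEGIN { exit !(s > f) }'
}

# Values from the log: -s 0.5 against an ORF filter sensitivity of 2.
orf_filter_active 0.5 2 && echo "filter active" || echo "filter disabled"   # prints: filter disabled

# Lowering --orf-filter-s to 1 while keeping -s 0.5 would still disable it.
orf_filter_active 0.5 1 && echo "filter active" || echo "filter disabled"   # prints: filter disabled

# Suggested rerun: drop -s so the default sensitivity applies and lower only
# the ORF filter sensitivity. Guarded so the sketch is harmless without mmseqs.
if command -v mmseqs >/dev/null 2>&1; then
    mmseqs easy-taxonomy ./sample.fastq.gz ./nr ./result ./tmp --orf-filter-s 1
fi
```

The second check shows why the suggestion is to drop `-s 0.5` rather than only lower `--orf-filter-s`: with both changes, the search sensitivity stays above the filter sensitivity.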

LuukvDamme commented 3 years ago

Thank you for the quick response. I will test this soon and post the results when it is done.