Closed jianshu93 closed 2 years ago
The alignment performed is a global alignment, not a semi-global alignment. However, the gap penalties are a bit special. The terminal gap penalties are by default very low. In vsearch, you can set individual penalties for the left, right and internal gaps, as well as for the query and target sequence. This makes it somewhat similar to semi-global alignment.
Thanks!
It is clear to me. And it is similar to usearch global even though usearch does not use a full dynamic programming.
Thanks, Jianshu
hello team:
vsearch --usearch_global T4AerOil.CoupledReads.fasta --db SILVA_138.1_SSURef_tax_silva_prok_nr.fasta --id 0.97 --blast6out vsearch_T4AerOil.blast6.txt --threads 24 --maxaccepts 1
usearch -usearch_global T4AerOil.CoupledReads.fasta -db SILVA_138.1_SSURef_tax_silva_prok_nr.fasta -id 0.97 -maxaccepts 1 -threads 24 -strand both -blast6out usearch.T4AerOil.blast6.new.txt
for the same input, I have 12 times different for output. Vsearch is 12 time more results found (usearch default maxaccept is 1)
Do I miss something?
Thanks,
Jianshu
@jianshu93 could you please provide a small fasta file so we can replicate your issue?
Here is a test for --maxaccepts
:
vsearch \
--usearch_global <(printf ">q\nAAAA\n") \
--db <(printf ">s1\nAAAA\n>s2\nAAAAA\n") \
--minseqlength 1 \
--id 0.50 \
--maxaccepts 1 \
--quiet \
--blast6out -
With --maxaccepts 1
, only s1
is matched. With --maxaccepts 2
, both s1 and s2 are matched.
Hello vsearch tem,
In the second phase of searching in the vsearch paper, I quote "The alignment is performed using a multi- threaded and vectorised full dynamic programming algorithm", is this global alignment or semi-global alignment? Because for a majority cases, we tend to search short queries against longer database sequences. I think in those cases, semi-global alignment will be more useful (no gap penalty at beginning and ends of both sequences, for example, amplicon v3-v4 against full length 16S).
Thanks,
Jianshu