soedinglab / MMseqs2

MMseqs2: ultra fast and sensitive search and clustering suite
https://mmseqs.com
GNU General Public License v3.0

convertalis question #570

Open Valentin-Bio-zz opened 2 years ago

Valentin-Bio-zz commented 2 years ago

Hello, I ran a search using a nucleotide multi-FASTA as the query and an amino acid multi-FASTA as the target. The alignment is performed at the amino acid level by translating the nucleotide sequences.

So, when I run mmseqs convertalis and retrieve a BLAST tabular (output format 6) table, does this table report the amino acid alignment?

I'm specifically parsing the 7th and 8th columns of the table (start and end positions of the alignment in the query sequence) and want to be sure whether I'm looking at the start and end in the nucleotide sequence or in the amino acid sequence.

Thanks for your time

milot-mirdita commented 2 years ago

The positions refer to the original sequences. In this case, the query positions are nucleotide positions and the target positions are amino acid positions.

The alignment happens at the amino acid level. After the alignment, we recover the original positions with the offsetalignment module within the search workflow.
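
As a rough illustration of how to read those columns, here is a minimal Python sketch. It assumes the default convertalis column order (query, target, fident, alnlen, mismatch, gapopen, qstart, qend, tstart, tend, evalue, bits). The helper names and the example row are made up for illustration. It also assumes that a reverse-strand hit is reported with qstart greater than qend, which is worth checking against your own output.

```python
import csv
import io

# Assumed default convertalis columns (BLAST format 6 style).
COLUMNS = [
    "query", "target", "fident", "alnlen", "mismatch", "gapopen",
    "qstart", "qend", "tstart", "tend", "evalue", "bits",
]


def parse_hits(handle):
    """Yield one dict per tab-separated row, with positions cast to int."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        hit = dict(zip(COLUMNS, row))
        for key in ("qstart", "qend", "tstart", "tend"):
            hit[key] = int(hit[key])
        yield hit


def query_span(hit):
    """Return (low, high, strand, length) of the query span in nucleotides.

    Assumption: qstart > qend marks a hit on the reverse strand.
    """
    low, high = sorted((hit["qstart"], hit["qend"]))
    strand = "+" if hit["qstart"] <= hit["qend"] else "-"
    return low, high, strand, high - low + 1


def target_span_length(hit):
    """Return the length of the target span in amino acids."""
    return abs(hit["tend"] - hit["tstart"]) + 1


# Made-up example row: a nucleotide contig aligned against a protein.
example = "contig1\tprotA\t0.900\t100\t10\t0\t301\t600\t5\t104\t1e-50\t200\n"
hits = list(parse_hits(io.StringIO(example)))
```

In this made-up row the query span covers 300 nucleotides and the target span covers 100 residues, which is the 3:1 ratio you would expect from a gapless translated alignment.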