LANL-Bioinformatics / PhaME

Given a reference, PhaME extracts SNPs from complete genomes, draft genomes and/or reads. Uses SNP multiple sequence alignment to construct a phylogenetic tree. Provides evolutionary analyses (genes under positive selection) using CDS SNPs.
GNU General Public License v3.0

Pal2nal ERROR: inconsistency #24

Closed LeoVincenzi closed 12 months ago

LeoVincenzi commented 1 year ago

Dear developers, I am writing about an error I got while converting a .aln file containing a multiple protein alignment (4 species) to the corresponding nucleotide alignment. The error is the following:

#---  ERROR: inconsistency between the following pep and nuc seqs  ---#
>CKAN_00088800
MGRKEQTDDKKRVEEVLHILKKQAPLTVKQEKFCNDACVERFLKSKGDNVKKAAKHLRSC
LSWRESIGTEHLIADEFSAELADGVAYVAGHDQEARPVMVLRIKQDYQKFHSQKIYIRLL
VFTLEVAIGSMSKNVDQFVLLLDASFFRSASAFLNLLLATLKIVSEYYPGRLHKAFVIDP
PSLFSCLWKGVRPFVDLSNVTMVVSSYDFEDTLDDTFLSYPRASSLRFDRSKIGSCSSSR
FSFTVSHLDSLKPWYLSFADTSSSSSSSSSSKVGPTISTSPSPSLLGPALISPLNARSFS
FASPAARTPRGSWPKASFPSTPQPPRTQHHHHQQQQPRTPRPSFLHSPATFFRKDCQVSS
RTDRCRESFFPFLKFYRRPYDEMGYRSMMRTPTWWPHLHRLSPAQPSLCHLPQVLKRTQN
LKSPKKKKKTIFFSFFFKFM
>CKAN_00088800
MGRKEQTDDKKRVEEVLHILKKQAPLTVKQEKFCNDACVERFLKSKGDNVKKAAKHLRSC
LSWRESIGTEHLIADEFSAELADGVAYVAGHDQEARPVMVLRIKQDYQKFHSQKIYIRLL
VFTLEVAIGSMSKNVDQFVLLLDASFFRSASAFLNLLLATLKIVSEYYPGRLHKAFVIDP
PSLFSCLWKGVRPFVDLSNVTMVVSSYDFEDTLDDTFLSYPRASSLRFDRSKIGSCSSSR
FSFTVSHLDSLKPWYLSFADTSSSSSSSSSSKVGPTISTSPSPSLLGPALISPLNARSFS
FASPAARTPRGSWPKASFPSTPQPPRTQHHHHQQQQPRTPRPSFLHSPATFFRKDCQVSS
RTDRCRESFFPFLKFYRRPYDEMGYRSMMRTPTWWPHLHRLSPAQPSLCHLPQVLKRTQN
LKSPKKKKKTIFFSFFFKFM

Run bl2seq (-p tblastn) or GeneWise to see the inconsistency.

It seems no conversion happened. I do not fully understand the problem or how to solve it. I have seen that it is a common error on the web, but I haven't found a clear solution. Thanks in advance for your help. Leo
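As a lightweight stand-in for the bl2seq/GeneWise step the error message suggests, the sketch below translates a coding sequence with the standard genetic code and reports the first place where it disagrees with the protein. It is a simplified check under stated assumptions, not pal2nal's own logic: the function names are hypothetical, only the standard codon table is used, and alignment gaps are simply stripped. Note that in the output quoted above the two printed records are identical amino-acid strings, so it may also be worth confirming that the file passed as the nucleotide input really contains DNA; that is a guess from the listing, not something confirmed in this thread.

```python
# Simplified pep/CDS consistency check (illustrative only, not pal2nal's algorithm).
# Assumes the standard genetic code (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(cds):
    """Translate a gap-free, upper-case DNA string; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


def first_inconsistency(pep, cds):
    """Return a message describing the first problem found, or None if consistent."""
    pep = pep.replace("-", "").upper()
    cds = cds.replace("-", "").upper().replace("U", "T")
    if len(cds) % 3 != 0:
        return f"CDS length {len(cds)} is not a multiple of 3"
    translated = translate(cds)
    # A terminal stop codon in the CDS is tolerated when the protein lacks '*'.
    if translated.endswith("*") and not pep.endswith("*"):
        translated = translated[:-1]
    if len(translated) != len(pep):
        return (
            f"length mismatch: protein {len(pep)} aa, "
            f"CDS encodes {len(translated)} aa"
        )
    for pos, (p, t) in enumerate(zip(pep, translated), start=1):
        if p != t:
            return f"mismatch at position {pos}: protein {p}, CDS encodes {t}"
    return None


if __name__ == "__main__":
    print(first_inconsistency("MK", "ATGAAATAA"))   # consistent pair
    print(first_inconsistency("MK", "ATGTAAAAA"))   # in-frame stop in the CDS
    print(first_inconsistency("MGRK", "MGRK"))      # protein passed as "CDS"
```

Running this per sequence pair before calling pal2nal.pl should at least narrow down which record is affected and whether the problem is a length difference or a residue difference.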

mshakya commented 1 year ago

Hi, are you using the pal2nal script outside of PhaME, or is this happening within a PhaME run?

LeoVincenzi commented 1 year ago

Hi @mshakya. I was using pal2nal.pl by itself, outside of the PhaME pipeline.

VaninaTonzo commented 1 year ago

Hi @LeoVincenzi and @mshakya,

I have exactly the same error. In my case, I first used translatorX to get the amino acid and nucleotide alignments from the same nt sequences, so how is it possible to have so many inconsistencies like this?

"#--- ERROR: inconsistency between the following pep and nuc seqs ---#".

Thanks in advance for any help or suggestion.

mshakya commented 12 months ago

Hi, we don't maintain a separate version of pal2nal, and I don't think that tool has been supported for a long time, so I am not sure how this can be resolved. We have been thinking of replacing pal2nal with something else, but haven't had time to find the right tool. I am going to close the issue for now as it falls outside of PhaME, but will keep it in mind for the next iteration of PhaME.

sunriseTM commented 4 months ago

In my case, the cause was that a protein sequence used for the MSA contained an in-frame stop (*). The stop was resolved as a gap in the MSA output, pal2nal then ignored that gap, and the lengths of the protein sequence and the coding sequence no longer matched. I replaced the in-frame stop (*) with 'X' and it succeeded. Hope it helps!
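The workaround described above can be sketched as a small preprocessing step applied to the protein FASTA before running the aligner. This is a minimal sketch, assuming plain FASTA text and assuming that every `*` in a sequence line should become `X`; the function name `mask_stops` is hypothetical and not part of pal2nal or PhaME.

```python
def mask_stops(fasta_text):
    """Replace '*' with 'X' in sequence lines of FASTA text; headers are left untouched."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)                     # header line: keep as-is
        else:
            out.append(line.replace("*", "X"))   # sequence line: mask stop symbols
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    example = ">seq1\nMKT*LLV\n>seq2\nMKTALLV\n"
    print(mask_stops(example), end="")
```

Because the masked residue still occupies a position in the alignment, the protein keeps one residue per codon of the coding sequence, which is the length agreement the comment above says was lost. Whether an in-frame stop reflects a real pseudogene or an annotation problem is a separate question that masking does not answer.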