NBISweden / MrBayes

MrBayes is a program for Bayesian inference and model choice across a wide range of phylogenetic and evolutionary models. For documentation and downloading the program, please see the home page:
http://NBISweden.github.io/MrBayes/
GNU General Public License v3.0

ASR protein reconstruction and posterior probability of 1 #297

Open kbaltazart opened 6 months ago

kbaltazart commented 6 months ago

I am trying to reconstruct the ancestral sequence of a protein. I use the parameters below (I truncated the protein sequences to make them easier to read):

begin data;
dimensions ntax=4 nchar=1057;
format datatype=protein missing=? gap=-;

matrix
[ column ruler: 10, 20, ..., 1050 ]
Elephas_maximus_indicus          QLSYGYDEKSAGGISV
Loxodonta_africana               QLSYGYDEKSAGGISV
Mammut_americanum                QLSYGYDEKSAGGISV
Trichechus_manatus               QLAYGYDEKSPGGISV
;

end;

begin mrbayes;
outgroup Trichechus_manatus;
prset aamodelpr=mixed;
constraint Elephantiforme -1 = 1 2 3;
constraint Elephantidae -1 = 1 2;
prset topologypr = constraints(Elephantidae, Elephantiforme);
report ancstates=yes;
mcmc nchains = 4 ngen=30000;
sumt;
sump;
end;

The program runs and gives me an output file (.pstat). However, the ancestral sequence is the same for both clades, with a posterior probability of 1 for a single amino acid at every site. What is very strange is that the ancestral sequence always corresponds to the last sequence in my matrix, whether or not that sequence is the outgroup. I have already tried reordering the taxa, and the ancestral sequence always matches whichever sequence comes last.

Thank you for your help.
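For anyone trying to confirm the same symptom, here is a minimal Python sketch that collapses ancestral-state rows from a `sump` parameter summary into one sequence per constrained node, together with the lowest "best state" probability seen at that node. It assumes that ancestral-state rows are labeled like `p(Q){1@Elephantidae}` (state, site, node) and that the posterior mean is the next whitespace-separated column; please check this against your own .pstat file and adjust the pattern if it differs. The function name and the example rows are made up for illustration and are not real MrBayes output.

```python
import re
from collections import defaultdict

# ASSUMPTION: ancestral-state rows look like "p(<state>){<site>@<node>}  <mean> ...".
# Verify this label format against your own file before relying on it.
ROW = re.compile(
    r"^p\((?P<state>[^)]+)\)\{(?P<site>\d+)@(?P<node>[^}]+)\}\s+(?P<mean>\S+)"
)

def summarize_ancestral_states(lines):
    """Return {node: (most_probable_sequence, lowest_best_probability)}."""
    best = defaultdict(dict)  # node -> site -> (probability, state)
    for line in lines:
        m = ROW.match(line.strip())
        if not m:
            continue  # header rows and non-ancestral parameters are skipped
        prob = float(m["mean"])
        site = int(m["site"])
        node = m["node"]
        # Keep only the highest-probability state seen so far for this site.
        if site not in best[node] or prob > best[node][site][0]:
            best[node][site] = (prob, m["state"])
    summary = {}
    for node, sites in best.items():
        ordered = [sites[s] for s in sorted(sites)]
        sequence = "".join(state for _, state in ordered)
        summary[node] = (sequence, min(p for p, _ in ordered))
    return summary

# Hypothetical rows, invented only to exercise the function.
example_rows = [
    "Parameter\tMean\tVariance",
    "p(A){1@Elephantidae}\t0.000000\t0.000000",
    "p(Q){1@Elephantidae}\t1.000000\t0.000000",
    "p(L){2@Elephantidae}\t1.000000\t0.000000",
]
example_summary = summarize_ancestral_states(example_rows)
print(example_summary)
```

If the lowest best-state probability is exactly 1 for every node, that matches the behavior reported here. To run it on a real file, pass `open("yourfile.pstat")` (a placeholder name) instead of `example_rows`.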

mceminsky commented 3 months ago

I am having a similar issue - every node has the same ancestral sequence with posterior probability of 1 at each site, and the ancestral sequence is always identical to the last sequence in my nexus data file. Any help is greatly appreciated.

partition ancstates = 1: 1-.;
lset applyto = (all) rates=invgamma;
prset aamodelpr=mixed;
constraint ; [several constraints here]
outgroup 155;
prset topologypr=constraints([list of constraints]);
report applyto = (1) ancstates = yes;
mcmc ngen=800000 samplefreq=500 printfreq=500 Diagnfreq=5000 temp=0.04 startparams=reset starttree=random;
sump;
sumt contype=allcompat;

end;
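The symptom described in this thread (every node sharing one reconstructed sequence, identical to the last taxon in the matrix) can be checked mechanically once the ancestral sequences have been extracted. The sketch below is only a diagnostic aid under stated assumptions: `diagnose` is a hypothetical helper, the alignment is the truncated one from the first comment, and the two ancestral sequences are filled in by hand to reproduce the reported behavior rather than taken from a real run.

```python
def diagnose(alignment, ancestors):
    """Compare reconstructed ancestral sequences with the input matrix.

    alignment: list of (taxon, sequence) pairs in matrix order.
    ancestors: dict mapping node name -> reconstructed sequence.
    Returns (per-node report, True if all nodes share one sequence).
    """
    _, last_seq = alignment[-1]
    report = {}
    for node, seq in ancestors.items():
        report[node] = {
            # Extant taxa whose sequence is exactly the reconstructed one.
            "identical_to": [name for name, s in alignment if s == seq],
            "equals_last_taxon": seq == last_seq,
        }
    all_nodes_identical = len(set(ancestors.values())) <= 1
    return report, all_nodes_identical

# Truncated alignment from the first comment, in matrix order.
alignment = [
    ("Elephas_maximus_indicus", "QLSYGYDEKSAGGISV"),
    ("Loxodonta_africana", "QLSYGYDEKSAGGISV"),
    ("Mammut_americanum", "QLSYGYDEKSAGGISV"),
    ("Trichechus_manatus", "QLAYGYDEKSPGGISV"),
]

# ASSUMPTION: hand-written to mimic the reported symptom, not real output.
ancestors = {
    "Elephantidae": "QLAYGYDEKSPGGISV",
    "Elephantiforme": "QLAYGYDEKSPGGISV",
}

report, all_nodes_identical = diagnose(alignment, ancestors)
print(report)
print("all nodes identical:", all_nodes_identical)
```

Reordering the taxa in `alignment` and re-running the analysis, as the first poster did, should change which taxon shows up under `identical_to` if the result really tracks the last row of the matrix.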

askrip4 commented 2 weeks ago

I am having a similar issue. I am setting my parameters based on example 2 in this repository: https://github.com/nylander/Parse_MrBayes_Ancestral_States. When I constrain my dataset to obtain the ancestral sequence for a specific node, I get the same sequence for two different nodes, and the sequences do not make sense for the extant proteins in their respective clades. My parameters are below:

begin mrbayes;

set autoclose=yes nowarn=yes;
prset aamodelpr = mixed;
Taxset MyTaxset = 10 11;
Constraint MyConstraint = MyTaxset;
Prset topologypr = constraints(MyConstraint);
Report ancstates=yes;
mcmc nruns=1 nchains=1 ngen=50000 samplefreq = 1000; 
sump;

end;

mceminsky commented 2 weeks ago

I was able to bypass the issue when I moved to the Windows version.