Open liamxg opened 11 months ago
Hello, I have also encountered the same problem, have you solved it? How should I set up the “JTT+I+G+F” model?
@Zshuyun Sorry, no one has replied to me.
Okay, thank you
Dear @liamxg
From the help on `lset`:
Nst -- Sets the number of substitution types: "1" constrains all of
the rates to be the same (e.g., a JC69 or F81 model); "2" all-
ows transitions and transversions to have potentially different
rates (e.g., a K80 or HKY85 model); "6" allows all rates to
be different, subject to the constraint of time-reversibility
(e.g., a GTR model). Finally, 'nst' can be set to 'mixed', which
results in the Markov chain sampling over the space of all poss-
ible reversible substitution models, including the GTR model and
all models that can be derived from it model by grouping the six
rates in various combinations. This includes all the named models
above and a large number of others, with or without name.
For a nucleotide "4-by-4" setup, you specify the number of substitution types with `lset nst=`, choosing one of the options `1`, `2`, `6`, or `Mixed`. Setting `nst=1` means AC=AG=AT=CG=CT=GT, and `nst=6` means six separate rates: AC, CG, AT, GT, AG, CT. Using `nst=2` will set AC=AT=CG=GT, AG=CT. "TVM" would be AC, CG, AT, GT, AG=CT, but you cannot specify this particular rate configuration in MrBayes (there is no `nst=5`, for example).
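As a minimal sketch of the `nst` options described in the help text, the block below lists each setting in turn. It assumes a nucleotide data set has already been loaded. Each `lset` line overrides the previous one, so in practice you would keep only one of them.

```nexus
begin mrbayes;
    [ one rate shared by all substitution types, e.g. JC69 or F81 ]
    lset nst=1;

    [ separate rates for transitions and transversions, e.g. K80 or HKY85 ]
    lset nst=2;

    [ six separate rates under time-reversibility, i.e. GTR ]
    lset nst=6;

    [ let the chain sample over all reversible substitution models ]
    lset nst=mixed;
end;
```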
However, one may try to "emulate" a TVM model by setting `lset nst=6`, then using the `prset` command to change to a highly informative prior for the substitution rates (`Revmatpr`). From the help on `prset`:
Revmatpr -- This parameter sets the prior for the substitution rates
of the GTR model for nucleotide data. The options are:
prset revmatpr = dirichlet(<number>,<number>,...,<number>)
prset revmatpr = fixed(<number>,<number>,...,<number>)
The program assumes that the six substitution rates
are independent gamma-distributed random variables with the
same scale parameter when dirichlet is selected. The six
numbers in brackets each corresponds to a particular substi-
tution type. Together, they determine the shape of the prior
The six rates are in the order A<->C, A<->G, A<->T, C<->G,
C<->T, and G<->T. If you want an uninformative prior you can
use dirichlet(1,1,1,1,1,1), also referred to as a 'flat'
Dirichlet. This is the default setting. If you wish a prior
where the C<->T rate is 5 times and the A<->G rate 2 times
higher, on average, than the transversion rates, which are
all the same, then you should use a prior of the form
dirichlet(x,2x,x,x,5x,x), where x determines how much the
prior is focused on these particular rates. For more info,
see tratiopr. The fixed option allows you to fix the substi-
tution rates to particular values.
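One way to read the `dirichlet(x,2x,x,x,5x,x)` pattern from the help text is sketched below. The multiplier x = 100 is an arbitrary value chosen for illustration, not a recommendation. The second `prset` line is a hypothetical TVM-like prior with made-up numbers, in which A<->G and C<->T get the same prior mean. Note that a Dirichlet prior only centers the rates on these proportions; it does not force any two rates to be exactly equal, so this emulates TVM rather than implementing it. Use one `prset` line or the other, since the second overrides the first.

```nexus
begin mrbayes;
    lset nst=6;

    [ help-text example with x = 100: C<->T is 5 times and A<->G is 2 times the transversion rate, on average ]
    prset revmatpr=dirichlet(100,200,100,100,500,100);

    [ hypothetical TVM-like prior, order A<->C, A<->G, A<->T, C<->G, C<->T, G<->T; the values are assumptions ]
    prset revmatpr=dirichlet(100,300,150,120,300,80);
end;
```

Larger values with the same proportions focus the prior more tightly on those rates, while smaller values leave the data more room to move the estimates.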
"+F" is probably the syntax used in iqtree2 for applying "Empirically counted
frequencies from alignment" when estimating the state frequencies. MrBayes
uses MCMC to integrate over all possible state frequencies, and the settings
for this can be changed with the prset Statefreqpr
command (see output from
help prset
).
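If the aim is to mimic the empirical frequencies implied by "+F", a possible starting point is the `fixed(empirical)` option of `Statefreqpr`. This is a sketch, assuming that fixing the frequencies to the values observed in the alignment is what you want. The Dirichlet line shows the flat prior for nucleotide data, under which the frequencies are sampled by MCMC instead. Keep only one of the two lines.

```nexus
begin mrbayes;
    [ fix state frequencies to those counted from the alignment ]
    prset statefreqpr=fixed(empirical);

    [ alternative: flat Dirichlet prior, frequencies sampled by MCMC ]
    prset statefreqpr=dirichlet(1,1,1,1);
end;
```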
"+R3" is probably the syntax used in iqtree2 for applying "the FreeRate model
with 3 categories" for modelling rate heterogeneity among sites. In MrBayes
(v3.2.7a), a "FreeRate"-model can be applied by using lset rates=kmixture
.
See the output from help lset
.
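A minimal sketch of a three-category setup could look like the block below. The `rates=kmixture` setting comes from the text above. The `nmixtcat` option is an assumption on my part about how the number of mixture components is set in v3.2.7a, so please confirm it against the output of `help lset` in your own build before relying on it.

```nexus
begin mrbayes;
    [ FreeRate-like model of rate variation across sites ]
    lset rates=kmixture;

    [ assumption: nmixtcat sets the number of mixture components; verify with help lset ]
    lset nmixtcat=3;
end;
```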
Currently, the models in MrBayes (v3.2.7a) are not set up to (easily) combine `+I` (or `+G`) with `+Rn`.
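In practice the `rates` option of `lset` takes a single value, so the choice looks roughly like the sketch below: either a gamma distribution with a proportion of invariable sites (the usual reading of "+I+G"), or the k-mixture model, but not both in the same setting. Keep only one of the two lines.

```nexus
begin mrbayes;
    [ gamma-distributed rates plus a proportion of invariable sites ]
    lset rates=invgamma;

    [ alternative: k-mixture model; this overrides the line above ]
    lset rates=kmixture;
end;
```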
Because different programs implement different models, some tools offer program-specific model subsets for easier comparison (e.g., MrModeltest2, ModelTest-NG, IQ-TREE, ...). These can be useful for many purposes.
Yours
Johan
@pontus @viklund @olas @eryl @msuchard