brettc / partitionfinder

PartitionFinder discovers optimal partitioning schemes for DNA sequences.

Fix RAxML warning message about GTR+G and GTR+I+G #107

Closed: pbfrandsen closed this issue 8 years ago

pbfrandsen commented 8 years ago

A user at SI was confused by the following message, both because his partitioning scheme included GTR models without a model of rate heterogeneity and because the examples referred to protein analyses (his were nucleotide):

"Warning: RAxML allows for only a single model of rate heterogeneity in partitioned analyses. I.e. all partitions must be assigned either a +G model or a +I+G model. If the best models for your datasetcontain both types of model, you will need to choose an appropriate rate heterogeneity model when you run RAxML. This is specified through the command line to RAxML. To rigorously choose the best model, run two further PF analyses (these will be fast), fixing the partitioning scheme to this scheme and 'search=user;', in both. In one run, use only +I+G models ('models = all_protein_gammaI'); in the next, use only +G models ('models = all_protein_gamma;''). Choose the scheme with the lowest AIC/AICc/BIC score. Note that these re-runs will be quick!"
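The condition this warning describes can be sketched in a few lines of Python. This is only an illustration and not PartitionFinder's actual code: the function names `rate_heterogeneity` and `mixed_rate_heterogeneity` are hypothetical, and the sketch assumes model names are written as a matrix name followed by `+`-separated suffixes (e.g. `GTR+I+G`). It also covers the case that confused the user, where a model such as plain `GTR` has no rate heterogeneity component at all.

```python
def rate_heterogeneity(model):
    """Classify a model name such as 'GTR+I+G' by its rate heterogeneity suffixes.

    Hypothetical helper: assumes the name is '<matrix>+<suffix>+<suffix>...'.
    """
    suffixes = model.upper().split("+")[1:]  # drop the substitution matrix name
    has_gamma = "G" in suffixes
    has_invariant = "I" in suffixes
    if has_gamma and has_invariant:
        return "+I+G"
    if has_gamma:
        return "+G"
    if has_invariant:
        return "+I"
    return "none"


def mixed_rate_heterogeneity(models):
    """Return True if the subset models do not all share one rate heterogeneity type."""
    return len({rate_heterogeneity(m) for m in models}) > 1


# Nucleotide example: plain GTR alongside GTR+G and GTR+I+G is a mixed scheme.
print(mixed_rate_heterogeneity(["GTR", "GTR+G", "GTR+I+G"]))
```

A scheme for which `mixed_rate_heterogeneity` returns `True` is one where a single rate heterogeneity model would have to be chosen before running RAxML.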

We should add that RAxML supports GTR and put something in there so that users will know that it applies to nucleotide data as well.

roblanf commented 8 years ago

Thanks Paul.

Done. I'm still not convinced it's much better, but unless I write something ridiculously long, it's hard to convey exactly what should be done...

https://github.com/brettc/partitionfinder/commit/98a58e0987e863b9c7cc83e5be0922e63d98842e

cmayer commented 8 years ago

I made some minor modifications that I think make this a bit clearer. I did understand the text as it was, but I have more background than most other users.

"Warning: RAxML allows for only a single model of rate heterogeneity in partitioned analyses. I.e. all partitions must be assigned either a +G model or a +I+G model. If the best partitioning scheme for your dataset contains both types of model, you will need to choose a single rate heterogeneity model when you run RAxML with the best partitioning scheme. The rate heterogeneity is specified through the command line to RAxML. To test whether +G or +I+G models should be preferred, run two further PF analyses (these will be fast), fixing the partitioning scheme to the best scheme that has already been found by specifying 'search=user;' in the parameter file in both runs. In one run, use only +I+G models ('models = all_protein_gammaI' or …. nuc-equivalent); in the next, use only +G models ('models = all_protein_gamma;' or nuc-equivalent). Choose the scheme with the lowest AIC/AICc/BIC score. Note that these re-runs will be quick!"
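The final step of the procedure in this message, choosing between the two re-runs, amounts to picking the lower information-criterion score. A minimal Python sketch, assuming the user has read the AIC, AICc, or BIC score of each re-run from its output by hand; the function name `preferred_rerun` is hypothetical and the numbers are placeholders, not real results:

```python
def preferred_rerun(scores):
    """Return the label of the re-run with the lowest information-criterion score.

    `scores` maps a label (e.g. '+G' or '+I+G') to the AIC, AICc, or BIC score
    of that re-run. Both scores must come from the same criterion.
    """
    return min(scores, key=scores.get)


# Placeholder scores for illustration only; substitute the values from your own runs.
example_scores = {"+G": 10250.7, "+I+G": 10243.1}
print(preferred_rerun(example_scores))
```

Lower is better for all three criteria, so the label returned is the rate heterogeneity model to specify on the RAxML command line.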

Best Christoph

On 02.06.2016 at 17:28, Paul Frandsen notifications@github.com wrote:

"Warning: RAxML allows for only a single model of rate heterogeneity in partitioned analyses. I.e. all partitions must be assigned either a +G model or a +I+G model. If the best models for your datasetcontain both types of model, you will need to choose an appropriate rate heterogeneity model when you run RAxML. This is specified through the command line to RAxML. To rigorously choose the best model, run two further PF analyses (these will be fast), fixing the partitioning scheme to this scheme and 'search=user;', in both. In one run, use only +I+G models ('models = all_protein_gammaI'); in the next, use only +G models ('models = all_protein_gamma;''). Choose the scheme with the lowest AIC/AICc/BIC score. Note that these re-runs will be quick!"


Dr. Christoph Mayer Email: c.mayer.zfmk@uni-bonn.de Tel.: +49 (0)228 9122 403

Zoologisches Forschungsmuseum Alexander Koenig

Foundation under public law; Director: Prof. J. W. Wägele; Seat: Bonn