AntonelliLab / raxmlGUI

A new graphical interface for RAxML
https://antonellilab.github.io/raxmlGUI/
GNU Affero General Public License v3.0

add support for RNA secondary structure #115

Open dsilvestro opened 4 years ago

dsilvestro commented 4 years ago

This means adding the option to load a text file containing the structure, which is then passed to RAxML using -S structure_file

From RAxML's manual:

Specifying secondary structure models for an RNA alignment works slightly differently because we read in a plain RNA alignment and then need to tell RAxML, via an additional text file passed with -S, which RNA alignment sites need to be grouped together. We do this in a standard bracket notation written into a plain text file, e.g., our DNA test alignment has 60 sites, thus our secondary structure file needs to contain a string of 60 characters like this one:

..................((.......))............(.........)........

The '.' symbol indicates that this is just a normal RNA site, while the brackets indicate stems. Evidently, the number of opening and closing brackets must match. In addition, it is also possible to specify pseudo knots with additional symbols: <>[]{} for instance:

..................((.......)).......{....(....}....)........
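Before handing such a file to RAxML, the GUI could check the two constraints described above: the string has one character per alignment site, and every opening bracket has a matching close. The following is only a minimal sketch of such a check, not code from raxmlGUI or RAxML. The name check_structure is hypothetical, and two choices are assumptions: each bracket type is balanced independently of the others (so that pseudo knots may cross), and surrounding whitespace is stripped before checking.

```python
# Bracket pairs accepted in the structure string, per the manual excerpt above.
PAIRS = {"(": ")", "<": ">", "[": "]", "{": "}"}
CLOSERS = {close: open_ for open_, close in PAIRS.items()}


def check_structure(structure: str, n_sites: int) -> list[str]:
    """Return a list of problems; an empty list means no problem was found.

    Hypothetical helper: RAxML's own parser may be stricter or more lenient.
    """
    problems = []
    structure = structure.strip()  # assumption: ignore surrounding whitespace
    if len(structure) != n_sites:
        problems.append(
            f"length {len(structure)} does not match {n_sites} alignment sites"
        )
    # One stack per bracket type, so pairs of different types may cross.
    stacks = {open_: [] for open_ in PAIRS}
    for pos, ch in enumerate(structure, start=1):
        if ch == ".":
            continue
        if ch in PAIRS:
            stacks[ch].append(pos)
        elif ch in CLOSERS:
            if stacks[CLOSERS[ch]]:
                stacks[CLOSERS[ch]].pop()
            else:
                problems.append(f"unmatched '{ch}' at position {pos}")
        else:
            problems.append(f"unexpected character '{ch}' at position {pos}")
    # Anything left on a stack is an opening bracket that was never closed.
    for open_, stack in stacks.items():
        for pos in stack:
            problems.append(f"unmatched '{open_}' at position {pos}")
    return problems
```

A caller would pass the contents of the structure file together with the number of sites read from the alignment, and show any returned messages to the user instead of starting the run.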

In terms of models there are 6-state, 7-state and 16-state models for accommodating secondary structure that are specified via -A. Available models are: S6A, S6B, S6C, S6D, S6E, S7A, S7B, S7C, S7D, S7E, S7F, S16, S16A, S16B. The default is the GTR 16-state model (-A S16). RAxML uses the same nomenclature as PHASE, so please consult the PHASE manual for a nice and detailed description of these models.

For our small example datasets we would run a secondary structure analysis like this:

raxmlHPC -m GTRGAMMA -p 12345 -S secondaryStructure.txt -s dna.phy -n T26
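For the GUI side, the command above could be assembled from the user's choices roughly as follows. This is a sketch under stated assumptions, not raxmlGUI's actual implementation: build_raxml_args and its parameters are hypothetical names, and only the flags quoted from the manual (-m, -p, -S, -s, -n, -A) and the model names listed there are used.

```python
# Secondary structure models listed in the manual excerpt above.
SECONDARY_STRUCTURE_MODELS = (
    "S6A", "S6B", "S6C", "S6D", "S6E",
    "S7A", "S7B", "S7C", "S7D", "S7E", "S7F",
    "S16", "S16A", "S16B",
)


def build_raxml_args(alignment, structure_file, run_name,
                     model="GTRGAMMA", seed=12345,
                     structure_model=None, binary="raxmlHPC"):
    """Build an argument list for a secondary structure run.

    Hypothetical helper. If structure_model is None, -A is omitted and
    RAxML falls back to its default (S16, according to the manual).
    """
    if (structure_model is not None
            and structure_model not in SECONDARY_STRUCTURE_MODELS):
        raise ValueError(f"unknown secondary structure model: {structure_model}")
    args = [binary, "-m", model, "-p", str(seed),
            "-S", structure_file, "-s", alignment, "-n", run_name]
    if structure_model is not None:
        args += ["-A", structure_model]
    return args


if __name__ == "__main__":
    # Reproduces the example command from the manual.
    print(" ".join(build_raxml_args("dna.phy", "secondaryStructure.txt", "T26")))
```

Returning a list rather than a single string keeps file paths with spaces intact when the list is handed to a process launcher.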

A common question is whether secondary structure models can also be partitioned. This is presently not possible. However, you can partition the underlying RNA data, e.g., use two partitions for our DNA dataset as before. What RAxML will do internally though is to generate a third partition for secondary structure that does not take into account that distinct secondary structure site pairs may stem from different partitions of the alignment.

dsilvestro commented 2 years ago

any update about this?