bpp / bpp

Bayesian analysis of genomic sequence data under the multispecies coalescent model
GNU Affero General Public License v3.0

Implement topological constraints in BPP #124

Open xflouris opened 4 years ago

xflouris commented 4 years ago

Implement a new option in the control file:

constfile = filename

for defining topological constraints for species tree inference. Constraints are defined using three keywords: define, constraint, and outgroup. Some examples follow.

Suppose we have species A,B,C,D,E,F,O.

define coleoptera as (A,B,C);
define hymenoptera as (D,E,F);

# no constraint
constraint (A,B,C,D,E,F,O);

# constrains A, B and C to always be in the same clade, and likewise D, E and F.
constraint (((A,B,C),(D,E,F)),O);

# equivalent to the line above
constraint ((coleoptera,hymenoptera),O);

# Constraint for a fixed clade ((A,B),C) to always be sister to an unresolved clade (D,E,F) with outgroup O. 
constraint ((((A,B),C),(D,E,F)),O);

# sets O as the outgroup
outgroup O;

# multiple constraints can also be specified, provided they do not conflict with previous ones, e.g.:
constraint (A,B);     # constrain A and B to always be sister taxa
constraint (C,D);     # constrain C and D to always be sister taxa
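
To illustrate how the define aliases relate to the constraint lines above, here is a minimal Python sketch that expands aliases by plain text substitution. This is not BPP's parser: the function name expand_aliases, the handling of # comments, and the assumption that every statement sits on one line are all hypothetical choices made for illustration.

```python
import re

def expand_aliases(lines):
    """Expand `define NAME as (...)` aliases inside later statements.

    Hypothetical helper, not part of BPP. Assumes one statement per line
    and that `#` starts a comment running to the end of the line.
    """
    aliases = {}
    expanded = []
    for raw in lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        stmt = line.rstrip(";").strip()
        m = re.fullmatch(r"define\s+(\w+)\s+as\s+(\(.*\))", stmt)
        if m:
            # Remember the alias; it produces no output statement itself.
            aliases[m.group(1)] = m.group(2)
            continue
        # Replace whole-word alias names with their clade definitions.
        for name, clade in aliases.items():
            stmt = re.sub(rf"\b{re.escape(name)}\b", lambda _m, c=clade: c, stmt)
        expanded.append(stmt + ";")
    return expanded

example = [
    "define coleoptera as (A,B,C);",
    "define hymenoptera as (D,E,F);",
    "constraint ((coleoptera,hymenoptera),O);",
]
print(expand_aliases(example))
```

Under these assumptions, the aliased constraint expands to the same text as the fully written-out constraint (((A,B,C),(D,E,F)),O); from the examples.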
xflouris commented 4 years ago

Fixed the rules for outgroups. I previously stated that outgroup O; is equivalent to constraint (O,(A,B,C,D,E,F)); but that is not correct. outgroup O; indicates that O must always be a child of the root node, but it does not impose constraints on any other nodes of the tree. The user may then add further constraints on the (A,B,C,D,E,F) clade.
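
As a rough illustration of the rule as stated here (O must be a child of the root, with nothing else constrained), the following Python sketch checks a candidate tree. The nested-tuple tree representation and the function name outgroup_is_root_child are assumptions made for this example and do not reflect BPP's internal data structures.

```python
def outgroup_is_root_child(tree, outgroup):
    """Return True if the taxon `outgroup` is a direct child of the root.

    Hypothetical helper: `tree` is a nested tuple whose leaves are taxon
    names (strings), e.g. ("O", (("A", "B"), "C")).
    """
    # Only the root's immediate children are inspected; the arrangement of
    # the remaining taxa is deliberately left unchecked.
    return any(child == outgroup for child in tree)

# Two different resolutions of the ingroup; both keep O at the root.
t1 = ("O", ((("A", "B"), "C"), ("D", ("E", "F"))))
t2 = ((("A", ("B", "C")), (("D", "E"), "F")), "O")
# Here O is nested one level below the root, so the rule is violated.
t3 = (("O", "A"), (("B", "C"), ("D", ("E", "F"))))

print(outgroup_is_root_child(t1, "O"))
print(outgroup_is_root_child(t2, "O"))
print(outgroup_is_root_child(t3, "O"))
```

The check accepts any topology for the remaining taxa, which is where additional constraint lines would come in.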

xflouris commented 4 years ago

Changed constraints syntax for outgroup. The new syntax is

outgroup = comma-separated list of taxa ;

Suppose a starting tree ((A,B),(C,(D,(E,F))));

To set A, B and C as the outgroup, specify:

outgroup = A,B,C;
xflouris commented 4 years ago

Changed the outgroup definition such that outgroup A,B,C,D; now means the same as constraint cpl(A,B,C,D); where cpl denotes the complementary taxa, i.e. all taxa not in the list.
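
One way to read this definition: the taxa not listed in the outgroup (the complement) must form a clade. The Python sketch below checks that reading against the starting tree ((A,B),(C,(D,(E,F)))) mentioned earlier in this thread. The nested-tuple representation and the names leaves, clades and satisfies_outgroup are hypothetical, and treating cpl(...) as "the complement forms a clade" is an assumption based on how constraint is described above, not a statement of how BPP implements it.

```python
def leaves(tree):
    """Return the set of taxon names below `tree` (nested tuples of strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree):
    """Yield the leaf set of every internal node of `tree`."""
    if isinstance(tree, str):
        return
    yield leaves(tree)
    for child in tree:
        yield from clades(child)

def satisfies_outgroup(tree, outgroup):
    """Assumed reading of cpl(): the taxa NOT in `outgroup` form a clade."""
    ingroup = leaves(tree) - frozenset(outgroup)
    if len(ingroup) <= 1:
        # Zero or one taxon is trivially a clade.
        return True
    return ingroup in set(clades(tree))

# Starting tree ((A,B),(C,(D,(E,F)))) as nested tuples.
start = (("A", "B"), ("C", ("D", ("E", "F"))))

print(satisfies_outgroup(start, ["A", "B", "C"]))  # complement is {D,E,F}
print(satisfies_outgroup(start, ["A", "C"]))       # complement is {B,D,E,F}
```

With this reading, outgroup = A,B,C; is compatible with the starting tree because D, E and F already form a clade, whereas an outgroup of A and C would not be, since B, D, E and F are not grouped together.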