revbayes / revbayes.archive

GNU General Public License v3.0
68 stars 37 forks source link

generic, time-irreversible rate matrices (?FreeBinary, ?FreeK) #84

Closed hoehna closed 9 years ago

hoehna commented 9 years ago

It would be nice to extend our rate matrices to irreversible ones. We actually do have some examples, FreeBinary and FreeK but those seem to have problems. FreeK isn't supported much and FreeBinary seems to have a bug in some analyses.

The script bellow gives a segfault:

Testing correlated evolution between antennal coiling and tyloids

in diplazontine parasitoid wasps

DNA partitions

data[1] <- readDiscreteCharacterData("28S.nex") data[2] <- readDiscreteCharacterData("CO1.nex") data[2].setCodonPartition(v(1,2)) data[3] <- readDiscreteCharacterData("CO1.nex") data[3].setCodonPartition(3)

n_parts = data.size() tot_nSites = 0 for (i in 1:n_parts) { n_sites[i] = data[i].nchar()[1] tot_nSites = tot_nSites + n_sites[i] }

partitioned GTR model for the molecular partitions

source("SubstitutionModel_Partitioned-GTR-G-I.Rev")

morphology / behaviour

coiling <- readDiscreteCharacterData("Coiling.nex") tyloids <- readDiscreteCharacterData("Tyloids.nex")

var 1: FreeBinary

pi_coiling ~ dnDirichlet(v(1,1)) moves[mi++] = mvSimplexElementScale(pi_coiling, alpha=10.0, tune=true, weight=2.0) Q_coiling := fnFreeBinary(pi_coiling)

pi_tyloids ~ dnDirichlet(v(1,1)) moves[mi++] = mvSimplexElementScale(pi_tyloids, alpha=10.0, tune=true, weight=2.0) Q_tyloids := fnFreeBinary(pi_tyloids)

var 2: F81

pi_coiling ~ dnDirichlet(v(1,1))

moves[mi++] = mvSimplexElementScale(pi_coiling, alpha=10.0, tune=true, weight=2.0)

Q_coiling := fnF81(pi_coiling)

pi_tyloids ~ dnDirichlet(v(1,1))

moves[mi++] = mvSimplexElementScale(pi_tyloids, alpha=10.0, tune=true, weight=2.0)

Q_tyloids := fnF81(pi_tyloids)

specify a rate multiplier for each partition, using a Dirichlet

include the morphology partitions!

PartMult ~ dnDirichlet( rep(1,n_parts + 2) ) moves[mi++] = mvSimplexElementScale(PartMult, alpha=1.0, tune=true, weight=2.0)

for (i in 1:n_parts) { part_rate_mult[i] := PartMult[i] * (tot_nSites + 2) / n_sites[i] }

part_rate_mult[n_parts +1] := PartMult[n_parts +1] * (tot_nSites + 2) part_rate_mult[n_parts +2] := PartMult[n_parts +2] * (tot_nSites + 2)

Tree model: uniform topology prior, exponential distribution on branch lengths

get some useful numbers

n_species <- data[1].ntaxa() tip_names <- data[1].names()

Specify a uniform topology prior

topology ~ dnUniformTopology(n_species, tip_names)

moves on the tree

moves[mi++] = mvNNI(topology, weight=10.0) moves[mi++] = mvSPR(topology, weight=10.0)

Specify a prior and moves on the branch lengths

create a random variable for each branch length using a 'for' loop

n_branches = n_species * 2 - 3

for (i in 1:n_branches) {

We use here the exponential distribution with rate 10.0 as the branch length prior

br_lens[i] ~ dnExponential(10.0)

Add a move for the branch length. We just take a simple scaling move since the value is a positive real number.

moves[mi++] = mvScale(br_lens[i], lambda=1, tune=true, weight=1.0) }

create a deterministic variable for monitoring purposes

TreeLength := sum(br_lens)

Build the tree by combining the topology with the branch length.

phylogeny := treeAssembly(topology, br_lens)

PhyloCTMC Models for each partition

phylo seqs for DNA

for (i in 1:n_parts){ phyloSeq[i] ~ dnPhyloCTMC(tree=phylogeny, Q=Q[i], branchRates=part_rate_mult[i], siteRates=norm_gamma_rates[i], pInv=pinvar[i], type="DNA") phyloSeq[i].clamp(data[i]) }

phylo seqs for morphology/behaviour

phyloSeq[n_parts+1] ~ dnPhyloCTMC(tree=phylogeny, Q=Q_tyloids, branchRates=part_rate_mult[n_parts+1], type="standard") phyloSeq[n_parts+1].clamp(data_tyloids)

phyloSeq[n_parts+2] ~ dnPhyloCTMC(tree=phylogeny, Q=Q_coiling, branchRates=part_rate_mult[n_parts+2], type="standard") phyloSeq[n_parts+2].clamp(data_coiling)

THE Model

mymodel = model(topology)

Customizing the Output

parFileName = "coiling_independent.log" treeFileName = "coiling_independent.trees"

monitor[1] = mnModel(filename=parFileName,printgen=1000, separator = " ") monitor[2] = mnFile(filename=treeFileName,printgen=1000, separator = " ", phylogeny) monitor[3] = mnScreen(printgen=1000)

MCMC settings and run

mymcmc = mcmc(mymodel, monitor, moves, moveschedule="single") mymcmc.run(generations=100000)

some summarizing

pt = readTrace(parFileName, delimiter=" ") pt

tt = readTreeTrace(treeFileName, "non-clock", " ") consensusTree(tt, "Diplazis_consensus.tre", 0.5)

mlandis commented 9 years ago

This code seems to work:

tree ~ dnBDP(lambda=1.,names=["A","B","C"],nTaxa=3,rootAge=1.)
pi ~ dnDirichlet(v(1,1))
Q := fnFreeBinary(pi)
M ~ dnPhyloCTMC(tree=tree,Q=Q)

mdl = model(M)

mv[1] = mvSimplexElementScale(pi) 
mn[1] = mnScreen(pi)

ch = mcmc(mdl,mv,mn)
ch.run(100)

Also, using code blocks will ensure RevBayes comments aren't interpreted as Markdown headers.

hoehna commented 9 years ago

Hi Michael:

if that issue is solved, then feel free to close it. I completely trust you with this.

Sebastian

On Thu, Dec 18, 2014 at 10:41 AM, Michael Landis notifications@github.com wrote:

This code seems to work:

tree ~ dnBDP(lambda=1.,names=["A","B","C"],nTaxa=3,rootAge=1.) pi ~ dnDirichlet(v(1,1)) Q := fnFreeBinary(pi) M ~ dnPhyloCTMC(tree=tree,Q=Q)

mdl = model(M)

mv[1] = mvSimplexElementScale(pi) mn[1] = mnScreen(pi)

ch = mcmc(mdl,mv,mn) ch.run(100)

Also, using code blocks will ensure RevBayes comments aren't interpreted as Markdown headers.

— Reply to this email directly or view it on GitHub https://github.com/revbayes/revbayes/issues/84#issuecomment-67533379.