Pas-Kapli / mptp

mPTP - a tool for single-locus species delimitation
GNU Affero General Public License v3.0

Add support values to nodes after Bayesian runs #32

Closed xflouris closed 9 years ago

xflouris commented 9 years ago

Add support values to nodes after a Bayesian run, indicating how many times each node was marked as speciation.

xflouris commented 9 years ago

Here is a description of a constant-time algorithm for computing the support values during the Bayesian runs. Assume the Bayesian run consists of N MCMC steps and we use a tree T = (V,E), where V is the set of nodes and E the set of edges. The following method updates the support values at each MCMC step in constant time, instead of the naive method that updates all nodes at every step, i.e. O(|V|) per step.
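For comparison, the naive method mentioned above could look roughly like the following Python sketch. The function name `naive_support_counts` and the representation of each sampled state as a set of node indices are assumptions made for this illustration; they are not taken from the mptp code.

```python
from typing import Iterable, List, Set


def naive_support_counts(num_nodes: int, states: Iterable[Set[int]]) -> List[int]:
    """Count in how many sampled states each node was marked as speciation.

    `states` yields one set per MCMC step, holding the indices of the nodes
    currently marked as speciation (hypothetical representation).
    """
    count = [0] * num_nodes
    for speciation_nodes in states:
        # O(|V|) work per step: every node is inspected, changed or not.
        for u in range(num_nodes):
            if u in speciation_nodes:
                count[u] += 1
    return count


# Three sampled states over a tree with three nodes.
counts = naive_support_counts(3, [{0}, {0, 1}, {1}])
```

Every step touches all |V| nodes even when only one node changed state, which is the cost the method below avoids.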

It requires two arrays:

- Array history of size |V|.
-1 -1 0 -1 0 -1 ... -1
- Array count of size |V| as well.
0 0 0 0 0 0 ... 0

Each element of the arrays represents one particular node in the tree.

Initialization

  1. Generate an initial delimitation.
  2. Set the elements of array history to -1 for those nodes that are part of the coalescent, and to 0 for the ones that are part of speciation.
  3. Set all elements of array count to 0.

MCMC step i

During step i of the MCMC, a node u can change in one of the following two ways:

From speciation to coalescent:
  count[u] = count[u] + i - history[u]
  history[u] = -1
From coalescent to speciation:
  history[u] = i

End of MCMC

for each u in V
  if history[u] != -1 then
    count[u] = count[u] + N - history[u]
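The initialization, the per-step updates and the final pass can be put together as in the minimal Python sketch below. The class `SupportTracker` and its method names are hypothetical, chosen only for this illustration, and the sketch assumes that steps are numbered 1 to N and that each step reports the single node whose state changed.

```python
from typing import List


class SupportTracker:
    """Tracks how many MCMC steps each node spent marked as speciation."""

    def __init__(self, is_speciation: List[bool]) -> None:
        # history[u] is the step at which u last became speciation,
        # or -1 while u is part of the coalescent.
        self.history = [0 if s else -1 for s in is_speciation]
        self.count = [0] * len(is_speciation)

    def to_coalescent(self, u: int, step: int) -> None:
        """Node u switches from speciation to coalescent at `step` (O(1))."""
        if self.history[u] == -1:
            raise ValueError("node is already part of the coalescent")
        self.count[u] += step - self.history[u]
        self.history[u] = -1

    def to_speciation(self, u: int, step: int) -> None:
        """Node u switches from coalescent to speciation at `step` (O(1))."""
        if self.history[u] != -1:
            raise ValueError("node is already part of speciation")
        self.history[u] = step

    def finish(self, n_steps: int) -> List[int]:
        """Final O(|V|) pass; call exactly once after the last step."""
        for u, h in enumerate(self.history):
            if h != -1:
                self.count[u] += n_steps - h
        return self.count


# Worked example with N = 10 steps and three nodes; only node 0 starts
# as speciation.
N = 10
tracker = SupportTracker([True, False, False])
tracker.to_coalescent(0, 3)   # count[0] = 0 + 3 - 0 = 3
tracker.to_speciation(1, 4)   # history[1] = 4
tracker.to_speciation(0, 7)   # history[0] = 7
count = tracker.finish(N)     # count[0] = 3 + 10 - 7, count[1] = 10 - 4
support = [c / N for c in count]
```

In this trace nodes 0 and 1 each end with a count of 6 and node 2 with 0, so dividing by N gives supports of 0.6, 0.6 and 0.0.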

Result

Array count contains, for each node, the number of MCMC steps during which that node was part of the speciation process. Hence, for a node u, the ratio count[u] / N is the support value: an estimate of the probability that u is part of the speciation process.

xflouris commented 9 years ago

finished.