Amalgamated likelihood estimation (ALE) is a probabilistic approach to exhaustively explore all reconciled gene trees that can be amalgamated as a combination of clades observed in a sample of gene trees. We implement the ALE approach in the context of a reconciliation model (cf. http://arxiv.org/abs/1211.4606 ), which allows for the duplication, transfer and loss of genes. We use ALE to efficiently approximate the sum of the joint likelihood over amalgamations and to find the reconciled gene tree that maximizes the joint likelihood among all such trees.
How to parse ALEobserve output to retrieve CCP values #36
I would like to measure Conditional Clade Probabilities (CCPs) on a sample of trees. I thought that this is exactly what ALEobserve does, but I need some explanation of the .ale output format.
Would you mind confirming the following suppositions on the format?
To obtain the CCP of one clade x given its parent clade y, it seems that the right line will be in the #Dip_counts section; it has 4 columns, which I assume to be {parent clade}, {subclade1}, {subclade2}, {number of trees}.
Then the CCP is equal to the selected Dip_count divided by the Bip_count of the parent clade?
To convert a clade id into a set of leaves, I need to look into the #leaf-id and #set-id sections at the end, but I can't figure out the coding there: how do we turn these numbers into leaf sets?
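To make the second supposition concrete, here is a minimal sketch of the calculation I have in mind. It assumes the counts have already been read into plain dictionaries; the function name, the dictionary layout, and the toy numbers are all hypothetical and are not taken from ALE or from a real .ale file.

```python
from typing import Dict, Tuple


def conditional_clade_probability(
    parent: int,
    sub1: int,
    sub2: int,
    bip_counts: Dict[int, float],
    dip_counts: Dict[Tuple[int, int, int], float],
) -> float:
    """Estimate P(sub1, sub2 | parent) as Dip_count / Bip_count.

    Assumed (hypothetical) layout:
      bip_counts[clade_id]                -> number of trees containing the clade
      dip_counts[(parent, small, large)]  -> number of trees where `parent`
                                             splits into the two subclades,
                                             stored with the smaller id first
    """
    # Normalise the subclade order so (3, 4) and (4, 3) hit the same key.
    key = (parent, min(sub1, sub2), max(sub1, sub2))
    parent_count = bip_counts.get(parent, 0.0)
    if parent_count == 0:
        raise ValueError(f"parent clade {parent} was never observed")
    return dip_counts.get(key, 0.0) / parent_count


# Toy counts, invented for illustration only.
bip_counts = {7: 100.0}
dip_counts = {(7, 3, 4): 60.0, (7, 1, 6): 40.0}

ccp_a = conditional_clade_probability(7, 4, 3, bip_counts, dip_counts)
ccp_b = conditional_clade_probability(7, 1, 6, bip_counts, dip_counts)
ccp_unseen = conditional_clade_probability(7, 2, 5, bip_counts, dip_counts)
```

If the format works as I suppose, the CCPs of all observed splits of a given parent clade should sum to 1, and a split that never appears in the sample gets probability 0.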
I have a subsidiary question regarding the definition of CCP: the paper of Höhna and Drummond (2012) is cited, but I have also read Larget (2013), which states that his definition is slightly different:
I note that this equation is similar to the conditional clade probability (CCP) formulas given in Höhna and Drummond (2012). The key difference is that in their formulas, the unnormalized probabilities for each tree are the product over all clades in the tree of the conditional clade probabilities of the form P(clade|parent of clade). In the approach in this article, the probability of each tree is calculated as the product over all parent clades in the tree of the conditional clade probabilities of the form P(all children clades|parent clade).
It seems to me that the latter definition is the one implemented in ALE, right?
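To spell out how I read the two definitions in the quote: the notation below is my own and is not taken from either paper. Here $n(C)$ stands for the number of sampled trees containing clade $C$, and $n(C \to C_1, C_2)$ for the number of sampled trees in which $C$ splits into $C_1$ and $C_2$.

$$
P_{\text{Larget}}(T) \;=\; \prod_{C \,\in\, \text{parent clades of } T} P(C_1, C_2 \mid C),
\qquad
P(C_1, C_2 \mid C) \;\approx\; \frac{n(C \to C_1, C_2)}{n(C)}
$$

$$
\tilde{P}_{\text{H\&D}}(T) \;=\; \prod_{c \,\in\, \text{clades of } T} P\bigl(c \mid \operatorname{parent}(c)\bigr)
\quad \text{(unnormalized)}
$$

Under this reading, the first estimate is exactly the Dip_count divided by the Bip_count of the parent clade, which is why I suspect it is the one used in ALE.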
Thanks in advance!