Does LikelihoodType.BEST make any sense at all, if we use BIC to score networks?

lutteropp commented 3 years ago

In this likelihood model, we only use the loglikelihood of a single displayed tree in the network, scaled down by multiplying it by the probability of that displayed tree. This suggests that LikelihoodType.BEST never makes sense in phylogenetic network inference if we use BIC, because then a tree will obviously win nearly all the time!

(The only setup where one could maybe find situations in which a network with reticulations wins is if we have multiple partitions. I am trying to find a counterexample now.)

As a reminder, this is LikelihoodType.BEST: [image: Screenshot from 2020-12-01 13-27-08]

celinescornavacca commented 3 years ago

The best tree can vary among the partitions of the MSA.

celinescornavacca commented 3 years ago

You choose the best tree for each partition.

celinescornavacca commented 3 years ago

I did not want to close it.

lutteropp commented 3 years ago

Yes, but it gets downscaled by the displayed tree probability. This is the part that worries me.

lutteropp commented 3 years ago

I have the intuition that it would make more sense to not downscale the partition-loglikelihood by the probability of the best displayed tree it chose...

celinescornavacca commented 3 years ago

If it does, then there is no need for a network with respect to this score. I agree that, to get a network, the probs have to be high (e.g. 0.5) and the gain in likelihood has to be high too.

lutteropp commented 3 years ago

Especially, if we do not have number_of_partitions = number_of_displayed_trees. I expect that in a real-world dataset with unknown number of reticulation events, number_of_partitions can be much smaller than the number of displayed trees in what would be the "true" network.

celinescornavacca commented 3 years ago

No, it does not make sense not to consider the inheritance probs in my opinion.

lutteropp commented 3 years ago

Yeah, probably people had a reason to put the inheritance probs into this definition...

But about the partitions issue:

If we have only 1 partition in the MSA, then the BIC will always favor zero reticulations.
If we have fewer partitions in the MSA than 2^(number_of_reticulations_in_the_unknown_true_network), then the BIC will at least underestimate the number of reticulations.
If we have many partitions in the MSA, then it comes down to the amount of likelihood increase and the probability of displayed trees. Although even there I am not convinced yet that we can have a situation where BIC would choose a network that has some reticulations in it. I am currently trying to play with some numbers to fabricate a counterexample somehow (using some assumption like displayed tree loglikelihoods differ by not more than a factor of 2)...
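To make the trade-off in the three cases above concrete, here is a minimal sketch of a BIC comparison. It uses the standard definition BIC = k * ln(n) - 2 * ln(L), where lower is better. The parameter counts, the sample size, and the loglikelihood values are hypothetical and chosen only for illustration; how NetRAX actually counts free parameters and sample size is not stated in this thread.

```python
import math


def bic(loglh, num_free_params, sample_size):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return num_free_params * math.log(sample_size) - 2.0 * loglh


def min_loglh_gain(extra_params, sample_size):
    """Loglikelihood gain needed before extra parameters pay off under BIC."""
    return 0.5 * extra_params * math.log(sample_size)


# Hypothetical numbers, for illustration only (not taken from NetRAX).
n_sites = 1000
tree_bic = bic(loglh=-220.0, num_free_params=10, sample_size=n_sites)
network_bic = bic(loglh=-218.0, num_free_params=12, sample_size=n_sites)

print("tree BIC:   ", tree_bic)
print("network BIC:", network_bic)
print("gain needed for 2 extra params:", min_loglh_gain(2, n_sites))
```

In this made-up setting the network improves the loglikelihood by 2 units but adds 2 parameters, so it loses under BIC. It would only win once its gain exceeds `min_loglh_gain(2, n_sites)`.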

lutteropp commented 3 years ago

We also have to distinguish between the case of unlinked mode (each partition has its own branch lengths and reticulation probs) and linked mode (all partitions share the same branch lengths and reticulation probs).

lutteropp commented 3 years ago

Okay... I tried coming up with a counterexample, but failed. Starting to believe this definition now.

LikelihoodType.BEST, 1 reticulation (i.e., 2 displayed trees), 2 partitions in the MSA:

displayed_tree_1 probability on partition_1: 50%
displayed_tree_2 probability on partition_1: 50%
displayed_tree_1 loglikelihood on partition_1: -100
displayed_tree_2 loglikelihood on partition_1: -150
displayed_tree_1 probability on partition_2: 50%
displayed_tree_2 probability on partition_2: 50%
displayed_tree_1 loglikelihood on partition_2: -120
displayed_tree_2 loglikelihood on partition_2: -80

partition_1 chooses displayed_tree_1 ---> partition_1 loglikelihood: 50% * -100 = -50
partition_2 chooses displayed_tree_2 ---> partition_2 loglikelihood: 50% * -80 = -40
total network loglikelihood: -50 + -40 = -90

total tree loglikelihood if we would have only used displayed_tree_1: -100 + -120 = -220
total tree loglikelihood if we would have only used displayed_tree_2: -150 + -80 = -230
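The arithmetic of this example can be reproduced with a short sketch. The function `scaled_as_in_thread` follows the comment literally (probability times loglikelihood). The function `scaled_in_log_space` is an assumption about the intended definition: if the displayed tree probability multiplies the likelihood rather than the loglikelihood, the log-space form is log(p) + loglikelihood. Both helper names are made up for this illustration and are not NetRAX functions.

```python
import math

# Values copied from the worked example in this comment.
tree_probs = [0.5, 0.5]        # displayed_tree_1, displayed_tree_2
partition_loglhs = [
    [-100.0, -150.0],          # partition_1
    [-120.0, -80.0],           # partition_2
]


def scaled_as_in_thread(prob, loglh):
    """Scaling exactly as written in the comment: prob * loglikelihood."""
    return prob * loglh


def scaled_in_log_space(prob, loglh):
    """Assumed alternative: prob multiplies the likelihood, so add log(prob)."""
    return math.log(prob) + loglh


def best_loglh(tree_probs, partition_loglhs, scale):
    """Per partition, take the best scaled displayed tree, then sum."""
    return sum(
        max(scale(p, l) for p, l in zip(tree_probs, loglhs))
        for loglhs in partition_loglhs
    )


def single_tree_loglh(partition_loglhs, tree_index):
    """Plain tree loglikelihood: sum one tree's value over all partitions."""
    return sum(loglhs[tree_index] for loglhs in partition_loglhs)


print(best_loglh(tree_probs, partition_loglhs, scaled_as_in_thread))
print(best_loglh(tree_probs, partition_loglhs, scaled_in_log_space))
print(single_tree_loglh(partition_loglhs, 0))
print(single_tree_loglh(partition_loglhs, 1))
```

With the scaling from the comment the network total is -90. Under the log-space assumption it is -180 + 2 * log(0.5), which is still above both single-tree totals (-220 and -230) for these numbers.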

lutteropp commented 3 years ago

In case of all partitions of the network choosing the same displayed tree, the loglikelihood value we get is not comparable to the loglikelihood value we would get with standard phylogenetic tree loglikelihood though:

network loglikelihood with all partitions taking displayed tree 1: 50% * -100 + 50% * -120 = -50 + -60 = -110
standard phylogenetic tree likelihood on displayed tree 1: -100 + -120 = -220
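The mismatch can be checked with a small sketch that fixes the displayed tree for every partition and applies the scaling exactly as written above (probability times loglikelihood). The function name is hypothetical. The sketch also shows that the two values coincide only when the displayed tree probability is 1.

```python
def fixed_tree_network_loglh(tree_prob, per_partition_loglhs):
    """Network loglikelihood when every partition uses the same displayed tree,
    with the scaling used in this comment: tree_prob * loglikelihood."""
    return sum(tree_prob * loglh for loglh in per_partition_loglhs)


# Loglikelihoods of displayed tree 1 on partition_1 and partition_2.
tree_1_loglhs = [-100.0, -120.0]

standard_tree_loglh = sum(tree_1_loglhs)
network_loglh = fixed_tree_network_loglh(0.5, tree_1_loglhs)

print("standard tree loglikelihood:", standard_tree_loglh)
print("network loglikelihood (p = 0.5):", network_loglh)
print("network loglikelihood (p = 1.0):", fixed_tree_network_loglh(1.0, tree_1_loglhs))
```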

stamatak commented 3 years ago

I guess this makes sense as we do have an additional parameter which is the reticulation probability, so with one additional free parameter the likelihood is indeed expected to be better ...

-- Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org

lutteropp / NetRAX

Does LikelihoodType.BEST make any sense at all, if we use BIC to score networks? #20