the net1 show the wrong topology

Niloofar-Alaei commented 3 years ago

Hi

I am not very familiar with SNaQ, and I ran it for the first time on my data with 6 taxa (5 ingroup taxa and one outgroup).

I have one question about comparing the trees. When I compare the ASTRAL tree and net0, there are no differences between them, as I expected. What I cannot understand is that net1 shows a different topology, which is not the true one. How can I fix it?

Would you please let me know when and why the network's topology changed?

Thanks a lot Niloo

cecileane commented 3 years ago

Remember that the network is unrooted, so you should re-root your net1 using your outgroup. Would that fix the issue perhaps?
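A minimal sketch of the re-rooting step, assuming the older PhyloNetworks API used elsewhere in this thread (`readTopology`, `rootatnode!`, `writeTopology`). The Newick string below is only illustrative: it reuses one of the candidate topologies listed later in this thread, written with an arbitrary root, and the γ values are made up. It is not the net1 estimated from the data.

```julia
using PhyloNetworks

# Illustrative network only (hypothetical γ values 0.42 / 0.58),
# written with a trifurcation at an arbitrary root, as an unrooted network would be.
net = readTopology("(D,H,((C,(M,#H1:::0.42)),((P,B))#H1:::0.58));")

# Re-root using the outgroup D: for a leaf, the root is placed
# along the edge adjacent to that leaf.
rootatnode!(net, "D")

# Print the re-rooted network in extended Newick format.
println(writeTopology(net))
```

The same call on both net0 and net1 puts them on a common rooting before they are compared or plotted.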

Another thing to remember is that the plotting function draws the "major" edge (with γ>0.5) as a tree edge, and draws the minor edge (with γ<0.5, in lighter blue) as an "extra" or "gene flow" edge. If you imagined deleting the "major" edge and following the minor (light blue) edge only, would you recover relationships that you think are true? In that case, your net1 would indeed recover the true relationships, but with estimated inheritance proportions less than 50%.
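The "delete the major edge, follow the minor edge" exercise can also be done in code rather than by eye. This is a sketch under assumptions: it uses `majorTree` and `displayedTrees` from the older PhyloNetworks API, and the network is an illustrative candidate topology with made-up γ values, not the net1 estimated from the data.

```julia
using PhyloNetworks

# Illustrative network only: γ = 0.58 on the major hybrid edge, 0.42 on the minor one.
net = readTopology("(D,(H,((C,(M,#H1:::0.42)),((P,B))#H1:::0.58)));")

# Tree obtained by keeping the major hybrid edge and deleting the minor one.
major = majorTree(net)
println("major tree: ", writeTopology(major))

# All trees displayed in the network. A threshold of 0.0 means that
# no hybrid edge is discarded before the trees are enumerated.
trees = displayedTrees(net, 0.0)
for t in trees
    println("displayed tree: ", writeTopology(t))
end
```

If the expected species tree appears among the displayed trees, the network contains it, even when it is not the major tree.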

Niloofar-Alaei commented 3 years ago

Yes, I rooted both net0 and net1 and then compared them, but it doesn't help: there are still differences between them.

It's still not completely clear to me. Can I send you the trees so that we can discuss them briefly?

cecileane commented 3 years ago

yes! "offline" if you prefer.

Niloofar-Alaei commented 3 years ago

Thanks a lot

Net0 is in agreement with ASTRAL. This topology is supported by several different data sets. We also know that introgression happened between M and P, or between M and P+B. Based on net1, clade B+P is 57% sister to H and 42% sister to M. Am I right? If so, how can we explain it? Also, look at the number of the node (H,(B,P)): it is 9. Why is it not negative, and why not 8? [Images: NET0_ASTRAL, NET1]

Niloofar-Alaei commented 3 years ago

Thanks a lot for your help

cecileane commented 3 years ago

I see. Are the pseudo-likelihood scores that much different between net1 and a network that would have your tree displayed within it? You could try a few candidate networks that you expect are "correct", calculate their scores (see here), and then see whether they improve on the score of the tree about as much as net1 does.

I can think of these candidate networks, below. The first one has gene flow from M to (P,B) like in net1. The second has gene flow from (P,B) to M, the third from P to M, and the last one from M to P.

net1 = readTopology("(D,(H,((C,(M,#H1)),((P,B))#H1)));") # M -> (P,B)
net2 = readTopology("(D,(H,((C,(M)#H1),(#H1,(P,B)))));") # (P,B) -> M
net3 = readTopology("(D,(H,((C,(M)#H1),((#H1,P),B))));") # P -> M
net4 = readTopology("(D,(H,((C,(M,#H1)),((P)#H1,B))));") # M -> P

It's hard to judge the difference between pseudo-likelihood scores, because it's a pseudo-likelihood, not a likelihood: the theory for AIC, BIC etc. does not hold. But a score of 0 means perfect fit in SNaQ. The idea of the "slope heuristic" is to see how much of a score decrease we get as we add reticulations, so you can use the score of your tree as a baseline.
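A sketch of how the candidate networks could be scored, assuming the older PhyloNetworks API (`readTableCF`, `topologyMaxQPseudolik!`, and the `loglik` field that holds the pseudo-deviance score). The file name `tableCF.csv` is hypothetical: replace it with the concordance factor table that was used for the SNaQ run. The Newick strings are the four candidates listed above.

```julia
using PhyloNetworks

# Candidate topologies copied from this thread, labeled by direction of gene flow.
candidates = [
    "M -> (P,B)" => "(D,(H,((C,(M,#H1)),((P,B))#H1)));",
    "(P,B) -> M" => "(D,(H,((C,(M)#H1),(#H1,(P,B)))));",
    "P -> M"     => "(D,(H,((C,(M)#H1),((#H1,P),B))));",
    "M -> P"     => "(D,(H,((C,(M,#H1)),((P)#H1,B))));",
]

# Hypothetical path to the table of quartet concordance factors.
cf_file = "tableCF.csv"

if isfile(cf_file)
    cf = readTableCF(cf_file)
    for (label, newick) in candidates
        net = readTopology(newick)
        # Optimize branch lengths and γ on the fixed topology;
        # the score is stored in net.loglik (lower is better, 0 is a perfect fit).
        topologyMaxQPseudolik!(net, cf)
        println(label, ": ", net.loglik)
    end
else
    println("CF table not found: ", cf_file)
end
```

Scoring the tree (net0) the same way gives the baseline against which these scores can be compared.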

cecileane commented 3 years ago

About the node numbers: they are completely arbitrary. The function that parses networks in parenthetical format assigns positive numbers to tips and hybrid nodes (because hybrid nodes are technically represented by 2 tips in parenthetical format) and assigns negative numbers to other internal nodes. These numbers can change during SNaQ, when edges are disconnected and reconnected, hybrid nodes are added/deleted, etc. Node numbers and edge numbers are really arbitrary.

cecileane commented 3 years ago

Also, on the more theoretical side: several studies showed that reticulation can cause tree methods (which ignore reticulation) to infer the wrong "major" tree. For example: Solís-Lemus et al. 2016 (https://doi.org/10.1093/sysbio/syw030) or Long & Kubatko 2018 (https://doi.org/10.1093/sysbio/syy020). The problem affects concatenation as well as coalescent methods like ASTRAL.

Niloofar-Alaei commented 3 years ago

Thanks a lot for the response and explanation.

The -Ploglik of net1 was 3.3293262823921626, and for net0, which shows the correct topology, it was 21.886906875868718.

I will now try the candidate networks and check their pseudolikelihoods.
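For reference, the score decrease between these two reported values can be computed directly. This is plain arithmetic on the numbers quoted in this comment, in the spirit of the slope heuristic mentioned earlier; the variable names are only illustrative, and the result is not a formal test.

```julia
# Scores reported in this thread (lower is better, 0 is a perfect fit).
score_net0 = 21.886906875868718   # h = 0, the tree
score_net1 = 3.3293262823921626   # h = 1, one reticulation

# Absolute and relative decrease when going from 0 to 1 reticulation.
drop = score_net0 - score_net1
relative_drop = drop / score_net0

println("score decrease: ", drop)
println("relative decrease: ", round(100 * relative_drop, digits = 1), "%")
```

The same subtraction applied to each candidate network's score shows how much of this decrease a network that displays the expected tree can recover.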

Thanks for your help

cecileane commented 3 years ago

closed this issue, because it is about network inference & interpretation (not PhyloPlots per se).

JuliaPhylo / PhyloPlots.jl

the net1 show the wrong topology #15