lutteropp / NetRAX

Phylogenetic Network Inference without ILS
GNU General Public License v3.0

I don't understand those Dendroscope network distances #13

Open lutteropp opened 3 years ago

lutteropp commented 3 years ago

Why do trees with (unrooted) RF-distance of zero have non-zero Dendroscope distances? Does it have to do with the rooting somehow?

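See the screenshot from a results.csv file, obtained (in different settings) on a 4-taxon simulated tree: Screenshot from 2020-11-29 22-36-35 https://user-images.githubusercontent.com/1059869/100554207-6a4dd880-3293-11eb-8cfa-83963f1bdf78.png

I also attached the entire CSV file: small_tree_results.csv.txt https://github.com/lutteropp/NetRAX/files/5612883/small_tree_results.csv.txt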

In general, I still need to figure out how to interpret those topological distances computed by Dendroscope. Like... which number is good? What's the theoretical maximum distance? Can we convert them into relative values somehow?

stamatak commented 3 years ago

I guess Celine is the expert for this.

-- Alexandros (Alexis) Stamatakis

Research Group Leader, Heidelberg Institute for Theoretical Studies Full Professor, Dept. of Informatics, Karlsruhe Institute of Technology

www.exelixis-lab.org

celinescornavacca commented 3 years ago

why do trees with (unrooted) RF-distance of zero have non-zero Dendroscope distances?

Because Dendroscope works with rooted trees, everything is defined on rooted trees (rooted RF distances are defined on clusters instead of bipartitions).
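
To make the rooted-versus-unrooted point concrete, here is a minimal sketch in Python. It is not Dendroscope's code; the nested-tuple tree encoding and the helper names `clusters` and `bipartitions` are assumptions made only for this illustration. It compares two rootings of the same unrooted 4-taxon tree.

```python
def clusters(tree):
    """Clusters of a rooted tree: the set of leaves below each node.

    The tree is a nested tuple of leaf names (an encoding assumed here).
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def bipartitions(tree):
    """Bipartitions (splits) of the unrooted version of the same tree."""
    all_clusters = clusters(tree)
    taxa = max(all_clusters, key=len)  # the root cluster holds every taxon
    splits = set()
    for cluster in all_clusters:
        rest = taxa - cluster
        if rest:  # the root cluster induces no split
            splits.add(frozenset([cluster, rest]))
    return splits


# Two different rootings of the unrooted tree AB|CD.
t1 = (("A", "B"), ("C", "D"))
t2 = ("A", ("B", ("C", "D")))

unrooted_diff = len(bipartitions(t1) ^ bipartitions(t2))
rooted_diff = len(clusters(t1) ^ clusters(t2))
print(unrooted_diff, rooted_diff)
```

The two trees share every bipartition, so the unrooted symmetric difference is empty, but the clusters {A, B} and {B, C, D} each occur in only one of them, so the cluster-based symmetric difference is not.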

Like... which number is good?

The distances describe different things. I attach a chapter from the book I wrote with Daniel and Regula describing them. The hardwired distance is less interesting for us and the softwired one is easier to interpret, but please read the chapter and share your opinion too.

comparing.pdf

What's the theoretical maximum distance?

I do not think a result exists for general networks; maybe there is something for some restricted topological classes of networks. But it will be of no use here.

Can we convert them into relative values somehow?

No, I do not think so, see above.

lutteropp commented 3 years ago

Thanks! Is it the book "Phylogenetic Networks: Concepts, Algorithms and Applications"?

I only got to read the PDF now, and I don't know what a hardwired vs. a softwired cluster is. Or a cluster... Hoping to find the definitions in another chapter of the book.

lutteropp commented 3 years ago

I found the book online, trying to speed-read relevant-looking parts of it.

lutteropp commented 3 years ago

[Copy from Slack message, to have this here as well]

I have just figured out that we can easily plot relative versions (in the range [0.0, 1.0]) of all topological network distances. Looking at the definitions in @celines network book, they are all of the form |symmetric difference of A and B| divided by 2. We just need to change them to |symmetric difference of A and B| divided by |A union B|, and we get relative distances. These will make nicer plots.
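
The change of denominator can be sketched in a few lines of Python. This is only an illustration, not the NetRAX implementation: the function names are hypothetical, and A and B stand for whatever sets the respective distance compares (clusters, for instance), represented here as sets of frozensets.

```python
def absolute_distance(a, b):
    """Size of the symmetric difference of a and b, divided by 2."""
    return len(a ^ b) / 2


def relative_distance(a, b):
    """Size of the symmetric difference divided by the size of the union.

    The result lies in [0.0, 1.0]. Two empty sets are treated as identical
    (distance 0.0), which is a convention assumed for this sketch.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


# Toy cluster sets on the taxa A, B, C, D (made up for illustration).
A = {frozenset("AB"), frozenset("CD"), frozenset("ABCD")}
B = {frozenset("CD"), frozenset("BCD"), frozenset("ABCD")}

print(absolute_distance(A, B), relative_distance(A, B))
```

Here two of the four distinct clusters occur in only one of the sets, so the absolute value is 1.0 and the relative value is 0.5.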

lutteropp commented 3 years ago

@celinescornavacca Does this approach make sense? I am assuming that we can have two networks (on the same set of taxa) which have zero clusters in common.

lutteropp commented 3 years ago

Of course, the trivial clusters will always be in common. Which means we will never get a distance score of 1.0 by applying this trick. Is this a problem?

lutteropp commented 3 years ago

I believe it is not a problem, because with relative RF-distance it is kinda the same issue...

lutteropp commented 3 years ago

Do we need to exclude the trivial bipartitions/clusters when computing the network distances? I tried finding the definition for relative RF distance to check how it is done there, but I only found the definitions for absolute RF distance online... :-/

stamatak commented 3 years ago

The standard relative RF distance between trees only operates on the non-trivial bipartitions, so it can reach both 0 and 1.

lutteropp commented 3 years ago

Thanks @stamatak! So we need to diverge from the distance definitions in Celine's network book: we will explicitly discard the trivial bipartitions/clusters/whatever in our own distance implementations.

stamatak commented 3 years ago

I guess so, but maybe wait to hear Celine's opinion on this.

celinescornavacca commented 3 years ago

Totally OK with discarding trivial bipartitions and using the denominator you suggested instead of 2 (there are two versions of the RF distance; I actually like the non-2 one more).
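
A minimal Python sketch of the variant agreed on in this thread: drop the trivial clusters, then divide by the size of the union. The helper names and the `drop_trivial` flag are hypothetical, and "trivial" is assumed here to mean the singleton clusters and the cluster containing all taxa.

```python
def nontrivial(clusters, taxa):
    """Keep clusters that are neither a single taxon nor the full taxon set."""
    return {c for c in clusters if 1 < len(c) < len(taxa)}


def relative_cluster_distance(a, b, taxa, drop_trivial=True):
    """Symmetric difference over union, optionally ignoring trivial clusters."""
    if drop_trivial:
        a, b = nontrivial(a, taxa), nontrivial(b, taxa)
    union = a | b
    if not union:
        return 0.0  # nothing left to compare: treated as identical (assumption)
    return len(a ^ b) / len(union)


taxa = frozenset("ABCD")
trivial = {frozenset(t) for t in taxa} | {taxa}

# Cluster sets of the rooted trees ((A,B),(C,D)) and ((A,C),(B,D)).
first = trivial | {frozenset("AB"), frozenset("CD")}
second = trivial | {frozenset("AC"), frozenset("BD")}

with_trivial = relative_cluster_distance(first, second, taxa, drop_trivial=False)
without_trivial = relative_cluster_distance(first, second, taxa)
print(with_trivial, without_trivial)
```

The two trees share no non-trivial cluster. With the trivial clusters included, the five shared ones keep the score at 4/9; with them discarded, the score reaches 1.0.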
