adamallo / SimPhy

SimPhy: A comprehensive simulator of gene family evolution
GNU General Public License v2.0

Species tree node labelling #7

Open mciach opened 6 years ago

mciach commented 6 years ago

Hi,

It seems that, after setting the OL parameter to 1, the internal nodes of the species tree in the s_tree.trees file are numbered in in-order, while the same nodes in the .mapsl file are numbered in post-order. This makes it a bit cumbersome to analyze the results.

Also, since the leaves are labeled independently, an internal node and a leaf can have the same number. Enclosing the leaf node labels in apostrophes in the .mapsl file doesn't help much, because the labels still look almost the same. This makes it easy to confuse an internal node with a leaf, especially when analyzing the mapping manually. I'd suggest making a clearer distinction, e.g. labeling the leaves as S0, S1, S2, etc.

Cheers! Michał

damiendevienne commented 1 year ago

Hi, It's been a long time, but I am facing the same kind of issue. Below are a species tree (left) and a locus tree (right) with the nodes as labeled by SimPhy (-ol 1). The locus tree is identical to the species tree (no transfers or anything else). The mapping file (*.mapsl) indicates that node IDs are identical in the two trees (the Lt_node and St_node columns are identical in this case). However, this is clearly not the case.

[Image: species tree (left) and locus tree (right), with nodes labeled using -ol 1]

This makes it impossible for me to properly examine the other scenarios that I am interested in. Is there a reason why this is happening, or am I doing something wrong?

Thanks a lot for any help, Best, Damien

adamallo commented 1 year ago

Dear Michał, I agree that the mapping output is not ideal and should be improved. See my answer to Damien below on the node ordering issue.

adamallo commented 1 year ago

Dear Damien,

Some years ago, while answering another user's question ( https://groups.google.com/g/simphy/c/uD09TTdC4PU ), I discovered the bug behind this: internal species tree nodes are labeled in pre-order, whereas the mappings, locus trees, and gene trees are labeled in post-order.
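To make the mismatch concrete, here is a minimal Python sketch. It is not SimPhy code: the nested-tuple tree, the function name `label_internal`, and the numbering scheme (internal nodes only, counting from 0) are all assumptions chosen for illustration. It labels the internal nodes of a toy tree in both orders and builds a lookup table from one labeling to the other.

```python
# Toy illustration only: this is NOT SimPhy code. The tree encoding and the
# numbering scheme (internal nodes only, starting at 0) are assumptions.

def label_internal(tree, order):
    """Number the internal nodes of a nested-tuple tree in "pre" or "post" order.

    Leaves are strings and internal nodes are tuples of children. Each
    internal node is identified by the sorted tuple of leaf names below it.
    Returns a dict {clade: label}.
    """
    labels = {}
    counter = 0

    def leaves(node):
        if isinstance(node, str):
            return (node,)
        return tuple(sorted(name for child in node for name in leaves(child)))

    def visit(node):
        nonlocal counter
        if isinstance(node, str):
            return  # leaves are not numbered in this sketch
        if order == "pre":  # label the node before its children
            labels[leaves(node)] = counter
            counter += 1
        for child in node:
            visit(child)
        if order == "post":  # label the node after its children
            labels[leaves(node)] = counter
            counter += 1

    visit(tree)
    return labels


tree = (("A", "B"), ("C", "D"))
pre = label_internal(tree, "pre")
post = label_internal(tree, "post")

# Translate a pre-order label into the post-order label of the same clade.
pre_to_post = {pre[clade]: post[clade] for clade in pre}
print(pre_to_post)  # {0: 2, 1: 0, 2: 1}
```

Even on this four-leaf tree no internal node keeps its number, which is why identical Lt_node and St_node values can point at different clades. Matching nodes by the set of leaves below them, as done here, is one way to reconcile two labelings without relying on the numbers themselves.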

I fixed this issue in the postorderSptree development branch but never merged it into master because I wanted to test it more thoroughly first. Initial tests, however, showed that the code was working as expected. I recommend compiling the sources from this development branch and using them for your research.

Let me know if you have problems compiling the code; I can do it for you. I will try to merge these changes into master soon, but I do not know when I will be done.

I hope this helps,

DM

damiendevienne commented 1 year ago

OK, thanks a lot for this advice! I'll try the development branch. Best, Damien