jeetsukumaran / DendroPy

A Python library for phylogenetic scripting, simulation, data processing and manipulation.
https://pypi.org/project/DendroPy/.
BSD 3-Clause "New" or "Revised" License
207 stars 62 forks source link

Unexpected unifurcation suppression in reroot_at_node() #118

Closed psathyrella closed 5 years ago

psathyrella commented 5 years ago

Unless I'm missing something, it appears that when reroot_at_node() is called on an unrooted tree, that suppress_unifurcations=False doesn't always have the desired effect. I think this isn't the expected behavior?

This should reproduce it:

#!/usr/bin/env python
import dendropy

print 'version: ', dendropy.__version__
unrooted_treestr = '(((l1,l2)n2)n1)root;'
rooted_treestr = '[&R] ' + unrooted_treestr

for tlabel, tstr in [('unrooted', unrooted_treestr), ('rooted', rooted_treestr)]:
    print '%s:' % tlabel
    dtree = dendropy.Tree.get_from_string(tstr, 'newick')

    print 'before reroot:'
    print dtree.as_ascii_plot(show_internal_node_labels=True, width=70)

    new_root_node = dtree.seed_node.child_nodes()[0]
    dtree.reroot_at_node(new_root_node, suppress_unifurcations=False)

    print 'after reroot:'
    print dtree.as_ascii_plot(show_internal_node_labels=True, width=70)
    print ''

and the output I get is:

version:  4.4.0
unrooted:
before reroot:
                                            /---------------------- l1
root------------------n1--------------------n2                        
                                            \---------------------- l2

after reroot:
/---------------------------------------------------------------- l1  
|                                                                     
n1--------------------------------------------------------------- l2  
|                                                                     
\---------------------------------------------------------------- root

rooted:
before reroot:
                                            /---------------------- l1
root------------------n1--------------------n2                        
                                            \---------------------- l2

after reroot:
                                /-------------------------------- l1  
/-------------------------------n2                                    
n1                              \-------------------------------- l2  
|                                                                     
\---------------------------------------------------------------- root

where n2 disappears on reroot in the first (unrooted) case, but not in the second (rooted) case.

jeetsukumaran commented 5 years ago

What's happening here is that because the first tree is unrooted, any basal bifurcation that results will be collapsed by default. Most of the lower level tree manipulation functions take a collapse_unrooted_basal_bifurcation argument to prevent this from happening (by setting this to False). Unfortunately, in higher-level functions such as reroot_at_node, I have not been diligent about allowing this parameter to be set to be passed down to the lower level functions.

I have just done so in the most recent revision here: 86d66160 .

Note that a basal bifurcation on an unrooted tree results in a splits hash collision --- the two child edges of the root note will have the exact same identity, and in the dictionary lookup only one of these will be stored. I would suggest sticking to rooted trees if the basal split is important to you, as on an unrooted tree this is all an illusion.

On 7/3/19 2:15 PM, Duncan Ralph wrote:

Unless I'm missing something, it appears that when reroot_at_node() is called on an unrooted tree, that |suppress_unifurcations=False| doesn't always have the desired effect. I think this isn't the expected behavior?

This should reproduce it:

|#!/usr/bin/env python import dendropy print 'version: ', dendropy.version unrooted_treestr = '(((l1,l2)n2)n1)root;' rooted_treestr = '[&R] ' + unrooted_treestr for tlabel, tstr in

'%s:' % tlabel dtree = dendropy.Tree.get_from_string(tstr, 'newick') print 'before reroot:' print dtree.as_ascii_plot(show_internal_node_labels=True, width=70) new_root_node = dtree.seed_node.child_nodes()[0] dtree.reroot_at_node(new_root_node, suppress_unifurcations=False) print 'after reroot:' print dtree.as_ascii_plot(show_internal_node_labels=True, width=70) print '' |

and the output I get is:

|version: 4.4.0 unrooted: before reroot: /---------------------- l1 root------------------n1--------------------n2 ---------------------- l2 after reroot: /---------------------------------------------------------------- l1 | n1--------------------------------------------------------------- l2 | ---------------------------------------------------------------- root rooted: before reroot: /---------------------- l1 root------------------n1--------------------n2 ---------------------- l2 after reroot: /-------------------------------- l1 /-------------------------------n2 n1 -------------------------------- l2 | ---------------------------------------------------------------- root |

where |n2| disappears on reroot in the first (unrooted) case, but not in the second (rooted) case.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/jeetsukumaran/DendroPy/issues/118?email_source=notifications&email_token=AAAGMRYWGGX4DNNXZUHDQD3P5UJFVA5CNFSM4H5LAR5KYY3PNVWWK3TUL52HS4DFUVEXG43VMWVGG33NNVSW45C7NFSM4G5HNHFA, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAGMRZEGN7XSUSATP4BVMLP5UJFVANCNFSM4H5LAR5A.

--


Jeet Sukumaran

Assistant Professor Biology Department San Diego State University

Lab: https://sukumaranlab.org/ Blog: https://jeetblogs.org/ Repositories: https://github.com/jeetsukumaran Photography: https://www.flickr.com/photos/jeetsukumaran/ Instagram: https://www.instagram.com/jeetsukumaran/ Calendar: https://goo.gl/dG5Axs

Email: jsukumaran@sdsu.edu (work) jeetsukumaran@gmail.com (personal)

Mailing Address: Biology Department, LS 262 San Diego State University 5500 Campanile Drive San Diego, CA 92182-4614

psathyrella commented 5 years ago

ok, great, thanks.

Also, I may have missed it, but I couldn't find the choices for the rooting keyword arg for Tree.get() by clicking around from here or here, although eventually found it in dataio/newickreader.py by grepping the source.