Closed yasuiniko closed 8 years ago
The purpose of a TreeList is not so much efficiency, but (primarily) to manage a collection of trees that reference the same TaxonNamespace. So, by design, any trees added to a TreeList will have their taxa remapped into the TaxonNamespace of the TreeList.
The problem with your code is that while you indeed do carefully manage the taxon namespace of your generated trees, you throw all of that away when you do not ensure that your TreeList instance references that same TaxonNamespace.
So, change your code from:
sp_trees = dp.TreeList(map(lambda x: bd_tree(birth_rate=1.0,
death_rate=0,
taxon_namespace=species),
range(n_sp_trees)))
to
sp_trees = dp.TreeList(map(lambda x: bd_tree(birth_rate=1.0,
death_rate=0,
taxon_namespace=species),
range(n_sp_trees)),
taxon_namespace=species) ## here we
explicitly give the TreeList instance the correct TaxonNamespace object
On an entirely different note, is there any reason you are using maps instead of simple loops here? Perhaps it is just my functional-programming-crippled mind, but IMHO, for what you are trying to do, there is not anything gained, and the logic is obscured and the denseness of the constructs makes it easy for simple errors to creep in! ;) Of course, I totally understand if this flow seems more natural to you .... different strokes for different folks and all!
On 7/5/16 11:03 PM, Niko Yasui wrote:
Bug Description Hi @jeetsukumaran https://github.com/jeetsukumaran,
I'm trying to make a bunch of contained coalescent trees, and when I use a TreeList to hold my species trees I get a weird error (traceback below). If I use a regular list there's no problem. I don't think my use case is that strange, since it seems like we're meant to use TreeLists as an efficient list of trees when their taxon namespaces are all the same.
Sorry for butchering all your function names in the demo code.
Best, Niko
P.S. I figured out how to properly format code in Markdown, so no more code images! Yay!
Traceback screen shot 2016-07-06 at 11 30 10 https://cloud.githubusercontent.com/assets/7184841/16605892/31716018-436e-11e6-8b99-f73420f025f2.png
System Info OS Version: OSX 10.11.5 (15F34) DendroPy version : DendroPy 4.1.0 DendroPy location : /Users/nikoyasui/anaconda3/lib/python3.5/site-packages/dendropy Python version : 3.5.1 |Anaconda custom (x86_64)| (default, Dec 7 2015, 11:24:55) [GCC 4.2.1 (Apple Inc. build 5577)] Python executable : /Users/nikoyasui/anaconda3/bin/python3 Python site packages : ['/Users/nikoyasui/anaconda3/lib/python3.5/site-packages']
Demo Code
import dendropy as dp import string
taxa_map = dp.TaxonNamespaceMapping.create_contained_taxon_mapping bd_tree = dp.simulate.birth_death_tree cc_tree = dp.simulate.treesim.contained_coalescent_tree
c = 1 n_sp_trees = 2 n_gene_trees = 1000 n_sp = 5 n_ind = 2 # can be int or list
Ne = 10000 sp_depth = Ne*c
species = dp.TaxonNamespace(string.ascii_uppercase[:n_sp])
if "Creating a list is OK": sp_trees = list(map(lambda x: bd_tree(birth_rate=1.0, death_rate=0, taxon_namespace=species), range(n_sp_trees)))
elif "Creating a TreeList breaks everything": sp_trees = dp.TreeList(map(lambda x: bd_tree(birth_rate=1.0, death_rate=0, taxon_namespace=species), range(n_sp_trees)))
elif "Creating a TreeList from a list also breaks everything": sp_trees = dp.TreeList(list(map(lambda x: bd_tree(birth_rate=1.0, death_rate=0, taxon_namespace=species), range(n_sp_trees))))
for tree in sp_trees: for edge in tree.postorder_edge_iter(): setattr(edge, 'pop_size', Ne)
si_map = taxa_map(containing_taxon_namespace=species, num_contained=n_ind)
make_ctrees = lambda tree: list(map(lambda x: cc_tree(tree, si_map), range(n_gene_trees)))
gene_trees = list(map(make_ctrees, sp_trees))
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/jeetsukumaran/DendroPy/issues/61, or mute the thread https://github.com/notifications/unsubscribe/AABmR_rkUabvU31sPyjHLsDXxwwwuT8Zks5qSxrxgaJpZM4JFtwB.
Blog/Personal Pages: http://jeetworks.org/ GitHub Repositories: http://github.com/jeetsukumaran Photographs (as stream): http://www.flickr.com/photos/jeetsukumaran/ Photographs (by galleries):
@jeetsukumaran,
Thanks so much for the help! I must have misread the docs; I thought TreeList just grabbed the taxon_namespace of the first tree it saw.
As for the functional programming, it's easier for me to hold the code in my head using maps. Even though for loops might be faster to read, they take longer for to understand and are more complex than "write a function, apply it to a data structure". I also hear rumors that they are faster than list comprehensions in the latest python...
As for the denseness of the structure, I find that if you throw away your notion of how much time you should be spending staring at X lines of code, and instead think in terms of how much time you should spend thinking about how to accomplish task X, 'dense' becomes 'efficient'.
[Edit: added code blocks] Bug Description Hi @jeetsukumaran,
I'm trying to make a bunch of contained coalescent trees, and when I use a TreeList to hold my species trees I get a weird error (traceback below). If I use a regular list there's no problem. I don't think my use case is that strange, since it seems like we're meant to use TreeLists as an efficient list of trees when their taxon namespaces are all the same.
Sorry for butchering all your function names in the demo code.
Best, Niko
Traceback
Demo Code
System Info OS Version: OSX 10.11.5 (15F34) DendroPy version : DendroPy 4.1.0 DendroPy location : /Users/nikoyasui/anaconda3/lib/python3.5/site-packages/dendropy Python version : 3.5.1 |Anaconda custom (x86_64)| (default, Dec 7 2015, 11:24:55) [GCC 4.2.1 (Apple Inc. build 5577)] Python executable : /Users/nikoyasui/anaconda3/bin/python3 Python site packages : ['/Users/nikoyasui/anaconda3/lib/python3.5/site-packages']