simonhmartin / twisst

Topology weighting by iterative sampling of sub-trees
GNU General Public License v3.0

AssertionError: Named samples not present in tree. #4

Open Burciny opened 6 years ago

Burciny commented 6 years ago

Hello, I am trying to run Twisst on my set of 3963 gene trees, created with IQ-TREE (Newick format). I built these trees from 84 individuals belonging to 8 different groups. For some of the genes, individuals with a lot of missing data (NNs) were excluded before building the trees, so not every gene tree includes all of the individuals. There is also no consistent pattern: it is not just one or two specific individuals that were excluded, and the excluded individuals vary from tree to tree. I know the tutorial says "All trees must contain all specified individual names as tip labels". However, I ran Twisst anyway, and I am getting the weighting and topology outputs (the weighting file contains no values, just topologies). At the end of the report file it says:

Traceback (most recent call last):
  File "/home/hpc/pr62ba/di49pey3/apps/phylogenomics/twisst/bin/twisst.py", line 735, in
    assert namesSet.issubset(leafNamesSet), "Named samples not present in tree."
AssertionError: Named samples not present in tree.

So, in this case, is there a way to fix this, or can Twisst not be used for my data?

The command I ran is as follows:

twisst.py -t [All_gene_trees] -w weightings_output --outputTopos topologies_output -g [species_1] -g [species_2] -g [species_3] -g [species_4] -g [species_5] -g [species_6] -g [species_7] -g [species_8] --outgroup [outgroup_species_name] --groupsFile [grouping] --method complete

Best, Burçin

simonhmartin commented 6 years ago

Hello Burçin, I'm surprised you're getting a full output. Are you sure the output is not truncated (i.e. that it stops just before the first tree with missing individuals)? I do plan to eventually add functionality to deal with missing individuals in certain trees, but it will require considerable modification and testing of the code. Best, Simon
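
To check where the first tree with missing individuals occurs, one option is to scan the tree file before running Twisst. The following is a minimal sketch in plain Python, not part of Twisst: it applies the same subset test as the assertion in the traceback to each Newick line. The function names `leaf_names` and `trees_missing_samples` and the example trees are hypothetical, and the regular expression assumes simple, unquoted tip labels with one tree per line.

```python
import re

# In Newick, a tip label directly follows "(" or ",";
# internal node labels and support values follow ")" and are not matched.
# Assumption: tip labels are unquoted and contain no whitespace.
LEAF_RE = re.compile(r"[(,]\s*([^():,;\s]+)")


def leaf_names(newick):
    """Return the set of tip labels found in one Newick string."""
    return set(LEAF_RE.findall(newick))


def trees_missing_samples(newick_lines, names):
    """Return (line_number, sorted_missing_names) for each tree lacking a named sample."""
    names_set = set(names)
    report = []
    for line_number, line in enumerate(newick_lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        missing = names_set - leaf_names(line)
        if missing:
            report.append((line_number, sorted(missing)))
    return report


# Hypothetical example: the second tree lacks sample "D".
example_trees = [
    "((A:0.1,B:0.2)95:0.05,(C:0.1,D:0.3)80:0.02);",
    "((A:0.1,B:0.2)90:0.05,C:0.1);",
]
example_report = trees_missing_samples(example_trees, ["A", "B", "C", "D"])
print(example_report)  # [(2, ['D'])]
```

With a real file, an open file handle can be passed as `newick_lines`, and `names` would be the sample names listed in the groups file. The first entry of the report is the first tree that would fail the assertion, which can be compared against where the weights output ends.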