Open dridk opened 7 years ago
Hi Sacha,
Unfortunately, what you are trying to do is not currently possible with the Newick Utilities. The reason is that the current behaviour is not to keep nodes with a single child, and a single-child node is exactly what you get when you remove one leaf of a two-leaf node.
While this behaviour makes sense in a lot of situations, I see that in your case it would be better to just remove the leaves. I will see if I can add this functionality, but I can make no guarantee, as I have lots of other projects that also need my attention.
Cheers,
Thomas
Hi,
I'm working on the Greengenes tree available here: gg_13_5_otus_99_annotated.tree.gz http://greengenes.secondgenome.com/downloads/database/13_5
This file contains a tree based on 16S rRNA from bacteria. I would like to extract a simple relation, for example the tree of g__Staphylococcus, g__Streptococcus and g__Enterococcus. Unfortunately, for each species there are many leaves labeled with numbers, which probably correspond to sequence IDs.
This command prints all nodes except leaves. I get all my taxonomy.
This command prints only leaves. I get an unwanted list of numbers.
So I don't know how to remove all the leaves and keep only a tree of taxon names. I'm sure it's possible with the Newick tools, but I didn't find a way. Could you give me some clues?
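Until something like this exists in the Newick Utilities, one possible workaround is a small standalone script. The following is only a minimal sketch in plain Python, not part of the Newick Utilities: the names `Node`, `parse_newick`, `strip_leaves` and `to_newick` are hypothetical, and the toy tree with its numeric leaf IDs is made up for illustration. It assumes that taxon names sit on inner nodes and that every original leaf is a sequence ID. It drops those leaves, then drops unlabeled nodes that end up with no children, and it keeps single-child nodes instead of collapsing them. Newick comments in square brackets are not handled.

```python
import re
from dataclasses import dataclass, field

# One token is a quoted label, a structural character, or an unquoted label/number.
_TOKEN = re.compile(r"\s*('(?:[^']|'')*'|[(),:;]|[^(),:;'\s]+)")


@dataclass
class Node:
    label: str = ""
    length: str = ""  # branch length kept as text so it is written back unchanged
    children: list = field(default_factory=list)


def parse_newick(text):
    """Parse a Newick string iteratively (no recursion, so deep trees are fine)."""
    root = Node()
    current, stack, expect_length = root, [], False
    for tok in _TOKEN.findall(text):
        if tok == "(":            # open the child list of `current`
            child = Node()
            current.children.append(child)
            stack.append(current)
            current = child
        elif tok == ",":          # next sibling under the same parent
            current = Node()
            stack[-1].children.append(current)
        elif tok == ")":          # back to the parent; its label may follow
            current = stack.pop()
        elif tok == ":":
            expect_length = True
        elif tok == ";":
            break
        elif expect_length:
            current.length, expect_length = tok, False
        else:
            current.label = tok
    return root


def strip_leaves(root):
    """Remove original leaves, then unlabeled nodes left without children."""
    order, todo = [], [root]
    while todo:                   # parents always appear before their children
        node = todo.pop()
        order.append(node)
        todo.extend(node.children)
    was_leaf = {id(n) for n in order if not n.children}
    for node in reversed(order):  # children are filtered before their parents
        node.children = [c for c in node.children
                         if id(c) not in was_leaf and (c.children or c.label)]
    return root


def to_newick(root):
    """Serialize the tree back to a Newick string, again without recursion."""
    out, todo = [], [root]
    while todo:
        item = todo.pop()
        if isinstance(item, str):
            out.append(item)
            continue
        suffix = item.label + (":" + item.length if item.length else "")
        if item.children:
            out.append("(")
            todo.append(")" + suffix)
            for i, child in enumerate(reversed(item.children)):
                if i:
                    todo.append(",")
                todo.append(child)
        else:
            out.append(suffix)
    return "".join(out) + ";"


if __name__ == "__main__":
    # Toy tree: the numeric leaves stand in for sequence IDs (made up).
    toy = ("((101,102)g__Staphylococcus,((103,104)g__Streptococcus,"
           "(105,106)g__Enterococcus)o__Lactobacillales)c__Bacilli;")
    print(to_newick(strip_leaves(parse_newick(toy))))
    # prints (g__Staphylococcus,(g__Streptococcus,g__Enterococcus)o__Lactobacillales)c__Bacilli;
```

For the real file, the decompressed tree text would be passed to `parse_newick` in place of the toy string. Restricting the result to a few genera would be a further step that this sketch does not cover.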