Closed valevalorin closed 1 year ago
Can you provide an example file (ideally as small as possible) that I could use to test any potential fixes?
String:
(https://youtube.com,&)root
|| ('https://youtube.com','&')root
|| ("https://youtube.com","&")root
Code:
from treeswift import Node, read_tree_newick
root = Node('root')
url = Node('https://youtube.com')
amp = Node('&')
root.add_child(url)
root.add_child(amp)
t = read_tree_newick(root.newick())
Edit: Specific Error
ValueError Traceback (most recent call last)
~/anaconda3/envs/tensorflow2_p38/lib/python3.8/site-packages/treeswift/Tree.py in read_tree_newick(newick)
1444 ls += ts[i]; i += 1
-> 1445 n.edge_length = float(ls); i -= 1; parse_length = False
1446
ValueError: could not convert string to float: '//youtube.com'
Thanks for sharing! I don't think your original string is a valid Newick string, though: even with reserved characters being handled via escaping, Newick strings should end with a semicolon. The TreeSwift.Node.newick
function is not meant to be called directly and is instead meant to be called recursively to generate the partial Newick string for a given subtree (so that TreeSwift.Tree.newick
can just call root.newick()
and then add a semicolon and some other stuff). Here's a slightly updated code example:
from treeswift import Tree, Node
tree = Tree()
url = Node('https://youtube.com')
amp = Node('&')
tree.root.add_child(url)
tree.root.add_child(amp)
print(tree.newick())
Which outputs a Newick string that is "valid" if reserved symbols are allowed to be used if they're escaped:
[&R] (https://youtube.com,&);
This Newick string is indeed currently unsupported:
from treeswift import read_tree_newick
read_tree_newick("[&R] (https://youtube.com,&);")
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/treeswift/Tree.py", line 1445, in read_tree_newick
n.edge_length = float(ls); i -= 1; parse_length = False
ValueError: could not convert string to float: '//youtube.com'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/treeswift/Tree.py", line 1455, in read_tree_newick
raise RuntimeError("Failed to parse string as Newick: %s"%ts)
RuntimeError: Failed to parse string as Newick: (https://youtube.com,&);
I'll try to add support for escaped reserved symbols as soon as possible
Your code sample is really interesting to me because it does something I couldn't find documentation of - how to make a tree manually (i.e. without reading it from a newick string). The process wasn't obvious, and I couldn't find an example, thus why I'm only creating nodes and getting the newick from there.
I just double checked the Cookbook and I don't see an example like the above that shows the manual process. It might be a good idea to add it to the Cookbook as well.
Regarding the code example, great idea! I've just added such a code sample to the Cookbook, under the following:
"TreeSwift also supports manually creating a tree from scratch:"
Regarding the initial issue, I just pushed a new release of TreeSwift (v1.1.31) which should fix the issue. I only support '
(not "
) to escape reserved symbols just as in the link you posted, but if the standard ever changes to support both, I'll update accordingly. You can update your version of TreeSwift as follows (if you use pip
):
pip install --upgrade treeswift
It may take a couple of attempts for it to register that there's a new version
Thank you so much!
I have a tree containing labels with both ';' and ':' characters. I can write the tree to the newick format without issue, but if I try to parse the string it fails when it encounters the unescaped ';' and ';' characters. I don't think there is a standard way of escaping for the format, but this module uses
'
as the escape characters.