Closed bburkhar1 closed 1 year ago
Thank you for reaching out! Would you be able to attach the file it failed to load (or if it contains sensitive information, a similar example file that also triggers the issue)?
Yes, here is an example. The first file is the initial Newick Tree output from MAFFT and it has the error (fail). The second file is the same except that I have removed the returns and it runs successfully. Newick_tree_fail.txt Newick_tree_success.txt
Thanks! I think I've patched it in the most recent version of TreeSwift (the Python package that TreeCluster uses to parse/traverse trees). Can you try updating to TreeSwift v1.1.35 and see if that fixes the issue? E.g.
pip3 install --upgrade treeswift
I think that should have fixed it, so I'll go ahead and close this GitHub Issue, but please feel free to reopen it if the issue persists even after updating TreeSwift
Hi,
Thanks for the quick update. I updated TreeSwift to v1.1.35, but it gave another error (with the same file I shared previously) at a different line. See traceback below:
$ TreeCluster.py -i Newick_tree_fail.txt -o Tree_test.txt -t 0.7 -m max_clade
Traceback (most recent call last):
  File "/home/bburkhar/.local/lib/python3.10/site-packages/treeswift/Tree.py", line 1454, in read_tree_newick
    while parse_label or ts[i] not in {':', ',', ';', ')', '['}:
IndexError: string index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/TreeCluster.py", line 605, in
Whoops, thank you for catching that! I had incorrectly assumed that TreeCluster was just directly using TreeSwift's read_tree_newick function for reading the tree; I should have tested it more thoroughly 😄 I just patched TreeCluster's handling of reading Newick trees to use the aforementioned patch I implemented in TreeSwift. Thus, this issue should be fixed with TreeSwift v1.1.35 and TreeCluster v1.0.4:
pip3 install --upgrade treecluster treeswift
Try it out and let me know if it works for you now 😄
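For anyone who wants to confirm that the upgrade took effect, here is a minimal sketch that compares the installed versions against the two version numbers mentioned in this thread. It assumes the installed distribution names match the names used in the pip3 command (treeswift and treecluster) and that both use plain dotted version strings; the helper names as_tuple and is_recent_enough are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions named in this thread
REQUIRED = {"treeswift": (1, 1, 35), "treecluster": (1, 0, 4)}


def as_tuple(ver: str) -> tuple:
    """Turn a plain dotted version such as '1.1.35' into (1, 1, 35).

    Assumes every component is an integer (no 'rc1' or similar suffixes).
    """
    return tuple(int(part) for part in ver.split("."))


def is_recent_enough(installed: str, minimum: tuple) -> bool:
    """Return True if the installed version is at least the minimum."""
    return as_tuple(installed) >= minimum


if __name__ == "__main__":
    for package, minimum in REQUIRED.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            print(f"{package}: not installed")
            continue
        status = "OK" if is_recent_enough(installed, minimum) else "too old"
        print(f"{package} {installed}: {status}")
```

Run it with the same python3 that pip3 installs into, since a different interpreter may see a different copy of the packages.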
It is working now. Thanks for the quick responses and clear instructions!
No problem at all! Thank you for reporting this bug!
Greetings,
I'm using MAFFT to output Newick tree files for my sequences, but MAFFT seems to write the Newick files with multiple newlines / carriage returns, and TreeCluster seems to choke on them (see traceback below). My only workaround is to open the raw Newick output and use find and replace to delete the returns so that everything is on one line rather than many. I don't know if this was intended, but it seems like it would be better for a beginner (like me) if one could go straight from the MAFFT tree output into TreeCluster, unless I'm making a mistake somewhere in TreeCluster (or in MAFFT, which I realize you don't support... but it is widely used in the field, so I'd accept any suggestions). Thanks!
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/treeswift/Tree.py", line 1443, in read_tree_newick
    while parse_label or ts[i] not in {':', ',', ';', ')', '['}:
IndexError: string index out of range
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/bin/TreeCluster.py", line 605, in
    trees.append(read_tree_newick(l))
  File "/usr/local/lib/python3.10/dist-packages/treeswift/Tree.py", line 1452, in read_tree_newick
    raise RuntimeError(f"Failed to parse string as Newick: {ts}")
RuntimeError: Failed to parse string as Newick: 1_Sequence_0001
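The manual find-and-replace workaround described in this post can also be scripted. The sketch below is not part of TreeCluster or TreeSwift; the function names collapse_newick and collapse_newick_file are made up for illustration, and it assumes the file holds a single tree (several trees in one file would all end up on the same line).

```python
from pathlib import Path


def collapse_newick(text: str) -> str:
    """Remove line breaks (LF and CR) so the Newick string sits on one line."""
    # Only line-break characters are dropped; spaces inside labels are kept.
    return text.replace("\r", "").replace("\n", "")


def collapse_newick_file(src: str, dst: str) -> None:
    """Read a multi-line Newick file and write a single-line copy."""
    one_line = collapse_newick(Path(src).read_text())
    Path(dst).write_text(one_line + "\n")


if __name__ == "__main__":
    # Small made-up tree spread across several lines
    example = "(\n1_Sequence_0001\n:0.1,\n2_Sequence_0002\n:0.2\n);\n"
    print(collapse_newick(example))
    # prints: (1_Sequence_0001:0.1,2_Sequence_0002:0.2);
```

Calling collapse_newick_file on the MAFFT output, with an output file name of your choice, would then give a one-line file to pass to TreeCluster.py with -i.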