matsen / pplacer

Phylogenetic placement and downstream analysis
http://matsen.fredhutch.org/pplacer/
GNU General Public License v3.0

clarify parsing error messages #212

Closed koadman closed 12 years ago

koadman commented 12 years ago

rppr pdprune is unable to read this tree, which is the Greengenes 16S tree reformatted to remove egregious characters:

http://edhar.genomecenter.ucdavis.edu/~koadman/16s_noeol

I'm not sure whether this is a data formatting problem or a bug in rppr, but other tree software seems to be able to parse the Newick, so I thought I should report it as a possible issue. rppr parses most of the tree, but reports an error when reading the root node. Naming the root node, giving it a branch length, and terminating the line with a ; had no effect.

habnabit commented 12 years ago

This isn't occurring when reading the root node; I'm not sure why you think that. The error I get is "syntax error parsing between 1:6431745 and 1:6431746". Characters 6431700-6431775 on line 1 are:

'704:0.00111)100__s__Synechococcus_sp_JA-2-3Ba(2-13):0.00896,(((((139842:0.0'

Specifically, the error is caused by unquoted parentheses in a node label: the (2-13) in 100__s__Synechococcus_sp_JA-2-3Ba(2-13).
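To see what a position such as 1:6431745 points at, the line can be sliced around the reported character offset. The following is a minimal Python sketch, not part of pplacer or rppr. `show_context` and its `radius` parameter are hypothetical names. It assumes the first number is a 1-based line number and the second a 0-based character offset, which is consistent with the 75-character excerpt quoted above.

```python
def show_context(text, line, col, radius=45):
    """Return the text around (line, col) with a caret under the position.

    Assumption: `line` is 1-based and `col` is a 0-based offset into
    that line. Neither convention is confirmed for rppr's messages.
    """
    row = text.splitlines()[line - 1]
    lo = max(0, col - radius)          # clamp at the start of the line
    snippet = row[lo:col + radius]     # slicing clamps at the end of the line
    caret = " " * (col - lo) + "^"     # marks the reported character
    return snippet + "\n" + caret
```

With the tree file read into a string `text`, `print(show_context(text, 1, 6431745))` would show the surrounding label with a caret under the reported character, provided the offset convention assumed here holds.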

koadman commented 12 years ago

Thanks, that's good news. Would it be possible to change the error message slightly to indicate those numbers are line and character ranges? I had misinterpreted them as node numbers and somehow thought that the numbers referred to nodes flanking the root. Not sure what led me to that guess.

habnabit commented 12 years ago

Okay, so I'm changing this issue to clarify the error message. It would also be nice if there were a way to indicate the file being parsed, but that is more difficult.
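One possible wording for the clarified message is sketched below in Python. This is only an illustration of the request, not the change that was made in rppr. `describe_span` and its optional `filename` argument are hypothetical.

```python
def describe_span(start, end, filename=None):
    """Build a parse-error message that labels each number explicitly.

    `start` and `end` are (line, character) pairs as reported by the
    parser. `filename` is optional because the parser may not know it.
    """
    start_line, start_char = start
    end_line, end_char = end
    source = f" in {filename}" if filename else ""
    return (
        f"syntax error{source} between "
        f"line {start_line}, character {start_char} and "
        f"line {end_line}, character {end_char}"
    )


print(describe_span((1, 6431745), (1, 6431746)))
```

Spelling out "line" and "character" removes the ambiguity that led to the numbers being read as node numbers, and the file name can be added when it is available without changing the rest of the message.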