fhcrc / taxtastic

Create and maintain phylogenetic "reference packages" of biological sequences.
GNU General Public License v3.0

raxml regex too permissive #130

Closed: jgolob closed this issue 5 years ago

jgolob commented 5 years ago

The regex in try_set_fields(result, r'(?P<datatype>DNA|RNA|AA)', s), used when parsing a RAxML tree stats file, is too permissive. Because the pattern matches DNA, RNA, or AA anywhere in the text, it can collide with sequences included in the stats file, so a DNA tree is read as an amino acid (AA) tree. This causes taxtastic to fail to properly parse the stats file and to fail to extract the empirical base frequencies.
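
To illustrate the collision, here is a minimal sketch that uses Python's `re` module directly rather than taxtastic's `try_set_fields`. The sample stats text, the sequence names in it, the `DataType:` line prefix, and the `STRICT` pattern are all assumptions made for illustration; this is one possible way to tighten the match, not necessarily the fix that was applied.

```python
import re

# Hypothetical excerpt of a stats file: a warning that mentions sequence
# names appears before the line that actually declares the data type.
SAMPLE_STATS = """\
IMPORTANT WARNING: Sequences AAB12345 and AAB12346 are exactly identical

DataType: DNA
Substitution Matrix: GTR
"""

# The permissive pattern from the issue: matches DNA, RNA, or AA anywhere.
LOOSE = re.compile(r'(?P<datatype>DNA|RNA|AA)')

# A stricter alternative (assumption): only accept the value when it
# follows a "DataType:" label at the start of a line.
STRICT = re.compile(
    r'^[ \t]*DataType:[ \t]*(?P<datatype>DNA|RNA|AA)\b',
    re.MULTILINE,
)


def find_datatype(pattern, text):
    """Return the first 'datatype' group matched by pattern, or None."""
    match = pattern.search(text)
    return match.group('datatype') if match else None


loose_result = find_datatype(LOOSE, SAMPLE_STATS)
strict_result = find_datatype(STRICT, SAMPLE_STATS)

print("loose: ", loose_result)   # picks up "AA" from the sequence name
print("strict:", strict_result)  # picks up the declared data type
```

With this sample, the loose pattern stops at the `AA` inside the first sequence name, while the anchored pattern only reads the value on the labelled line.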