etetoolkit / ete

Python package for building, comparing, annotating, manipulating and visualising trees. It provides a comprehensive API and a collection of command line tools, including utilities to work with the NCBI taxonomy tree.
http://etetoolkit.org
GNU General Public License v3.0

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 11: ordinal not in range(128) #380

Closed GiantSpaceRobot closed 6 years ago

GiantSpaceRobot commented 6 years ago

Hi there,

When I run this code (Python 2.7)

from ete3 import NCBITaxa
ncbi = NCBITaxa()

I get this error

NCBI database not present yet (first time used?)
Downloading taxdump.tar.gz from NCBI FTP site (via HTTP)...
Done. Parsing...
Loading node names...
Traceback (most recent call last):
  File "NewScript.py", line 2, in <module>
    ncbi = NCBITaxa()
  File "/home/user/.local/lib/python2.7/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 110, in __init__
    self.update_taxonomy_database(taxdump_file)
  File "/home/user/.local/lib/python2.7/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 129, in update_taxonomy_database
    update_db(self.dbfile)
  File "/home/user/.local/lib/python2.7/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 745, in update_db
    t, synonyms = load_ncbi_tree_from_dump(tar)
  File "/home/user/.local/lib/python2.7/site-packages/ete3/ncbi_taxonomy/ncbiquery.py", line 670, in load_ncbi_tree_from_dump
    line = str(line.decode())
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 11: ordinal not in range(128)

The taxdump.tar.gz file is successfully downloaded before the Python script crashes with the UnicodeDecodeError. I'm not sure why this is happening: I have run this code on two separate machines (both with up-to-date ete3 versions), and only one of them throws this error.

Thanks, Paul

GiantSpaceRobot commented 6 years ago

This problem magically disappeared.
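
For anyone who hits the same traceback: in Python 2.7, calling decode() with no arguments on a byte string uses the interpreter's default encoding, which is normally ASCII. The byte 0xc3 is the lead byte of a two-byte UTF-8 sequence, so a non-ASCII character in the taxonomy dump would be enough to raise this error, assuming the dump is UTF-8 encoded. The sketch below illustrates that failure mode and an explicit UTF-8 decode as one possible workaround. It is not ete3's code: the sample line and the helper names decode_ascii and decode_utf8 are made up for illustration, and the explicit "ascii" codec stands in for the Python 2 default so that the snippet behaves the same on Python 3.

```python
# -*- coding: utf-8 -*-
# Hypothetical record: the actual taxdump line that triggered the error
# is not shown in the report, so this one is invented for illustration.
raw_line = u"42\t|\tExample nam\u00e9\t|".encode("utf-8")  # \u00e9 -> bytes 0xc3 0xa9


def decode_ascii(data):
    """Stand-in for Python 2's bare data.decode(), which defaults to ASCII."""
    return data.decode("ascii")


def decode_utf8(data):
    """Decode with an explicit codec, independent of interpreter defaults."""
    return data.decode("utf-8")


# The ASCII decode fails on the first byte outside the 0-127 range.
try:
    decode_ascii(raw_line)
    ascii_error = None
except UnicodeDecodeError as err:
    ascii_error = err

# The explicit UTF-8 decode recovers the original text.
text = decode_utf8(raw_line)
```

Why one machine failed and the other did not is not established in this thread. Different default encodings, or different dump contents at download time, are possible explanations rather than confirmed causes.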