Python package for building, comparing, annotating, manipulating and visualising trees. It provides a comprehensive API and a collection of command line tools, including utilities to work with the NCBI taxonomy tree.
This issue arises in an HPC environment where multiple concurrent jobs call ETE3 independently. If the NCBI database has been updated, or a momentary interruption prevents connection to the SQLite NCBI taxonomy database, all of the processes try to update the NCBI taxonomy database simultaneously, which causes them all to fail. The relevant code is in ncbiquery.py:
    self.db = None
    self._connect()
    if not is_taxadb_up_to_date(self.dbfile):
        print('NCBI database format is outdated. Upgrading', file=sys.stderr)
        self.update_taxonomy_database(taxdump_file)
It would be great to have an option to skip updating the NCBI taxonomy database, and/or to have the process create a lock file while updating it, so that multiple processes can't attempt the update simultaneously.
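To illustrate the lock file idea, here is a minimal sketch of how the update could be serialized across processes. It is not ETE3 code: `exclusive_lock`, `update_once`, the `.lock` suffix, and the `timeout`/`poll` parameters are all hypothetical names chosen for this example. It assumes that creating a file with `O_CREAT | O_EXCL` is atomic on the filesystem holding the database, which may not hold on every shared HPC filesystem.

```python
import contextlib
import os
import time


@contextlib.contextmanager
def exclusive_lock(lock_path, timeout=600.0, poll=1.0):
    """Hold lock_path exclusively (hypothetical helper, not part of ETE3)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the file already exists,
            # so only one process can create the lock file.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        # Record the owner's PID to help diagnose stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)


def update_once(dbfile, is_up_to_date, do_update, timeout=600.0, poll=1.0):
    """Run do_update() under the lock; return True if this process updated."""
    if is_up_to_date(dbfile):
        return False
    with exclusive_lock(dbfile + ".lock", timeout=timeout, poll=poll):
        # Re-check after acquiring the lock: another process may have
        # finished the update while this one was waiting.
        if is_up_to_date(dbfile):
            return False
        do_update()
        return True
```

In the snippet from ncbiquery.py, `is_up_to_date` would correspond to `is_taxadb_up_to_date` and `do_update` to a call wrapping `self.update_taxonomy_database(taxdump_file)`. Processes that lose the race wait, re-check, and skip the update. Note that a job killed while holding the lock leaves a stale lock file behind, so a real implementation would need some cleanup policy.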