Closed nickp60 closed 1 year ago
Hello, I am getting the following error after running this command to build a taxonomy database.
docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv new_database taxonomy.db
docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv new_database taxonomy.db INFO ncbi 507 downloading ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip to /Users/nwaters/GitHub/conditioning/mitre/taxdmp.zip INFO ncbi 230 Clobbering database tables 2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1 INFO log 109 SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1 2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,019 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1 INFO log 109 SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1 2021-10-08 20:33:33,020 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,021 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("ranks") INFO log 109 PRAGMA table_info("ranks") 2021-10-08 20:33:33,021 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,025 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("source") INFO log 109 PRAGMA table_info("source") 2021-10-08 20:33:33,025 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,027 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("nodes") INFO log 109 PRAGMA table_info("nodes") 2021-10-08 20:33:33,027 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("names") INFO log 109 PRAGMA table_info("names") 2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("merged") INFO log 109 PRAGMA table_info("merged") 2021-10-08 20:33:33,028 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,029 INFO sqlalchemy.engine.base.Engine DROP TABLE source INFO log 109 DROP TABLE source 2021-10-08 20:33:33,029 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,054 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,055 INFO sqlalchemy.engine.base.Engine DROP TABLE ranks INFO log 109 DROP TABLE ranks 2021-10-08 20:33:33,055 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,089 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT INFO ncbi 233 Creating database tables 2021-10-08 20:33:33,092 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("ranks") INFO log 109 PRAGMA table_info("ranks") 2021-10-08 20:33:33,092 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,096 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("source") INFO log 109 PRAGMA table_info("source") 2021-10-08 20:33:33,098 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("nodes") INFO log 109 PRAGMA table_info("nodes") 2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("names") INFO log 109 PRAGMA table_info("names") 2021-10-08 20:33:33,100 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,101 INFO sqlalchemy.engine.base.Engine PRAGMA table_info("merged") INFO log 109 PRAGMA table_info("merged") 2021-10-08 20:33:33,101 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,102 INFO sqlalchemy.engine.base.Engine CREATE TABLE ranks ( rank VARCHAR NOT NULL, height INTEGER NOT NULL, PRIMARY KEY (rank), UNIQUE (height) ) INFO log 109 CREATE TABLE ranks ( rank VARCHAR NOT NULL, height INTEGER NOT NULL, PRIMARY KEY (rank), UNIQUE (height) ) 2021-10-08 20:33:33,102 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,124 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,127 INFO sqlalchemy.engine.base.Engine CREATE TABLE source ( id INTEGER NOT NULL, name VARCHAR, description VARCHAR, PRIMARY KEY (id), UNIQUE (name) ) INFO log 109 CREATE TABLE source ( id INTEGER NOT NULL, name VARCHAR, description VARCHAR, PRIMARY KEY (id), UNIQUE (name) ) 2021-10-08 20:33:33,127 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,167 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,170 INFO sqlalchemy.engine.base.Engine CREATE TABLE nodes ( tax_id VARCHAR NOT NULL, parent_id VARCHAR, rank VARCHAR, embl_code VARCHAR, division_id VARCHAR, source_id INTEGER, is_valid BOOLEAN, PRIMARY KEY (tax_id), FOREIGN KEY(rank) REFERENCES ranks (rank), FOREIGN KEY(source_id) REFERENCES source (id), CHECK (is_valid IN (0, 1)) ) INFO log 109 CREATE TABLE nodes ( tax_id VARCHAR NOT NULL, parent_id VARCHAR, rank VARCHAR, embl_code VARCHAR, division_id VARCHAR, source_id INTEGER, is_valid BOOLEAN, PRIMARY KEY (tax_id), FOREIGN KEY(rank) REFERENCES ranks (rank), FOREIGN KEY(source_id) REFERENCES source (id), CHECK (is_valid IN (0, 1)) ) 2021-10-08 20:33:33,170 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_nodes_parent_id ON nodes (parent_id) INFO log 109 CREATE INDEX ix_nodes_parent_id ON nodes (parent_id) 2021-10-08 20:33:33,193 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,221 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,222 INFO sqlalchemy.engine.base.Engine CREATE TABLE names ( tax_id VARCHAR NOT NULL, tax_name VARCHAR NOT NULL, unique_name VARCHAR, name_class VARCHAR NOT NULL, source_id INTEGER, is_primary BOOLEAN, is_classified BOOLEAN, PRIMARY KEY (tax_id, tax_name, name_class), FOREIGN KEY(tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE, FOREIGN KEY(source_id) REFERENCES source (id), CHECK (is_primary IN (0, 1)), CHECK (is_classified IN (0, 1)) ) INFO log 109 CREATE TABLE names ( tax_id VARCHAR NOT NULL, tax_name VARCHAR NOT NULL, unique_name VARCHAR, name_class VARCHAR NOT NULL, source_id INTEGER, is_primary BOOLEAN, is_classified BOOLEAN, PRIMARY KEY (tax_id, tax_name, name_class), FOREIGN KEY(tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE, FOREIGN KEY(source_id) REFERENCES source (id), CHECK (is_primary IN (0, 1)), CHECK (is_classified IN (0, 1)) ) 2021-10-08 20:33:33,222 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,248 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,249 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_names_tax_id_is_primary ON names (tax_id, is_primary) INFO log 109 CREATE INDEX ix_names_tax_id_is_primary ON names (tax_id, is_primary) 2021-10-08 20:33:33,249 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,264 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,266 INFO sqlalchemy.engine.base.Engine CREATE TABLE merged ( old_tax_id VARCHAR NOT NULL, new_tax_id VARCHAR, PRIMARY KEY (old_tax_id), FOREIGN KEY(new_tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE ) INFO log 109 CREATE TABLE merged ( old_tax_id VARCHAR NOT NULL, new_tax_id VARCHAR, PRIMARY KEY (old_tax_id), FOREIGN KEY(new_tax_id) REFERENCES nodes (tax_id) ON DELETE CASCADE ) 2021-10-08 20:33:33,266 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,306 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT 2021-10-08 20:33:33,307 INFO sqlalchemy.engine.base.Engine CREATE INDEX ix_merged_old_tax_id ON merged (old_tax_id) INFO log 109 CREATE INDEX ix_merged_old_tax_id ON merged (old_tax_id) 2021-10-08 20:33:33,307 INFO sqlalchemy.engine.base.Engine () INFO log 109 () 2021-10-08 20:33:33,344 INFO sqlalchemy.engine.base.Engine COMMIT INFO log 109 COMMIT INFO ncbi 365 loading ranks INFO ncbi 373 loading nodes INFO ncbi 379 loading names Traceback (most recent call last): File "/usr/local/bin/taxit", line 22, in <module> sys.exit(main(sys.argv[1:])) File "/usr/local/lib/python2.7/site-packages/taxtastic/scripts/taxit.py", line 49, in main return action(arguments) File "/usr/local/lib/python2.7/site-packages/taxtastic/subcommands/new_database.py", line 86, in action ncbi_loader.load_archive(zfile) File "/usr/local/lib/python2.7/site-packages/taxtastic/ncbi.py", line 382, in load_archive self.load_table('names', rows=names_rows) File "/usr/local/lib/python2.7/site-packages/taxtastic/ncbi.py", line 343, in load_table cur.executemany(cmd, itertools.islice(rows, limit)) sqlite3.IntegrityError: UNIQUE constraint failed: names.tax_id, names.tax_name, names.name_class
It looks like older versions work: I ended up trying the 2018-12-01 version arbitrarily, and that was fine:
sudo docker run --rm -it -v $(pwd):$(pwd) -w $(pwd) nghoffman/taxtastic:0.8.3 taxit -vvv new_database -u ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_archive/new_taxdump_2018-12-01.zip
Trying to clean up some issues - I suspect that you have either solved this or moved on, but hopefully this issue is fixed in v0.10.0 - if not feel free to reopen.
Hello, I am getting the following error after running this command to build a taxonomy database.
It looks like older versions work: I ended up trying the 2018-12-01 version arbitrarily, and that was fine: