chiulab / surpi

SURPI
chiulab.ucsf.edu/surpi

How to make surpi db for new NCBI accession2taxid files #36

Open DooYal opened 2 years ago

DooYal commented 2 years ago

I modified the create_taxonomy shell and Python scripts, replacing "gi_taxid_nucl.dmp.gz" with "nucl_gb.accession2taxid.gz" and making the same substitution for the protein file. However, step 4 of the create_taxonomy shell script always fails with this error:

Starting creation of taxonomy SQLite databases...
Creating names_nodes_scientific.db...
Creating taxid_prot.db...
Traceback (most recent call last):
  File "/mnt/upan/share/surpi/create_taxonomy_db.py", line 76, in <module>
    c.execute("INSERT INTO gi_taxid VALUES ("+line[0]+","+line[1]+")")
sqlite3.OperationalError: no such column: accession

The word "accession" does appear in the header line of the corresponding files. What should I modify to build the SURPI protein and nucleotide databases correctly?
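
For reference, the error most likely comes from the INSERT statement in the traceback, which pastes each field into the SQL text unquoted. The old GI files held only integers, so that worked. The accession2taxid files are tab-separated and begin with a header line (accession, accession.version, taxid, gi), so the first statement becomes `VALUES (accession,accession.version)` and SQLite reads `accession` as a column name. Real accessions are text too, so they would fail in the same way once the header is skipped. Below is a minimal sketch of one way to load such a file, written for Python 3 and not taken from SURPI itself. The function name `build_accession_db`, the table name `acc_taxid`, and the choice of accession.version as the key are all assumptions for illustration; the downstream SURPI lookup scripts would presumably need matching changes.

```python
import gzip
import sqlite3


def iter_rows(handle):
    """Yield (accession.version, taxid) pairs, skipping the header line."""
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Skip the header and any malformed or empty line.
        if len(fields) < 3 or fields[0] == "accession":
            continue
        yield fields[1], int(fields[2])


def build_accession_db(accession2taxid_path, db_path):
    """Load an NCBI accession2taxid file (plain or .gz) into SQLite.

    The table layout here is hypothetical, not SURPI's own schema.
    """
    opener = gzip.open if accession2taxid_path.endswith(".gz") else open
    conn = sqlite3.connect(db_path)
    c = conn.cursor()
    # The accession is stored as TEXT because it is not numeric like a GI.
    c.execute(
        "CREATE TABLE IF NOT EXISTS acc_taxid "
        "(acc TEXT PRIMARY KEY, taxid INTEGER)"
    )
    with opener(accession2taxid_path, "rt") as handle:
        # '?' placeholders let sqlite3 quote the values itself.
        c.executemany(
            "INSERT OR REPLACE INTO acc_taxid VALUES (?, ?)",
            iter_rows(handle),
        )
    conn.commit()
    return conn
```

It could be called as `build_accession_db("nucl_gb.accession2taxid.gz", "acc_taxid_nucl.db")`, where the output filename is again only a placeholder. The key point is the parameterized `VALUES (?, ?)` form, which avoids the quoting problem regardless of which table or column names are used.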