srosales712 closed this issue 3 years ago
Oh I'm so sorry, I missed this issue.
It seems these accessions are not found in prot.accession2taxid.gz.
Oh I know, they are NUCLEOTIDE sequences:
https://www.ncbi.nlm.nih.gov/search/all/?term=AB111947.1
You need nucl_gb.accession2taxid.gz; prot.accession2taxid.gz is for protein records.
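For anyone who wants to do the lookup without csvtk, here is a minimal sketch using only gzip and awk. It assumes the usual accession2taxid layout (tab-separated, one header line, columns accession, accession.version, taxid, gi) and an accession list with one accession.version per line. The function name acc2taxid and the file path in the usage comment are hypothetical, so adjust them to your setup.

```shell
# Hypothetical helper: look up accession.version values in an accession2taxid file.
#   $1: gzipped accession2taxid file (tab-separated, with a header line)
#   $2: text file with one accession.version per line (duplicates are fine)
# Prints the header plus one "accession.version<TAB>taxid" line per match.
acc2taxid() {
    gzip -dc "$1" | awk -F'\t' -v list="$2" '
        BEGIN {
            # Load the wanted accessions into a lookup table.
            while ((getline line < list) > 0) wanted[line] = 1
            close(list)
        }
        # Keep the header row and every row whose accession.version is wanted.
        NR == 1 || ($2 in wanted) { print $2 "\t" $3 }
    '
}

# Example call (the path is an assumption, mirroring the one in the question):
#   acc2taxid /taxdb2020/nucl_gb.accession2taxid.gz acc.txt > output.txt
```

If output.txt contains only the header line, none of the accessions are in that file, which usually means the wrong database (protein vs. nucleotide) is being searched. The same check works with the original pipeline by swapping in nucl_gb.accession2taxid.gz for prot.accession2taxid.gz.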
Hi, I'm trying to go from BLAST output to taxIDs. I parsed the BLAST accession numbers into a single file and then wanted to verify that they are in the prot.accession2taxid.gz database.
To do this I ran:
pigz -dc /taxdb2020/prot.accession2taxid.gz | csvtk grep -t -f accession.version -P acc.txt > output.txt
head acc.txt
AB111947.1 AB111947.1 MK072403.1 MK072403.1 MK072403.1
My output.txt file comes out empty. Are there any suggestions for going from a list of accessions to taxIDs?