Closed stas-malavin closed 1 year ago
csvtk is enough for mapping accession2taxid.
Retrieving the accession2taxid records for the accessions of interest (listed in acc.txt).
cat prot.accession2taxid.gz \
| csvtk grep -t -f accession.version -P acc.txt \
| csvtk cut -t -f accession.version,taxid \
| csvtk del-header -t \
> prot.accession2taxid.tsv
Querying the taxid for each accession in data.tsv.
cat data.tsv \
| csvtk mutate -H -t -f 1 \
| csvtk replace -H -t -f 2 -k prot.accession2taxid.tsv -p '(.+)' -r '{kv}'
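If csvtk is not at hand, roughly the same lookup can be sketched with gzip and awk. This is not the method above, only a stand-in under a few assumptions: data.tsv is a non-empty, headerless file with the accession.version in its first column, and the accession2taxid file has a header line followed by the columns accession, accession.version, taxid, gi. The function name map_acc2taxid is hypothetical, and accessions with no match get an empty taxid here.

```shell
# Hypothetical helper (not part of csvtk): print each accession from a
# headerless TSV together with its taxid from a gzipped accession2taxid file.
# Usage: map_acc2taxid prot.accession2taxid.gz data.tsv
map_acc2taxid() {
    gzip -dc "$1" | awk -F'\t' -v OFS='\t' '
        # First input (the data file): remember each accession and its order.
        NR == FNR { order[++n] = $1; wanted[$1] = 1; next }
        # Second input (the table on stdin): skip the header and keep taxids
        # only for wanted accessions, so memory use stays small.
        FNR > 1 && ($2 in wanted) { taxid[$2] = $3 }
        # Emit accession and taxid in the original input order.
        END { for (i = 1; i <= n; i++) print order[i], taxid[order[i]] }
    ' "$2" -
}
```

For example, `map_acc2taxid prot.accession2taxid.gz data.tsv > data.taxid.tsv` would write a two-column table of accession and taxid. Only the accessions of interest are held in memory, while the large table is streamed once.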
That's beautiful, thank you! I really need to master csvtk!
Hi, I'd love to be able to assign taxids to accession numbers locally using NCBI's nucl_gb.accession2taxid.gz, nucl_wgs.accession2taxid.gz, and prot.accession2taxid.gz. This is possible with the R package taxonomizr.