Closed charlesfoster closed 5 months ago
taxonkit reformat can output taxids.
$ taxonkit reformat -h
-t, --show-lineage-taxids show corresponding taxids of reformated lineage
$ echo 562 \
| taxonkit reformat -I 1 -t -r NA -R 0
562 Bacteria;Pseudomonadota;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;Escherichia coli 2;1224;1236;91347;543;561;562
So you just need to separate them into multiple columns.
echo 562 \
| taxonkit reformat -I 1 -t -r NA -R 0 \
| csvtk -H -t sep -f 2 -s ';' -R \
| csvtk -H -t sep -f 2 -s ';' -R \
| csvtk add-header -t -n "taxid,kingdom,phylum,class,order,family,genus,species,kingdom_taxid,phylum_taxid,class_taxid,order_taxid,family_taxid,genus_taxid,species_taxid" \
| csvtk pretty -t
taxid   kingdom    phylum           class                 order              family               genus         species            kingdom_taxid   phylum_taxid   class_taxid   order_taxid   family_taxid   genus_taxid   species_taxid
-----   --------   --------------   -------------------   ----------------   ------------------   -----------   ----------------   -------------   ------------   -----------   -----------   ------------   -----------   -------------
562     Bacteria   Pseudomonadota   Gammaproteobacteria   Enterobacterales   Enterobacteriaceae   Escherichia   Escherichia coli   2               1224           1236          91347         543            561           562
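If csvtk is not available, the same column split can be approximated with plain awk. This is only a sketch. It assumes the three tab-separated fields shown above for taxid 562 (taxid, ';'-joined lineage, ';'-joined lineage taxids) and exactly seven ranks per record. The function name split_lineage is made up for illustration and is not part of taxonkit or csvtk.

```shell
#!/bin/sh
# Sketch: split the two ';'-joined fields produced by
# `taxonkit reformat -I 1 -t` into one tab-separated column per rank.
# split_lineage is a hypothetical helper name, not a taxonkit/csvtk command.
split_lineage() {
    awk -F'\t' -v OFS='\t' '
        BEGIN {
            # Header assumes the seven default ranks, names first, then taxids.
            print "taxid", "kingdom", "phylum", "class", "order", "family", "genus", "species", \
                  "kingdom_taxid", "phylum_taxid", "class_taxid", "order_taxid", \
                  "family_taxid", "genus_taxid", "species_taxid"
        }
        {
            gsub(/;/, OFS, $2)   # lineage names  -> one column per rank
            gsub(/;/, OFS, $3)   # lineage taxids -> one column per rank
            print
        }'
}

# Demo on the line shown above, so no taxonomy database is needed:
printf '562\tBacteria;Pseudomonadota;Gammaproteobacteria;Enterobacterales;Enterobacteriaceae;Escherichia;Escherichia coli\t2;1224;1236;91347;543;561;562\n' \
    | split_lineage

# With the real tools this would presumably be used as:
#   echo 562 | taxonkit reformat -I 1 -t -r NA -R 0 | split_lineage
```

Unlike csvtk sep, this does not pad or validate the number of ranks, so a record with missing levels would shift columns. The -r NA option in the original command is what keeps every rank slot filled.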
Perfect! Thank you for the swift and helpful response.
Thanks for the great tool. I'm currently using it like so, based on the wiki:
$ taxonkit lineage taxids.txt | taxonkit reformat -I 1 -t -r NA -R 0 | csvtk -H -t cut -f 1,3 | csvtk -H -t sep -f 2 -s ';' -R | csvtk add-header -t -n taxid,kingdom,phylum,class,order,family,genus,species | csvtk pretty -t | head
Output:
Is there an easy option I am missing to also add in the corresponding taxids of each of the desired ranks into their own columns?
When I run the following, for example, I get the taxids for each rank:
However, I'm just not sure how to take this to the next step, and also combine it with my initial commands. Ideally, I would be able to get the headers "taxid,kingdom,phylum,class,order,family,genus,species,kingdom_taxid,phylum_taxid,class_taxid,order_taxid,family_taxid,genus_taxid,species_taxid".
If need be, I will write a longer script (e.g. in Python) to do this step, but I'm just checking to make sure I'm not missing something in your great tools.
Thanks.