Open AlenaYoung opened 1 month ago
csvtk pretty is for formatting a table so it is readable in the terminal; its output is no longer a tab- or comma-delimited file. Write the delimited file without the pretty step, and use pretty only for viewing:
$ taxonkit lineage <(echo 9606) \
| taxonkit reformat -r NA -R 0 \
| csvtk -H -t cut -f 1,3 \
| csvtk -H -t sep -f 2 -s ';' -R \
| csvtk add-header -t -n taxid,kingdom,phylum,class,order,family,genus,species \
> taxid_out.csv
$ cat taxid_out.csv
taxid kingdom phylum class order family genus species
9606 Eukaryota Chordata Mammalia Primates Hominidae Homo Homo sapiens
$ csvtk pretty -t taxid_out.csv -S grid
+-------+-----------+----------+----------+----------+-----------+-------+--------------+
| taxid | kingdom   | phylum   | class    | order    | family    | genus | species      |
+=======+===========+==========+==========+==========+===========+=======+==============+
| 9606  | Eukaryota | Chordata | Mammalia | Primates | Hominidae | Homo  | Homo sapiens |
+-------+-----------+----------+----------+----------+-----------+-------+--------------+
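To illustrate why a whitespace-based reader miscounts columns here, below is a minimal sketch that uses only printf and awk (no taxonkit or csvtk needed). The record is the 9606 row from the example above; the variable names are just for illustration. It counts fields once with a tab separator and once with awk's default whitespace splitting, which is roughly what R's read.table does when no sep is given.

```shell
# One tab-delimited record, copied from the taxid_out.csv example above.
row=$(printf '9606\tEukaryota\tChordata\tMammalia\tPrimates\tHominidae\tHomo\tHomo sapiens')

# Splitting on tabs keeps "Homo sapiens" together as a single field.
tab_fields=$(printf '%s\n' "$row" | awk -F'\t' '{ print NF }')

# Splitting on any whitespace breaks "Homo sapiens" into two fields.
ws_fields=$(printf '%s\n' "$row" | awk '{ print NF }')

echo "tab-split: $tab_fields fields"
echo "whitespace-split: $ws_fields fields"
```

The tab split gives 8 fields and the whitespace split gives 9, so any taxon name containing a space shifts the columns. On the R side, reading the tab-delimited file with read.table(..., sep = "\t") avoids this; depending on the names in the data, the quote and comment.char arguments may also need adjusting.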
btw, -j 120 does not help:
-j, --threads int number of CPUs. 4 is enough (default 4)
Hi,
I want to get taxonomy info by running taxonkit and csvtk with -t, which should make it easy to import the result into R. But R and Excel seem to have trouble importing the result. Some lines, such as those containing (et al.2015) and Sedi, can't be imported correctly. In R, the imported result is either all crammed into a single line (read.csv) or has more rows and columns than the actual result (read.table). My script is shown below:
taxonkit lineage taxid.txt -j 120 | taxonkit reformat -r NA -R 0 -j 120 | csvtk -H -t cut -f 1,3 | csvtk -H -t sep -f 2 -s ';' -R | csvtk add-header -t -n taxid,kingdom,phylum,class,order,family,genus,species | csvtk pretty -t -o taxid_out.csv
My R script is shown below:
test2 <- read.table("taxid_out.csv",header = TRUE)
The output file I get is as follows. taxid_out.csv
Any help will be much appreciated. Thank you in advance,
Alena