shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

Append same header name from nth column to the last column. #197

Closed. MostafaYA closed this issue 2 years ago

MostafaYA commented 2 years ago

Describe your issue

Hi, thank you very much for this great tool. I am using the add-header subcommand to add headers to an automatically generated table. The number of columns in this table differs every time it is created, and I want to give every column from the third to the last the header name "Gene", something like this:

sample  name    Gene    Gene    ...     Gene
16S0524 ND      100.00  7       2       7
16S0525 ND      94.03   19      4       5

I currently use the following workaround:

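ncol=$(csvtk ncol -t "$input")
# "Gene," is emitted ncol - 4 times, followed by a final "Gene":
# ncol - 3 names for the columns left after sample and name once the last column is dropped.
ncol_gene=$((ncol - 4))
header=$(for i in $(seq "$ncol_gene"); do printf 'Gene,'; done && echo "Gene")
# -f -N excludes column N, here the last one.
csvtk cut -t -f "-$ncol" "$input" | csvtk -t add-header -n "sample,name,$header" | csvtk csv2xlsx --tabs -o output.excel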

I wonder if such a feature already exists in csvtk and I am not aware of it.

Thank you

shenwei356 commented 2 years ago

Here's one way. When no column names are given, add-header adds placeholder column names (c1, c2, c3, ...), which can then be renamed:

$ seq 5 | csvtk transpose -Ht
1       2       3       4       5

$ seq 5 | csvtk transpose -Ht \
    | csvtk add-header -t
[WARN] colnames not given, c1, c2, c3... will be used
c1      c2      c3      c4      c5
1       2       3       4       5

$ seq 5 | csvtk transpose -Ht \
    | csvtk add-header -t \
    | csvtk rename -t -f 1,2 -n sample,name
[WARN] colnames not given, c1, c2, c3... will be used
sample  name    c3      c4      c5
1       2       3       4       5

$ seq 5 | csvtk transpose -Ht \
    | csvtk add-header -t \
    | csvtk rename -t -f 1,2 -n sample,name \
    | csvtk rename2 -t -f -1,-2 -p '.+' -r Gene
[WARN] colnames not given, c1, c2, c3... will be used
sample  name    Gene    Gene    Gene
1       2       3       4       5
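
As an alternative to renaming the placeholder columns, the header names could be built explicitly, as the original workaround does. This is a minimal sketch, assuming the first two columns are always sample and name; gene_header is a hypothetical helper, not part of csvtk:

```shell
# Hypothetical helper: print "sample,name" followed by one "Gene"
# for every remaining column, given the total number of columns.
# The body runs in a subshell so its variables do not leak to the caller.
gene_header() (
    ncol=$1
    header="sample,name"
    i=3
    while [ "$i" -le "$ncol" ]; do
        header="$header,Gene"
        i=$((i + 1))
    done
    printf '%s\n' "$header"
)
```

The result can be passed to add-header in the same way as in the workaround, for example csvtk add-header -t -n "$(gene_header "$ncol")".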

MostafaYA commented 2 years ago

Thank you very much for your quick reply. It worked perfectly. Another question: How can I use csvtk cut to discard only the last column?

shenwei356 commented 2 years ago

Hmm, there's no direct way; you have to know the number of columns first:

$ csvtk cut -t -f 1-$(expr $(csvtk ncol -t data.tsv) - 1) data.tsv
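
The arithmetic in this command can be sketched as a small helper that also rejects tables with fewer than two columns, where the range would otherwise become 1-0. The name range_without_last is hypothetical and not a csvtk feature:

```shell
# Hypothetical helper: given the number of columns N, print the
# field range "1-(N-1)" that selects every column except the last.
range_without_last() {
    if ! [ "$1" -ge 2 ] 2>/dev/null; then
        echo "need a column count of at least 2" >&2
        return 1
    fi
    printf '1-%d\n' "$(($1 - 1))"
}

# Usage (requires csvtk; data.tsv as in the command above):
#   csvtk cut -t -f "$(range_without_last "$(csvtk ncol -t data.tsv)")" data.tsv
```

The workaround at the top of this thread takes a different route and passes the column count as a negative field (-f -$ncol) to exclude that column. Either way, the column count has to be computed first.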