shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

transpose help #239

Closed MostafaYA closed 1 year ago

MostafaYA commented 1 year ago

Describe your issue

Hi, I have this example file, example.txt:

id pattern value
s1 a 1
s1 b 2
s1 c 3
s1 d 4
s1 e 5
s1 f 6
s1 g 7
s1 h 8
s1 i 9
s1 j 10
s2 a 1
s2 b 2
s2 c 3
s2 d 4
s2 e 5
s2 f 6
s2 g 7
s2 h 8
s2 i 9
s2 j 10
s3 a 1
s3 b 2
s3 c 3
s3 d 4
s3 e 5
s3 f 6
s3 g 7
s3 h 8
s3 i 9
s3 j 10
s3 k 11

I want to get output like the one below, with the pattern values as the header and the id as the row name.

pattern a b c d e f g h i j
s1 1 2 3 4 5 6 7 8 9 10
s2 1 2 3 4 5 6 7 8 9 10
s3 1 2 3 4 5 6 7 8 9 10

I am currently doing the following, which looks like a rather inelegant solution. I believe there might be functionality within csvtk to do this better, but I am not aware of it. Can you please guide me?

cat example.tsv  | csvtk split -t -f 1 -o example-split

for i in example-split/*.tsv; do
  name=$(basename "$i")
  cat "$i" | csvtk rename -t -f 3 -n "$name" -o example-split/renamed_"$name"
done

for i in example-split/renamed_*; do
  cat "$i" | csvtk cut -t -f 2,3 | csvtk transpose -t -o "$i"_transposed
done

csvtk concat -t example-split/renamed_*transposed | csvtk pretty -t -S bold

This gives these results:

pattern a b c d e f g h i j
stdin-s1.tsv 1 2 3 4 5 6 7 8 9 10
stdin-s2.tsv 1 2 3 4 5 6 7 8 9 10
stdin-s3.tsv 1 2 3 4 5 6 7 8 9 10
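
The row labels above keep the `stdin-` prefix and `.tsv` suffix because the split file names are passed straight to `csvtk rename -n`. A minimal sketch of trimming them with POSIX parameter expansion follows; the helper name `clean_label` is hypothetical, and the prefix and suffix are assumed to be exactly those seen in the output above.

```shell
# Hypothetical helper: turn "example-split/stdin-s1.tsv" into "s1".
clean_label() {
  label=${1##*/}         # drop any leading directory part
  label=${label#stdin-}  # drop the "stdin-" prefix seen in the split file names
  label=${label%.tsv}    # drop the ".tsv" extension
  printf '%s\n' "$label"
}

# The paths are only strings here; no files need to exist.
for f in example-split/stdin-s1.tsv example-split/stdin-s2.tsv example-split/stdin-s3.tsv; do
  clean_label "$f"
done
```

In the first loop of the workaround, `name=$(clean_label "$i")` could then supply the new column name, while the output file name would still need the original base name.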

Thank you

MostafaYA commented 1 year ago

I just realised there is another solution outside csvtk.

datamash -s --header-in --header-out crosstab 1,2 < example.tsv

It would, however, be nice to have this in csvtk to reduce dependencies.
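
Where no extra tool is available at all, the same long-to-wide reshaping can be sketched with plain awk. This is only an illustration of the technique, not how csvtk or datamash implement it. The function name `pivot_wide` is hypothetical, and the sketch assumes tab-separated input with a header row and the columns in the order id, key, value.

```shell
# Hypothetical helper: pivot long-format TSV (id, key, value) into a wide table.
# Ids and keys keep their order of first appearance; a missing id/key
# combination is left as an empty cell.
pivot_wide() {
  awk -F'\t' -v OFS='\t' '
    NR == 1 { idname = $1; next }   # remember the header of the id column
    {
      if (!($1 in seen_id))  { seen_id[$1] = 1;  ids[++n_id] = $1 }
      if (!($2 in seen_key)) { seen_key[$2] = 1; keys[++n_key] = $2 }
      val[$1, $2] = $3              # a later duplicate overwrites an earlier one
    }
    END {
      line = idname
      for (j = 1; j <= n_key; j++) line = line OFS keys[j]
      print line
      for (i = 1; i <= n_id; i++) {
        line = ids[i]
        for (j = 1; j <= n_key; j++) line = line OFS val[ids[i], keys[j]]
        print line
      }
    }
  ' "$@"
}

# Usage with a shortened version of the example data:
printf 'id\tpattern\tvalue\ns1\ta\t1\ns1\tb\t2\ns2\ta\t3\ns2\tc\t4\n' | pivot_wide
```

With the full example file, `pivot_wide example.txt` would be the equivalent call, and the `k` column would be empty for s1 and s2 because only s3 has that pattern.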

shenwei356 commented 1 year ago

I will implement it.

#91, #236

shenwei356 commented 1 year ago

Implemented. Use binaries here: https://github.com/shenwei356/csvtk/issues/91#issuecomment-1674416824

$ csvtk spread -t -k pattern -v value example.txt | csvtk csv2md -t
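|id |a  |b  |c  |d  |e  |f  |g  |h  |i  |j  |k  |
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
|s1 |1  |2  |3  |4  |5  |6  |7  |8  |9  |10 |   |
|s2 |1  |2  |3  |4  |5  |6  |7  |8  |9  |10 |   |
|s3 |1  |2  |3  |4  |5  |6  |7  |8  |9  |10 |11 |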