shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License
1.01k stars 84 forks source link

Horizontal file concatenation #139

Open aborruso opened 3 years ago

aborruso commented 3 years ago

Hi, is there a way to concatenate two files horizontally?

Starting from

field1,field2
0,a
1,b
3,a

and

field3,field4
0,a
1,b
3,a

obtain

field1,field2,field3,field4
0,a,0,a
1,b,1,b
3,a,3,a

Thank you

shenwei356 commented 3 years ago

csvtk join? or shell paste?

I'm not sure whether the two files have the same number of rows or share common fields values.

tseemann commented 3 years ago

I had a similar problem last night which I could not solve using csvtk

I want to merge two CSV files They share an ID column (key) The do NOT have any IDs in common They have some columns in common The have some columns different Columns are in different order

join/merge --outer is not correct concat i hoped would work but it does not

Example:

# FILE 1
ID,A,B,C
1,a,b,c
2,d,e,f

# FILE 2
ID,C,A,D,E
3,g,h,i,j
4,k,l,m,n
5,,,,

# "MERGE"
ID,A,B,C,D,E
1,a,b,c,,
2,d,e,f,,
3,h,,g,i,j
4,l,,k,m,n
5,,,,,
aborruso commented 3 years ago

I'm not sure whether the two files have the same number of rows or share common fields values.

In my example they have the same number of rows and they not share any fields. I want only to know if it's possible to do in csvtk whate paste does (but it's not aware of CSV format).

Thank you

shenwei356 commented 3 years ago

In my example they have the same number of rows and they not share any fields. I want only to know if it's possible to do in csvtk what paste does (but it's not aware of CSV format).

Sorry, it does not support this.

shenwei356 commented 3 years ago

I had a similar problem last night which I could not solve using csvtk

Again, let's use transpose

csvtk transpose f1.csv > f1.t.csv
csvtk transpose f2.csv > f2.t.csv

csvtk join f1.t.csv f2.t.csv -O | csvtk transpose | csvtk pretty 
ID   A   B   C    D   E
--   -   -   --   -   -
1    a   b   c        
2    d   e   f        
3    h       g    i   j
4    l       k    m   n
5
shenwei356 commented 3 years ago

In my example they have the same number of rows and they not share any fields. I want only to know if it's possible to do in csvtk what paste does (but it's not aware of CSV format).

There's a dirty solution by adding a column of row number, which can be removed after joining.

$ csvtk mutate f1.csv -n n | csvtk replace -f n -p '.+' -r '{nr}'
field1,field2,n
0,a,1
1,b,2
3,a,3

$ csvtk mutate f1.csv -n n | csvtk replace -f n -p '.+' -r '{nr}' > f1.n.csv
$ csvtk mutate f2.csv -n n | csvtk replace -f n -p '.+' -r '{nr}' > f2.n.csv

$ csvtk join -f n f1.n.csv f2.n.csv | csvtk cut -f '-n'
field1,field2,field3,field4
0,a,0,a
1,b,1,b
3,a,3,a
aborruso commented 3 years ago

@shenwei356 could I add the "paste" command as csvtk feature request?

shenwei356 commented 3 years ago

You can just use shell paste

$ paste -d , a.csv b.csv 
field1,field2,field3,field4
0,a,0,a
1,b,1,b
3,a,3,a

Or csvtk transpose + concat

$ csvtk concat -H <(csvtk transpose a.csv) <(csvtk transpose b.csv) | csvtk transpose 
field1,field2,field3,field4
0,a,0,a
1,b,1,b
3,a,3,a
aborruso commented 3 years ago

You can just use shell paste

I think no, because paste is no CSV aware and if I have CSV that has a comma inside a field "a classic, content" I think I will have problems using paste.

I'm asking it as a feature proposal because it does not exist a cli tool that has paste command CSV aware.