shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License
999 stars 84 forks source link

[csv2json] format as column array #199

Closed mando7 closed 1 year ago

mando7 commented 1 year ago

Hi, Is it possible to obtain this output ?

From csv

id,age,gender
1,39,M
2,25,M

To json column array minify

[{"id":["1","2"]},{"age":["39","25"]},{"gender":["M","M"]}]

OR json column array

[
    {
        "id":[
            "1",
            "2"
        ]
    },
    {
        "age":[
            "39",
            "25"
        ]
    },
    {
        "gender":[
            "M",
            "M"
        ]
    }
]

Thank you in advance for your reply.

shenwei356 commented 1 year ago

Sorry, not supported.

mando7 commented 1 year ago

thank you for your answer, maybe in a next version ?

tetedange13 commented 1 year ago

Hi @mando7,

I could achieve something close to what you want :

Input CSV

==> test.csv <==
id,age,gender
1,39,M
2,25,M
3,36,F

Command-lines

One-liner

csvtk gather -k item -v value -Ff "*" test.csv |
    csvtk summary -s';' -g item -f value:collapse |
    csvtk del-header | csvtk transpose | csvtk csv2json

Detailed steps with corresponding output:

1) Use csvtk gather on all fields (= -Ff "*" ) to create key-value pairs

item     value
------   -----
id       1
age      39
gender   M
id       2
age      25
gender   M
id       3
age      36
gender   N


2) Use csvtk summary to group by item and collapse all values of each "field" into a ';'-delimited list (all id values, all age values ...) (this step can also be achieve through csvtk fold -s';' -f item -v value, but not sure what difference it makes)

item     value:collapse
------   --------------
age      39;25;36
gender   M;M;N
id       1;2;3


3) Remove useless header, transpose back and convert to JSON

Final output

[
  {
    "id": "1;2;3",
    "age": "39;25;36",
    "gender": "M;M;F"
  }
]


Hope this helped ! Have a nice day, Felix.