shenwei356 / csvtk

A cross-platform, efficient and practical CSV/TSV toolkit in Golang
http://bioinf.shenwei.me/csvtk
MIT License

--lazy-quotes does not work for fields that start with a quote but do not end with one #260

Closed: shenwei356 closed this issue 9 months ago

shenwei356 commented 9 months ago
$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n" 
a,"Cellvibrio" Winogradsky 1929,c

$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n" | csvtk pretty -l
a   Cellvibrio" Winogradsky 1929,c 
-   -------------------------------

# `csvtk fix` did not fix this.
$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n" | csvtk fix -l | csvtk pretty -l
[INFO] the maximum number of columns in all 1 rows: 2
a   Cellvibrio" Winogradsky 1929,c 
-   -------------------------------

However, when the quotes are only in the middle of a field, there's no problem.

$ echo -ne "a,Cellvibrio\" Winogradsky 1929,c\n"
a,Cellvibrio" Winogradsky 1929,c
$ echo -ne "a,Cellvibrio\" Winogradsky 1929,c\n" | csvtk pretty -l
a   Cellvibrio" Winogradsky 1929   c
-   ----------------------------   -

According to this answer, this CSV format is not valid.

shenwei356 commented 9 months ago

Added two commands, `fix-quotes` and `del-quotes`.

Try it:

$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n" 
a,"Cellvibrio" Winogradsky 1929,c

$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n"  | csvtk fix-quotes | csvtk cut -f 1-
a,"""Cellvibrio"" Winogradsky 1929",c

$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n"  | csvtk fix-quotes | csvtk pretty 
a   "Cellvibrio" Winogradsky 1929   c
-   -----------------------------   -

$ echo -ne "a,\"Cellvibrio\" Winogradsky 1929,c\n"  | csvtk fix-quotes | csvtk del-quotes 
a,"Cellvibrio" Winogradsky 1929,c