wireservice / csvkit

A suite of utilities for converting to and working with CSV, the king of tabular file formats.
https://csvkit.readthedocs.io
MIT License
5.9k stars, 605 forks

Question: parsing text column-wise #1243

Closed by kiteloopdesign 2 months ago

kiteloopdesign commented 2 months ago

Hi, I'd like to ask a question, although I don't think this is possible with the current set of csvkit utilities. Is it possible to apply a regex substitution to a single column? For instance, I'd like a (hypothetical) command like the one below to turn the first CSV into the second:

col1, col2, col3
str1, str2-remove, str3

csvparse -c 2 --regexp 's/str2-remove/str2/'

col1, col2, col3
str1, str2, str3

Thanks

jpmckinney commented 2 months ago

No - that tool was requested here, but csvkit is not adding new tools: https://github.com/wireservice/csvkit/issues/1057

That said, you can use csvsql, e.g.:

printf 'col1, col2, col3\nstr1, str2-remove, str3' | csvsql --query "SELECT col1, REPLACE(\" col2\", '-remove', '') AS \" col2\", \" col3\" FROM stdin"

The awkward quoting is needed because your CSV has a space after each comma, so the second and third column names are actually " col2" and " col3", with a leading space.
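
If a regex is needed rather than a fixed-string REPLACE, a few lines of plain Python can do the same job. This is only a minimal sketch using the standard library's csv and re modules; the function name sub_column is made up for illustration and is not part of csvkit. It assumes the input has a header row, and it uses skipinitialspace=True so that the column can be addressed as "col2" instead of " col2".

```python
import csv
import io
import re


def sub_column(text, column, pattern, repl):
    """Apply re.sub to one named column of CSV text and return the new CSV text.

    Hypothetical helper for illustration; not a csvkit function.
    """
    # skipinitialspace=True drops the space that follows each delimiter,
    # so the header reads "col2" rather than " col2".
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Only the requested column is rewritten; other cells pass through.
        row[column] = re.sub(pattern, repl, row[column])
        writer.writerow(row)
    return out.getvalue()


result = sub_column(
    "col1, col2, col3\nstr1, str2-remove, str3\n",
    "col2",
    r"-remove$",
    "",
)
print(result, end="")
```

Note that, unlike the csvsql command above, this sketch writes the output without the space after each comma, since the leading spaces are dropped on read.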

kiteloopdesign commented 2 months ago

Thanks a lot for your quick reply and for the amazing toolset!

jpmckinney commented 2 months ago

Btw, the unmaintained csvmedkit package has a "csvsed" command. I haven't tried it: https://pypi.org/project/csvmedkit/

jpmckinney commented 2 months ago

I've reopened #1057 for consideration.