aotimme / gocsv

Command-line CSV processing utility.
MIT License
198 stars, 21 forks

Regex Replace Transformations #2

Closed: AurielleP closed this issue 7 years ago

AurielleP commented 7 years ago

I would like to request the ability to run basic regex transformations on a column.

Something along the lines of:

--regexreplace "regex string to match" "replacement regex"

For example, there are a few main ways I generally use regexes, two of which use capture groups:

Ex. 1: I have a column that contains only DataFox URLs, and for any cell that matches the regex I would like to keep only the id or slug at the end:

Replacement with an empty string:

--regexreplace "^http://datafox.com/.*/" ""

Replacement with the second capture group:

--regexreplace "(http://datafox.com/.*/)(\w{24})" "$2"

Reformatting by swapping capture groups and adding a space in between, where all characters in the replacement are literal except for the capture group # references:

--regexreplace "(last name), (first name)" "$2 $1"

Another replacement with an empty string, so that only cells without those matches keep their values:

--regexreplace ".*@gmail.com|.*@aol.com|.*@hotmail.com" ""
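
To make the requested behavior concrete, here is a minimal sketch in Go using the standard `regexp` package. It is not gocsv's implementation: the helper name `replaceColumn`, the sample cell values, and the `(\w+), (\w+)` pattern standing in for the "(last name), (first name)" placeholder are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceColumn is a hypothetical helper (not part of gocsv). It applies one
// regex replacement to every cell of a column and returns the new cells.
// In the replacement string, $1, $2, ... refer to capture groups.
func replaceColumn(cells []string, pattern, repl string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	out := make([]string, len(cells))
	for i, cell := range cells {
		out[i] = re.ReplaceAllString(cell, repl)
	}
	return out, nil
}

func main() {
	// Sample data is made up for this sketch.
	urls := []string{"http://datafox.com/companies/0123456789abcdef01234567"}
	names := []string{"Smith, Jane"}
	emails := []string{"a@gmail.com", "b@datafox.com"}

	examples := []struct {
		cells   []string
		pattern string
		repl    string
	}{
		{urls, `^http://datafox.com/.*/`, ""},            // strip the URL prefix
		{urls, `(http://datafox.com/.*/)(\w{24})`, "$2"}, // keep the second group
		{names, `(\w+), (\w+)`, "$2 $1"},                 // swap the two groups
		{emails, `.*@gmail.com|.*@aol.com|.*@hotmail.com`, ""},
	}
	for _, ex := range examples {
		out, err := replaceColumn(ex.cells, ex.pattern, ex.repl)
		if err != nil {
			fmt.Println("bad pattern:", err)
			continue
		}
		fmt.Printf("%q\n", out)
	}
}
```

Note that the unescaped `.` in these patterns matches any character, so `datafox.com` would also match `datafoxXcom`; writing `datafox\.com` is stricter.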

aotimme commented 7 years ago

See: https://github.com/DataFoxCo/gocsv#replace