larsyencken / csvdiff

Generate a diff between two tabular datasets expressed in CSV files.
BSD 3-Clause "New" or "Revised" License

Add support to tab separated csv files #50

Open pedroportasvieira opened 5 years ago

pedroportasvieira commented 5 years ago

Hi,

I was using this tool and could not get it to work with tab-separated files. Comma-separated files work fine, but tab-separated ones do not.

Is it viable to support this based on the current implementation?

PS: If so, and if it is not a priority, I can try to open a PR for this.

Thanks in advance

larsyencken commented 5 years ago

Hi Pedro, you have to use the --sep argument. You can check it by running csvdiff --help.

Tabs are still not easy. In the terminal you have to type:

csvdiff --sep='

and then, to enter a tab character, press <ctrl+v> followed by the Tab key, and the terminal will insert a literal tab. Then close the quote and add the other arguments. Give it a try.

pedroportasvieira commented 5 years ago

Also tried it but it gives an error.

ERROR: CSV parse error on line 2

larsyencken commented 5 years ago

Hmm... it's definitely possible on the command line, as well as via API, although I agree we could use a command-line argument to make it easier.

In the meantime, you can also convert to comma-separated and work with that. A package called csvkit has tools that are good for that.

e.g. to convert from tsv to csv

pip install csvkit
csvcut -t myfile.tsv > myfile.csv
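
If installing csvkit is not an option, the same conversion can be roughly approximated with awk. This is only a sketch: `tsv_to_csv` is a hypothetical helper name, not part of csvdiff or csvkit, and it assumes that no field contains an embedded tab or newline.

```shell
# Hypothetical helper (not part of csvdiff or csvkit):
# read TSV on stdin, write CSV on stdout.
# Assumption: fields never contain tabs or newlines.
tsv_to_csv() {
  awk 'BEGIN { FS = "\t"; OFS = "," }
  {
    for (i = 1; i <= NF; i++) {
      # Quote fields containing a comma or a double quote,
      # doubling any embedded double quotes as CSV expects.
      if ($i ~ /[",]/) {
        gsub(/"/, "\"\"", $i)
        $i = "\"" $i "\""
      }
    }
    $1 = $1   # force awk to rebuild the record using OFS
    print
  }'
}
```

Usage would look like `tsv_to_csv < myfile.tsv > myfile.csv`. For anything beyond simple files, a real CSV tool such as csvkit is the safer choice.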

simbo1905 commented 5 years ago

I tested passing a tab as --sep and it works fine, although passing a tab as a command-line argument in bash is a bit of a challenge because bash wants to do tab completion. A quick search turned up --sep $'\t', which works, as the output below shows:

 2019-05-22 05:59:15 ⌚  |2.4.4| MacBook-Pro-3 in ~/projects/csvdiff
± |master ?:27 ✗| → head a.tsv b.tsv 
==> a.tsv <==
id  name    amount
1   bob 20
2   eva 63
3   sarah   7
4   jeff    19
6   fred    10

==> b.tsv <==
id  name    amount
1   bob 20
2   eva 63
3   james   7
4   jeff    19
6   fred    10

 2019-05-22 05:59:17 ⌚  |2.4.4| MacBook-Pro-3 in ~/projects/csvdiff
± |master ?:27 ✗| → csvdiff --style=summary --sep $'\t' id a.tsv b.tsv 
0 rows removed (0.0%)
0 rows added (0.0%)
1 rows changed (20.0%)
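
For context on why the command above works: `$'\t'` is ANSI-C quoting, which bash (and zsh, ksh) expands into a single literal tab before csvdiff ever sees the argument. A minimal sketch follows; the variable names `sep` and `sep_posix` are illustrative only, and the `printf` form is an assumed fallback for shells that lack `$'...'`.

```shell
# ANSI-C quoting (bash, zsh, ksh): the shell turns $'\t' into one literal
# tab character before the command runs.
sep=$'\t'

# Plain POSIX sh has no $'...' syntax; printf produces the same character.
# (Command substitution strips only trailing newlines, so the tab survives.)
sep_posix="$(printf '\t')"

# Inspect the bytes to confirm the variable holds a tab.
printf '%s' "$sep" | od -c
```

Either variable can then be passed quoted, for example `csvdiff --style=summary --sep "$sep" id a.tsv b.tsv`, which avoids typing a literal tab at the prompt.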