larsyencken / csvdiff

Generate a diff between two tabular datasets expressed in CSV files.
BSD 3-Clause "New" or "Revised" License

does not detect duplicated lines #54

Open simbo1905 opened 5 years ago

simbo1905 commented 5 years ago

Summary

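When one file contains a row that is repeated verbatim (same id key, same values) and the other file contains that row only once, csvdiff appears to ignore the extra copy and reports that the files are identical.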

Minimal Reproducible Example

 2019-05-22 06:03:45 ⌚  |2.4.4| MacBook-Pro-3 in ~/projects/csvdiff
± |master ?:27 ✗| → head a.csv duplicate.csv 
==> a.csv <==
id,name,amount
1,bob,20
2,eva,63
3,sarah,7
4,jeff,19
6,fred,10

==> duplicate.csv <==
id,name,amount
1,bob,20
2,eva,63
3,sarah,7
4,jeff,19
4,jeff,19
6,fred,10

 2019-05-22 06:04:10 ⌚  |2.4.4| MacBook-Pro-3 in ~/projects/csvdiff
± |master ?:27 ✗| → csvdiff --style=summary id a.csv duplicate.csv 
files are identical
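
For anyone hitting the same thing, here is a minimal sketch of a pre-check that flags repeated keys before running the diff. It is plain standard-library Python, not part of csvdiff; the helper names `duplicate_keys` and `index_by_key` are made up for illustration. The second helper shows one plausible explanation for the result above, which is an assumption and not confirmed from the csvdiff source: if rows are indexed in a dict by key, a repeated key overwrites the earlier entry, so both files end up with the same index.

```python
import csv
import io
from collections import Counter

# The two files from the example above, inlined so the sketch is self-contained.
A_CSV = """\
id,name,amount
1,bob,20
2,eva,63
3,sarah,7
4,jeff,19
6,fred,10
"""

DUPLICATE_CSV = """\
id,name,amount
1,bob,20
2,eva,63
3,sarah,7
4,jeff,19
4,jeff,19
6,fred,10
"""


def duplicate_keys(csv_text, key_columns):
    """Return {key: count} for every key that occurs on more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[col] for col in key_columns) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


def index_by_key(csv_text, key_columns):
    """Index rows by key in a dict (hypothetical model of a key-based diff).

    A later row with the same key silently replaces the earlier one.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[col] for col in key_columns): row for row in reader}


# The pre-check reports the repeated key and how many times it occurs.
print(duplicate_keys(DUPLICATE_CSV, ["id"]))

# Under the dict-index assumption the two files are indistinguishable.
print(index_by_key(A_CSV, ["id"]) == index_by_key(DUPLICATE_CSV, ["id"]))
```

Running a check like this on each input first would at least surface the duplicate, whatever csvdiff itself does with it.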