aswinkarthik / csvdiff

A fast diff tool for comparing csv files
https://aswinkarthik.github.io/csvdiff/
MIT License
532 stars 57 forks

Comparing big files is slow #42

Open xidianwlc opened 4 years ago

xidianwlc commented 4 years ago

When a file is 1 TB, csvdiff is slow and uses too much CPU.

You could use a binary diff algorithm:

if xxHash(src multilines) == xxHash(dst multilines), then continue; otherwise compare that block line by line
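
A minimal sketch of that idea in Python, to make the suggestion concrete. This is not how csvdiff works internally. It assumes both files keep their rows in the same order, so that block N of the source lines up with block N of the destination. The function names (`chunks`, `digest`, `diff_by_blocks`) and the `block_size` parameter are hypothetical, and `hashlib.blake2b` stands in for xxHash, which is not part of the Python standard library.

```python
import hashlib
from itertools import islice, zip_longest


def chunks(lines, size):
    """Yield successive lists of up to `size` lines from an iterable."""
    it = iter(lines)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def digest(block):
    """Hash a block of lines (stand-in for xxHash; assumes lines hold no newlines)."""
    h = hashlib.blake2b(digest_size=8)
    for line in block:
        h.update(line.encode("utf-8"))
        h.update(b"\n")  # separator so ["ab"] and ["a", "b"] hash differently
    return h.digest()


def diff_by_blocks(src_lines, dst_lines, block_size=1000):
    """Yield (line_number, src_line, dst_line) for every line that differs.

    Blocks whose digests match are skipped without a line-by-line comparison.
    A missing line on either side is reported as None.
    """
    line_no = 0
    pairs = zip_longest(
        chunks(src_lines, block_size),
        chunks(dst_lines, block_size),
        fillvalue=[],
    )
    for src_block, dst_block in pairs:
        if digest(src_block) == digest(dst_block):
            # Treated as identical; a hash collision could hide a real difference.
            line_no += len(src_block)
            continue
        # Digests differ: fall back to comparing this block line by line.
        for s, d in zip_longest(src_block, dst_block):
            line_no += 1
            if s != d:
                yield (line_no, s, d)
```

For real files, `src_lines` and `dst_lines` could be generators that read each file lazily and strip the trailing newline, so neither file has to fit in memory. Note that a single inserted or deleted row shifts every later block, so this positional approach only pays off when the two files are mostly aligned.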

aswinkarthik commented 4 years ago

Would you be able to share some stats about this?

  1. How big were the base and delta files?
  2. Did it finish? If so, how much time did it take?