larsyencken / csvdiff

Generate a diff between two tabular datasets expressed in CSV files.
BSD 3-Clause "New" or "Revised" License

Approximately equal numeric fields #2

Open dan-blanchard opened 9 years ago

dan-blanchard commented 9 years ago

It'd be really nice if there was some way to say "Consider numeric values equivalent if they're equal to a certain number of decimal places."

Here's a simple function for doing so from Stack Overflow:

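def nearly_equal(a, b, sig_fig=5):
    # Equal if identical, or if both values match after being scaled by
    # 10**sig_fig and truncated toward zero (sig_fig counts decimal places).
    return (a == b or
            int(a * 10**sig_fig) == int(b * 10**sig_fig))

larsyencken commented 9 years ago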

That's an excellent suggestion.

Out of curiosity, what's your use case? What types of things are you comparing?

dan-blanchard commented 9 years ago

I'm comparing the output of a system that generates many columns of floats. Because of floating point precision nonsense, when I run it on different machines the outputs differ after the 9th decimal and I don't care about those differences. I'm trying to find more substantial differences that would indicate an error.
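
As a rough illustration of the comparison described here, the sketch below treats two CSV fields as equal when both parse as floats and differ by no more than an absolute tolerance. It relies on `math.isclose` from the Python standard library. The function name `fields_match` and the default `abs_tol=1e-9` are hypothetical, chosen only to mirror the "9th decimal" case, and are not part of csvdiff.

```python
import math


def fields_match(a: str, b: str, abs_tol: float = 1e-9) -> bool:
    """Return True if two CSV field strings are identical or numerically close.

    Hypothetical helper for illustration; not part of csvdiff.
    """
    # Identical strings always match, numeric or not.
    if a == b:
        return True
    try:
        x, y = float(a), float(b)
    except ValueError:
        # At least one field is not numeric, so the string mismatch stands.
        return False
    # rel_tol=0.0 makes this a purely absolute comparison.
    return math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol)
```

A tolerance-based check like this avoids one quirk of truncating to a fixed number of places: two values that straddle a rounding boundary (for example just below and just above 1.0) can truncate to different integers even though they are very close.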

ivansabik commented 8 years ago

This issue is really old, but I run into this a lot too. In my case I often use Excel to edit the new file I want to compare against (for instance, I remove columns so that it matches the other one). Our friend Excel likes adding .00 (or removing it, I'm not sure exactly which right now), and since the comparison is currently done on strings/text, those fields are reported as modified. I'm still working on adding another feature I use a lot (xlsx export), but once I'm done with that I might take a look at this one.
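
For the Excel case, where the only difference is a trailing `.00`, an exact numeric comparison may be enough and no tolerance is needed. The sketch below uses `decimal.Decimal` from the Python standard library, which compares by value, so `"42.00"` and `"42"` are equal. The function name `same_value` is hypothetical and not part of csvdiff.

```python
from decimal import Decimal, InvalidOperation


def same_value(a: str, b: str) -> bool:
    """Return True if two CSV field strings are identical or numerically equal.

    Hypothetical helper for illustration; not part of csvdiff.
    """
    # Identical strings always match, numeric or not.
    if a == b:
        return True
    try:
        # Decimal compares by numeric value, ignoring trailing zeros.
        return Decimal(a.strip()) == Decimal(b.strip())
    except InvalidOperation:
        # At least one field is not a valid decimal number.
        return False
```

Non-numeric fields still fall back to plain string comparison, so text columns behave as they do today.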