andreyvit / json-diff

Structural diff for JSON files
MIT License

Floats with zeros after decimal point and integers are treated as identical #114

Open NateEag opened 1 year ago

NateEag commented 1 year ago

This isn't really a bug, but it is a behavior that caught me off-guard, so I thought it was worth mentioning. The right resolution for this may well be "Works as intended."

I recently learned the hard way that two blobs of raw JSON can be different but be reported as identical by json-diff. A minimal reproduction is below:

git clone git@github.com:andreyvit/json-diff.git
cd json-diff
npm ci
echo '{"amount": 42.0}' > old.json
echo '{"amount": 42}' > new.json
# No output from the below
./bin/json-diff.js old.json new.json

Semantically, 42.0 === 42, so this is arguably the Right Thing.

The two documents are not precisely identical, however.

In my case, I was comparing them because I had two documents that were returning different SHA256 hashes and I wanted to see how they differed.

Being told they were identical by json-diff threw me off for a moment.

I happened to already know they had different lengths, so it didn't take me long to figure out what was happening.

I could easily see myself having been thrown off badly if I'd had a little less information before I used json-diff on my documents.

As I said at the start - this isn't exactly a genuine bug, but it really surprised me, so I thought I'd mention it.
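
For anyone who wants to reproduce the mismatch without cloning the repo, here is a minimal Node.js sketch. It does not use json-diff at all; it only assumes that a structural diff works on parsed values, and the `sha256` helper is a hypothetical name introduced for illustration.

```javascript
const { createHash } = require('node:crypto');

const oldText = '{"amount": 42.0}';
const newText = '{"amount": 42}';

// Hypothetical helper: hex-encoded SHA-256 of the raw text, byte for byte.
function sha256(text) {
  return createHash('sha256').update(text, 'utf8').digest('hex');
}

// After parsing, both documents hold the same JavaScript number.
const sameValue = JSON.parse(oldText).amount === JSON.parse(newText).amount;

// The raw bytes differ, so the hashes differ too.
const sameHash = sha256(oldText) === sha256(newText);

console.log({ sameValue, sameHash }); // { sameValue: true, sameHash: false }
```

Any comparison made after `JSON.parse()` sees equal values here, while a comparison of the raw bytes does not.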

NateEag commented 1 year ago

Additional note: while NodeJS's JSON.parse() yields identical results for the two above documents, some tools do not, such as PHP 8.1's json_decode:

<?php

$old = json_decode('{"amount": 42.0}');
$new = json_decode('{"amount": 42}');

if ($old->amount === $new->amount) {
    echo "equal\n";
} else {
    echo "not equal\n";
}
// outputs 'not equal'

That's because, unlike JS, PHP has distinct float and integer types, and it parses the two fields as two different types depending on whether a decimal point is present.

That may be the Wrong Thing (as is true of so much of PHP), but it's a relevant point, since it is a widely-deployed web programming language that does not treat the sample documents as semantically identical.
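
If the lexical form of numbers matters, one possible workaround on the JS side is to compare the raw number literals before parsing. The sketch below is only an illustration under stated assumptions: `numberLiterals` is a hypothetical helper, not part of json-diff, and it assumes its input is already valid JSON.

```javascript
// Hypothetical helper: collect the raw number literals from JSON text,
// in document order. Assumes the input is valid JSON.
function numberLiterals(text) {
  const literals = [];
  // Match either a whole JSON string (so digits inside strings are skipped)
  // or a JSON number literal.
  const token = /"(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?/g;
  for (const match of text.matchAll(token)) {
    if (!match[0].startsWith('"')) {
      literals.push(match[0]);
    }
  }
  return literals;
}

const oldLiterals = numberLiterals('{"amount": 42.0}');
const newLiterals = numberLiterals('{"amount": 42}');

console.log(oldLiterals, newLiterals); // [ '42.0' ] [ '42' ]
console.log(oldLiterals.join() === newLiterals.join()); // false
```

This only flags that the spellings differ; it does not say which key the number belongs to, so it is a pre-check rather than a replacement for a structural diff.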

sora-blue commented 1 year ago

A similar thing happens with numbers that are too large, e.g. 282231934395265024. After running json-diff, it is rounded down to 282231934395265020 in the output. I suppose this is because the precision of floating-point numbers in JavaScript is limited?
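
The large-number case can be reproduced in plain Node.js, independent of json-diff. This is a small sketch of what `JSON.parse()` and number-to-string conversion do with that literal; the `BigInt` line is included only to show one way of keeping every digit, not something json-diff is known to do.

```javascript
const raw = '282231934395265024';
const parsed = JSON.parse(raw);

// The value is above Number.MAX_SAFE_INTEGER (2^53 - 1).
console.log(Number.isSafeInteger(parsed)); // false

// Converting back to text gives the shortest decimal that round-trips,
// which is not the original spelling.
console.log(String(parsed)); // 282231934395265020
console.log(String(parsed) === raw); // false

// BigInt keeps every digit of the literal.
console.log(BigInt(raw).toString() === raw); // true
```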