xlwings / jsondiff

Diff JSON and JSON-like structures in Python
MIT License
688 stars 86 forks source link

Broken result for diff('[{"hi": "wombat"}]','[]') #83

Open shoffmeister opened 1 week ago

shoffmeister commented 1 week ago

Running

import jsondiff
jsondiff.diff('[{"hi": "wombat"}]','[]')

yields

'[]'

which is unexpected - there clearly is a difference.

Also consider that

jsondiff.diff('[]', '[{"hi": "wombat"}]')

i.e. with the arguments simply swapped, yields

'[{"hi": "wombat"}]'

which is what I would expect from that diff.

corytodd commented 1 week ago

Interesting case for the default compact mode. Symmetric and explicit both work as expected. I wonder why we handle the zero similarity case as a replacement.

(Pdb) locals()
{'self': <jsondiff.CompactJsonDiffSyntax object at 0x10170aa00>, 'a': [{'hi': 'wombat'}], 'b': [], 's': 0.0, 'inserted': [], 'changed': {}, 'deleted': [(0, {'hi': 'wombat'})]}
(Pdb) l
241             :param changed: Elements changed in the original list.
242             :param deleted: Elements deleted from the original list.
243             :return: A dictionary representing the changes in a compact form.
244             """
245             if s == 0.0:
246  ->             return {replace: b} if isinstance(b, dict) else b
247             elif s == 1.0 and not (inserted or changed or deleted):
248                 return {}
249             else:
250                 d = changed
251                 if inserted:
shoffmeister commented 5 days ago

Interesting!

With respect to "explicit", I gather you refer to

import jsondiff
jsondiff.diff('[{"hi": "wombat"}]','[]', syntax="explicit")

?

For me, this also returns the unexpected []

I can confirm that

import jsondiff
jsondiff.diff('[{"hi": "wombat"}]','[]', syntax="symmetric")

returns the changed data.

corytodd commented 5 days ago

Sample here https://github.com/xlwings/jsondiff/compare/master...corytodd/issue-83

According to the tests, specifically this one, your example is returning the correct output. I agree that the result is unintuitive and I'm not sure of a good answer.

(venv) ➜  jsondiff git:(master) ✗ python3 -m pytest -k "TestSpecificIssue"
================================== test session starts ==================================
configfile: pyproject.toml
plugins: hypothesis-6.112.1
collected 30 items / 18 deselected / 12 selected                                        

tests/test_jsondiff.py ........F...                                               [100%]

======================================= FAILURES ========================================
________________________ TestSpecificIssue.test_issue[issue83_1] ________________________

self = <tests.test_jsondiff.TestSpecificIssue object at 0x105bc0790>
a = [{'hi': 'wombat'}], b = [], syntax = 'compact', expected = {delete: [0]}

    def test_issue(self,  a, b, syntax, expected):
        actual = diff(a, b, syntax=syntax)
>       assert actual == expected
E       assert [] == {delete: [0]}
E         
E         Use -v to get more diff

tests/test_jsondiff.py:185: AssertionError
================================ short test summary info ================================
FAILED tests/test_jsondiff.py::TestSpecificIssue::test_issue[issue83_1] - assert [] == {delete: [0]}
====================== 1 failed, 11 passed, 18 deselected in 0.27s ======================
shoffmeister commented 4 days ago

Just so that we are aligned:

❯ python -c "import jsondiff; print(jsondiff.diff('[{\"hi\": \"wombat\"}]','[]', syntax='symmetric'))"
['[{"hi": "wombat"}]', '[]']

❯ python -c "import jsondiff; print(jsondiff.diff('[{\"hi\": \"wombat\"}]','[]', syntax='compact'))"
[]

❯ python -c "import jsondiff; print(jsondiff.diff('[{\"hi\": \"wombat\"}]','[]', syntax='explicit'))"
[]

This is with the latest Python release (3.12.6) the latest jsondiff (2.2.1) on Arch Linux.

corytodd commented 4 days ago

Did you mean to use strings as your inputs? That would explain why symmetric and explicit are different for you.

corytodd@dev ~/code/jsondiff git:i83 ?1 ~1
> git rev-parse HEAD
31140ba8af02d068a5cca6cd123848a89dde1190
corytodd@dev ~/code/jsondiff git:i83 ?1 ~1
> git diff
diff --git a/jsondiff/__init__.py b/jsondiff/__init__.py
index 981ba7b..8083fbc 100644
--- a/jsondiff/__init__.py
+++ b/jsondiff/__init__.py
@@ -958,6 +958,7 @@ class JsonDiffer:
         :param fp: Optional file pointer to dump the diff to.
         :param exclude_paths: Optional list of string paths to exclude from the diff.
         """
+        print(f"diff: type(a)={type(a)}, type(b)={type(b)}, {self.options.syntax.__class__.__name__}")
         if not exclude_paths:
             exclude_paths = []
         if self.options.load:
corytodd@dev ~/code/jsondiff git:i83 ?1 ~1
> python -c "import jsondiff; print(jsondiff.diff('[{\"hi\": \"wombat\"}]','[]', syntax='symmetric'))"
diff: type(a)=<class 'str'>, type(b)=<class 'str'>, SymmetricJsonDiffSyntax
['[{"hi": "wombat"}]', '[]']
corytodd@dev ~/code/jsondiff git:i83 ?1 ~1
> python -c "import jsondiff; print(jsondiff.diff([{\"hi\": \"wombat\"}],[], syntax='symmetric'))"
diff: type(a)=<class 'list'>, type(b)=<class 'list'>, SymmetricJsonDiffSyntax
{delete: [(0, {'hi': 'wombat'})]}
shoffmeister commented 4 days ago

Ah, sorry, for the noise!

import jsondiff

print(jsondiff.diff([{"hi": "wombat"}], [], syntax="symmetric"))
print(jsondiff.diff([{"hi": "wombat"}], [], syntax="explicit"))
print(jsondiff.diff([{"hi": "wombat"}], [], syntax="compact")) # whoops, [] as the diff

behaves exactly as you outlined in your initial response.