avian2 / jsonmerge

Merge a series of JSON documents.
MIT License
214 stars 25 forks source link

jsonmerge fails assertion on nan float values #39

Closed mwes closed 5 years ago

mwes commented 5 years ago

Python's default JSON encoder allows for the encoding of extended values like nan, -inf, and inf. We have an external process that is recording float nan values as part of a measurement error. This is different from indicating that a value was not recorded, so we unfortunately can't just substitute null or None in this case.

When running these values through jsonmerge, we hit an assertion failure as part of the resolution process. This is because python correctly interprets float('nan') == float('nan') to be false, so the assert fails: https://github.com/avian2/jsonmerge/blob/master/jsonmerge/__init__.py#L117

Per IEEE 754 - python does not allow nan float values to be considered equal.

I would like to request an extension to the merge process that allows for the optional comparison of nan values and to consider them equal (e.g. math.isnan(base.val) == math.isnan(resolved). While not a normal/expected behavior, this is currently preventing us from using this (very useful!) library without improperly modifying our measurement data.

I suggest this be allowable as a flag for callers to override the comparison logic to consider nan values equal. In our specific case, if we see two JSON documents with nan values for the same key, we do want to consider them equal (vs. hard fail on the assert).

I would be happy to contribute to a PR on this and would be interested in hearing other thoughts on this. A simple test case to reproduce is below. Thanks!

from jsonmerge import merge

base = { 
    "foo": 1,
    "bar": float('nan')
}

head = { 
    "foo": 1,
    "bar": float('nan')
}

result = merge(base, head)
Traceback (most recent call last):
  File "json_merge_test.py", line 13, in <module>
    result = merge(base, head)
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 346, in merge
    return merger.merge(base, head)
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 301, in merge
    return walk.descend(schema, base, head, meta).val
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 78, in descend
    rv = self.work(strategy, schema, *args, **opts)
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 123, in work
    rv = strategy.merge(self, base, head, schema, meta, objclass_menu=self.merger.objclass_menu, **kwargs)
  File "py3/lib/python3.6/site-packages/jsonmerge/strategies.py", line 270, in merge
    base[k] = walk.descend(subschema, base.get(k), v, meta)
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 78, in descend
    rv = self.work(strategy, schema, *args, **opts)
  File "py3/lib/python3.6/site-packages/jsonmerge/__init__.py", line 117, in work
    assert base.val == resolved
avian2 commented 5 years ago

Hi. Interesting edge case you found! Thanks for reporting it.

I think the solution is simpler than what you propose. The code doesn't actually need to do arithmetic comparisons, so there is no need to make this configurable. The assertion failures you see come from some code that checks if internally some reference book keeping is consistent (i.e. if two things reference the same object). Therefore using the "==" operator there was wrong. It should be "is", which checks if actual objects are identical.

Can you check if the latest code in master works for you?

mwes commented 5 years ago

Hi, thanks for the quick turn! Your fix appears to work. I agree, your solution is more elegant. Please feel free to close this issue at your convenience. Do you think you will make a new release to pypi with this fix? Thanks again!

avian2 commented 5 years ago

I released jsonmerge 1.6.1 today that contains the fix for this issue.

mwes commented 5 years ago

Great, thanks!