Closed patrickliechty closed 9 years ago
Only one property with value JONESS was changed, yet the whole object was marked for removal and replacement.
I would expect a removal of just the one property and the add for the new property. Why is it not this granular? Is this specified in the spec? Or has the Jiff library not implemented this level of granularity?
@wwwpol Diffing arrays of objects is tricky for many reasons. One of those is that the objects can move around in the array without being changed, or they could remain in place and have their properties modified, etc. So, the diff algorithm needs a way to recognize objects in arrays that are the "same" but which may have changed properties or may have moved around.
For flexibility, jiff doesn't assume it can use a property named id
to do that, rather it allows you to provide a function. When dealing with arrays, you can provide a hash function that identifies "same" items. Then, you should see a diff with only the small change you expected.
The docs for how to specify the hash function are in the README. Please give that a try and let me know how it goes.
The Java version of Json Patch is doing this and creating small diffs without any developer intervention. Do you plan on following what they are doing?
Here is the link to their code:
https://github.com/fge/json-patch/tree/master/src/main/java/com/github/fge/jsonpatch/diff
Back to your code. In the hash function, how do I compare to see if it matches? There is only one parameter passed to the hash function. It appears you are walking the object structure. I would expect an a and b params that I can use to compare. How should I compare?
Another related question. Why is the diff generating a "test" operation?
I see that the "test" operation must be generated to allow it to be inverted. Can I turn that off?
Do you plan on following what they are doing?
The Java version relies on hasCode
which is effectively a primitive part of the Java language. JavaScript has no such primitive facility for object identity. We could, for example, impose a restriction that any object you run through jiff must have a hashCode
(or similar) method. Unfortunately, that would be impractical since it wouldn't survive a typical JSON.stringify/parse, and jiff is intended to handle plain objects and json data.
In the hash function, how do I compare to see if it matches?
The hash function is not a compare or equals function. It should return a string or number that uniquely identifies the thing that was passed to it, much like Java's hashCode
. Jiff will use this value internally to do identity comparisons.
I see that the "test" operation must be generated to allow it to be inverted. Can I turn that off?
There's no way to turn it off in the current version. Initially, we decided to err on the side of safety and allowing inversion.
I'm totally open to adding an option to disable it, though.
Thanks for the responses. Interestingly enough, if I just return false from the hash function, I get the small diff I wanted.
I would like the option to remove the test. We don't have plans to invert what we have done. Our main goal is to produce smaller http requests when a user updates their data.
I found another issue. I did a diff and got this result:
[{"op":"test","path":"/images/0/entries/0/fields/2/content","value":"Liechty"},{"op":"replace","path":"/images/0/entries/0/fields/2/content","value":"Jones"}]
Jones was the old value and Liechty is the new value. It is backwards. You should be doing the replace operation with the new value.
I added the jshash library to generate a unique hash in the hash function. It seems to work well.
I found the jshash here: http://pajhome.org.uk/crypt/md5/
Thanks for the updates @wwwpol. I'll try to hit a few of your recent points.
if I just return false from the hash function, I get the small diff I wanted.
This may seem to be doing the right thing, but if you're diffing an array of objects it's likely just working by luck. A hash function that returns the same value for everything means "every object in the array is the same object", which probably isn't what you intended. More info on hash functions below ...
I would like the option to remove the test. We don't have plans to invert what we have done
Sure thing. Seems like a good option. I opened another issue for that, see #27. I may be able to do it this weekend.
I found another issue
Jiff has quite an extensive test suite, including the official test suite provided by the JSON Patch authors. We've not seen the issue you described, but if you can provide a minimal test case that reproduces the problem, I'll be happy to dig into it.
I added the jshash library to generate a unique hash in the hash function
Cool, looks like an interesting library. Thanks for the link.
The key to jiff's hash function is ensuring that it returns the same value for objects, which in your application domain, should be considered as the same object, even if some of their properties differ. So, as a simple example, if all your objects have an integer id
field, a simple and very reliable hash function would be: function(x) { return x.id };
.
An example of a (likely) bad hash function for that scenario (objects with an id
) is JSON.stringify
, since it would return something different for two objects whose id
fields match, but whose other properties differ.
@wwwpol See #28. It adds an option to turn off the extra test ops.
Excellent, thanks
I was passing the parameters in the wrong order. That is why the values were backwards. My bad.
Thank you for your efforts on this library. It is appreciated!
Thanks, @wwwpol, and I appreciate your feedback.
I just merged #28, and I'll get a release out tonight. I feel like that, plus the hash function, addresses the bulk of concern about large diffs for now. Can we close this issue, or are there other things related to patch size you'd like to discuss?
When I get the release tonight, this will resolve this issue. Thanks again.
@wwwpol 0.7.1 just landed :)
Awesome, thanks.
Closing this, but please feel free to discuss more, or reopen if you need.
I am changing one property in a larger set of JSON and the diff is just as large as the original JSON. Here is the original JSON:
{"images":[{"id":"bdb71c44-f980-499b-abf5-5aed88a04625","entries":[{"templateProperties":{"content":"Other","contentURI":"https://beta.familysearch.org/indexing-service/template/templates/record?name=record.template.431","reviewState":"agree"},"fields":[{"label":"RELATIONSHIP_2","content":" Brother-in-Law","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_GN","content":"BOB","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_SURN","content":"JONESS","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_TITLE_TERMS","content":"","contentURI":"","reviewState":"agree","previousContent":""}]},{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"asdf","previousContent":"","reviewState":"agree"}]}},{"id":"80a2b121-87b6-46d6-8ea6-c5fe971943f3","entries":[{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"","previousContent":"","reviewState":"agree"}]}}],"userId":"28e8e455-e070-4fa1-b6f7-35098f5ae18d"}
Updated JSON:
{"images":[{"id":"bdb71c44-f980-499b-abf5-5aed88a04625","entries":[{"templateProperties":{"content":"Other","contentURI":"https://beta.familysearch.org/indexing-service/template/templates/record?name=record.template.431","reviewState":"agree"},"fields":[{"label":"RELATIONSHIP_2","content":" Brother-in-Law","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_GN","content":"BOB","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_SURN","content":"JONESS","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_TITLE_TERMS","content":"","contentURI":"","reviewState":"agree","previousContent":""}]},{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"asdf","previousContent":"","reviewState":"agree"}]}},{"id":"80a2b121-87b6-46d6-8ea6-c5fe971943f3","entries":[{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"","previousContent":"","reviewState":"agree"}]}}],"userId":"28e8e455-e070-4fa1-b6f7-35098f5ae18d"}
The DIFF:
[{"op":"add","path":"/images/0","value":{"id":"bdb71c44-f980-499b-abf5-5aed88a04625","entries":[{"templateProperties":{"content":"Other","contentURI":"https://beta.familysearch.org/indexing-service/template/templates/record?name=record.template.431","reviewState":"agree"},"fields":[{"label":"RELATIONSHIP_2","content":" Brother-in-Law","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_GN","content":"BOB","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_SURN","content":"JONES","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_TITLE_TERMS","content":"","contentURI":"","reviewState":"agree","previousContent":""}]},{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"asdf","previousContent":"","reviewState":"agree"}]}}},{"op":"test","path":"/images/1","value":{"id":"bdb71c44-f980-499b-abf5-5aed88a04625","entries":[{"templateProperties":{"content":"Other","contentURI":"https://beta.familysearch.org/indexing-service/template/templates/record?name=record.template.431","reviewState":"agree"},"fields":[{"label":"RELATIONSHIP_2","content":" Brother-in-Law","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_GN","content":"BOB","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_SURN","content":"JONESS","contentURI":"","reviewState":"agree","previousContent":""},{"label":"RELATIVE_TITLE_TERMS","content":"","contentURI":"","reviewState":"agree","previousContent":""}]},{"templateProperties":{"content":"","contentURI":""},"fields":[]}],"header":{"fields":[{"label":"Image Type","content":"","previousContent":"","reviewState":"agree"},{"label":"Duplicate Image","content":"","previousContent":"","reviewState":"agree"},{"label":"2ndheader","content":"asdf","previousContent":"","reviewState":"agree"}]}}},{"op":"remove","path":"/images/1"}]