java-json-tools / json-patch

An RFC 6902 (JSON Patch) and reverse, plus RFC 7386 (JSON Merge Patch), implementation in Java using Jackson (2.2.x)
Other
638 stars 186 forks source link

patch with multiple deletes causes unexpected results or exception #11

Closed royclarkson closed 10 years ago

royclarkson commented 10 years ago

Consider the following source:

[{"name":"a"},{"name":"b"},{"name":"c"}]

And the following target:

[{"name":"b"}]

JsonDiff produces the following patch:

[{"op":"remove","path":"/0"},{"op":"remove","path":"/2"}]

However, if you apply that patch to the original source, you get an exception:

com.github.fge.jsonpatch.JsonPatchException: no such path in target JSON document
com.github.fge.jsonpatch.RemoveOperation.apply(RemoveOperation.java:58)
com.github.fge.jsonpatch.JsonPatch.apply(JsonPatch.java:155)
com.github.fge.jsonpatch.JsonPatch.apply(JsonPatch.java:145)

The issue appears to be that deletes are applied sequentially. The path at index 0 is deleted, but then the resulting array only has length 2. Operating on the shortened array means there is now no path at index 2 to delete. For example the following patch will not throw an exception, however the first and THIRD nodes are deleted, not the first two as might be expected.

[{"op":"remove","path":"/0"},{"op":"remove","path":"/1"}]
fge commented 10 years ago

Ah, OK. This is a clear bug and part of the code I understand the less :/

I have done some changes in that code from the base implementation so maybe one of my changes are the cause here; @rwatler, ping! If you know where this can come from...

rwatler commented 10 years ago

Let me take a look at this case. I'll see if I cannot get JSON Diff to generate the correct input for JSON Patch.

Randy

royclarkson commented 10 years ago

Thanks for the quick response. We've been experimenting with this library for a couple weeks. It appears to be one of the few (if only) java JSON PATCH libraries. It's great to see the work put into already. @briancavalier has been looking at some different javascript based libraries and there is some inconsistency with how multiple deletes are handled between those. He can probably provide more details about that.

rwatler commented 10 years ago

No problem. I can confirm that the original version of the JSON diff algorithm back in version 1.2 does indeed generate the correct/expected patch. However, as @fge has indicated, there has been some rather heavy refactoring of the code since. This makes sense also because we have been using one of the older versions and have not run into this problem.

Now, we need to get the 1.6 version working correctly so that we too can upgrade!

fge commented 10 years ago

@rwatler so, 1.2 is OK? Good, I'll try and bisect from there and see where things went wrong...

fge commented 10 years ago

OK, I have a problem, I can't do it right now, I need to fix a bug in one of my other projects; this bug is #2 on my priority list. Unless @rwatler finds the culprit commit before me (which is likely)!

briancavalier commented 10 years ago

@fge and @rwatler Thanks for looking into this! As @royclarkson mentioned, he and I have been working on client and server communication using JSON Patch. I've been focused on the JavaScript client side, and have definitely seen a good bit of discrepancy wrt "remove" ops across the existing JS JSON Patch libs :(

For example, our current JS implementation's diff function generates a patch similar to the one json-patch generates, ie from @royclarkson's example, altho reversed:

[{"op":"remove","path":"/2"},{"op":"remove","path":"/0"}]

Some other libs generate a patch like:

[{"op":"remove","path":"/0"},{"op":"remove","path":"/1"}]

Which, while certainly possible to apply incrementally, just seems weird :) IOW, it makes sense to a program, but not to a human (at least, not to this human!).

Previously, I had enabled support for move operations in our lib (I disabled them in the short term, because we don't need them atm), and it would generate a patch like this:

[{"op":"remove","path":"/2"},{"op":"remove","path":"/0"},{"op":"move","path":"/0","from":"/1"}]

Note the additional move. That seems like overkill, though.

I figure you've had to think through these issues as well, so I just wanted to see what your thoughts are on the various ways to represent a delete. The RFC seems to only provide very minimal guidance, saying that the patch must be able to be processed in order.

Thanks!

fge commented 10 years ago

@briancavalier the problem is mainly that JSON Patch does not define "diff"; it will only ever be an addition at best. I did a first, very, very naive implementation and @rwatler came with a "factorizing" implementation. And I have managed to botch it.

I am not sure there exists a "canonical" diff implementation to be honest; for objects, there may be, but for arrays, we of course have the problem that indices may change as we remove or add elements into it.

The current (and buggy, thanks to me) implementation for arrays uses an LCS algorithm (Longest Common Subsequence) and works from there to determine what can be factorized between moves/deletions/etc in the whole target JSON text. At first, of course, my goal will be to fix that.

But afterwards, determining a "canonical" diff representation? Not sure whether that is possible. In the example mentioned above, we can either generate patch:

[ { "op": "remove", "path": "/0" }, { "op": "remove", "path": "/1" } ]

or:

[ { "op": "remove", "path": "/2" }, { "op": "remove", "path": "/0" } ]

depending on whether we start from the end of the array or the beginning. And even then there is the case of removing elements in the middle of an array etc, which obviously has an influence on all further operations altering elements after the removed elements (index changes).

fge commented 10 years ago

OK, found the culprit commit via bisect:

755516df9ba3a7fa15c00241fca61c2e4bdbfe43

And of course, a lot has happened since then :/

Uhwell, I guess it's "revert all the way" to JsonDiff as it existed at this commit~1.

fge commented 10 years ago

OK, I have reverted to what I said, basically, and the added test works now. @royclarkson, do you have the means to test by forking/building?

Note: the generated patch for the pair mentioned in this issue is now:

[ { "op": "remove", "path": "/0" }, { "op": "remove", "path": "/1" } ]

which works as intended.

royclarkson commented 10 years ago

I can do that. I'll pull the code and test later today. Thanks!

briancavalier commented 10 years ago

the problem is mainly that JSON Patch does not define "diff"

Yeah, exactly. The definition of "valid diff" is basically "produces expected results when fed through the algorithm described in rfc6902".

Thanks for digging into this, and for the super-fast turn-around!

rwatler commented 10 years ago

@fge: do you want me to attempt a fix on your refactored version still?

fge commented 10 years ago

@rwatler if you can spare the time I'd appreciate it! Also, I believe more tests need to be written... I have trouble isolating the "elementary" scenarios to test.

rwatler commented 10 years ago

@fge: I have a fix for the refactored code. How do you want to proceed here? I can create a pull request that reestablishes the refactored code in one commit and the fix in another. Is that how you'd like to proceed?

Let me know,

Randy

fge commented 10 years ago

@rwatler I have just created a branch named refactored where I revert f176df0, can you apply the fix over this branch?

In the meanwhile I'm going to have a look at what I did wrong by looking at your diff ;)

fge commented 10 years ago

OK, thanks to @rwatler for spotting the bug in my refactoring.

@royclarkson may I ask some more work from you? There is a refactored branch which contains the fix over the new code, can you test with that as well?

In the meanwhile I'm going to add more tests...

fge commented 10 years ago

@royclarkson, @briancavalier: any news?

royclarkson commented 10 years ago

@fge, sorry for the delay. I've done some simple tests, and the changes appear to be working in the refactored branch. I put together a basic app for testing. You can see the examples in the readme. Thanks!

https://github.com/royclarkson/test-json-patch

fge commented 10 years ago

@royclarkson OK, thanks a lot! For reporting the bug in the first place and for testing.

Thanks @rwatler for noticing the bug in my refactoring and for the explanation of where I went wrong.

I'll release 1.7 immediately with this bug fix, it is important enough.

briancavalier commented 10 years ago

@fge I've done a few limited tests as well, and it's been working great so far.

fge commented 10 years ago

@briancavalier OK, good to hear :)

1.7 has just been pushed to maven; it should be available soon.

@royclarkson I have noticed one thing in your test webapp: you use a plain ObjectMapper; you should be careful with that since it will not exactly do the right job with numeric instances. By default, floating point values are deserialized as doubles. In fact, this is not much of a problem if the data is serialized with JavaScript in the first place since it can't do better than that, but if the JSON is sent as text, you may have surprises in test operations; I use DeserializationFeature.USE_BIGDECIMAL_FOR_FLOATS in JsonPatch.fromJson().

royclarkson commented 10 years ago

Ah, I see. Thanks for the warning. I'm actually not using that ObjectMapper in the test app. I added it before I realized I didn't need it. But.. internally to Spring MVC, I'm not sure how the ObjectMapper is configured for deserialization. I'll keep that in mind if we encounter any issues.

fge commented 10 years ago

@royclarkson OK, I was a little too tired. .fromJson() obviously doesn't do that. But all my mappers are configured using it.

Anyway, can you tell me when you've pulled 1.7 so that the issue can be closed?

royclarkson commented 10 years ago

I've updated to 1.7 and it's working as expected. Thanks again!