API V2: rewritePathsAsURIFragments doesn't (always?) work for deleted elements

When sending a Json Patch reply to a client, we sometimes rewrite the result path as an EMF-Like path (When the client requests this format).

However, Json Paths may represent objects that no longer exist (e.g. deleted index), which can't be properly rewritten as EMF URIs (Since deleted objects no longer have a URI in EMF). In that case, we get an exception. For example, after Undoing the creation of a new element):

58844 [JettyServerThreadPool-24] ERROR org.eclipse.emfcloud.modelserver.emf.common.SingleThreadModelController [] - Execution Exception
java.util.concurrent.ExecutionException: org.eclipse.emf.common.util.BasicEList$BasicIndexOutOfBoundsException: index=4, size=4
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
    at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.common.SingleThreadModelController.waitComplete(SingleThreadModelController.java:225) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.SingleThreadModelController.runAndWait(SingleThreadModelController.java:219) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.SingleThreadModelController.undo(SingleThreadModelController.java:175) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.ModelServerRoutingDelegate.lambda$8(ModelServerRoutingDelegate.java:151) ~[classes/:?]
    at java.util.Optional.ifPresentOrElse(Optional.java:196) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.common.ModelURIConverter.withResolvedModelURI(ModelURIConverter.java:100) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.ModelURIConverter.withResolvedModelURI(ModelURIConverter.java:154) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.ModelServerRoutingDelegate.undoCommand(ModelServerRoutingDelegate.java:151) ~[classes/:?]
    at io.javalin.core.security.SecurityUtil.noopAccessManager(SecurityUtil.kt:20) ~[javalin-4.3.0.jar:4.3.0]
    [...]
Caused by: org.eclipse.emf.common.util.BasicEList$BasicIndexOutOfBoundsException: index=4, size=4
    at org.eclipse.emf.common.util.BasicEList.get(BasicEList.java:346) ~[org.eclipse.emf.common_2.24.0.v20220123-0838.jar:?]
    at org.eclipse.emfcloud.modelserver.common.patch.AbstractJsonPatchHelper.getValueFromList(AbstractJsonPatchHelper.java:576) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.common.patch.AbstractJsonPatchHelper.getSettingFromJsonPointer(AbstractJsonPatchHelper.java:484) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.util.JsonPatchHelper.getSetting(JsonPatchHelper.java:50) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.util.JsonPatchHelper.toURIFragmentPath(JsonPatchHelper.java:258) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.util.JsonPatchHelper.lambda$1(JsonPatchHelper.java:241) ~[classes/:?]
    at java.util.Optional.map(Optional.java:260) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.util.JsonPatchHelper.toURIFragmentPath(JsonPatchHelper.java:240) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.resolvePathAsURIFragment(DefaultSessionController.java:303) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.rewritePathsAsURIFragments(DefaultSessionController.java:294) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.lambda$4(DefaultSessionController.java:277) ~[classes/:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) ~[?:?]
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:179) ~[?:?]
    at java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3573) ~[?:?]
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) ~[?:?]
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) ~[?:?]
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.broadcastIncrementalUpdatesV2(DefaultSessionController.java:275) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.broadcastIncrementalUpdatesV2(DefaultSessionController.java:200) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultSessionController.commandExecuted(DefaultSessionController.java:176) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelController.lambda$1(DefaultModelController.java:273) ~[classes/:?]
    at java.util.Optional.ifPresentOrElse(Optional.java:196) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelController.lambda$0(DefaultModelController.java:261) ~[classes/:?]
    at java.util.Optional.ifPresentOrElse(Optional.java:196) ~[?:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelController.withModel(DefaultModelController.java:413) ~[classes/:?]
    at org.eclipse.emfcloud.modelserver.emf.common.DefaultModelController.undo(DefaultModelController.java:260) ~[classes/:?]
    [...]

The Json Path to rewrite:

{"op":"remove","path":"/components/4"}

Regarding rewritePathsAsURIFragments, I'm not sure how to properly handle deleted objects, where Json Paths and EMF URIs are not equivalent. Do we want to keep the original Json Path in that case? Or do we want something more complex, that would retrieve the former URI of the object that was deleted? (Which may not be possible at the time where the Json Path is being rewritten, because the object is already deleted and no longer has a URI; so we might need to derive the URI while reverting the transaction).

@cdamus WDYT?

If an object was deleted, then we shouldn't have any patch operation making changes to it. Changes to a deleted object are irrelevant; all we need to know is that it was deleted from its containing object, and that containing object is still in the resource and has a valid path and URI.

In fact, wouldn't a patch operation targeting a deleted object be inapplicable anyways and cause the patch to fail? Even with a JSON Pointer path. The object isn't there to apply the operation to it.

This isn't about patching elements that were deleted; this is about the patch that describes the changes that have been applied (Sent to subcribers)

So if you have a list of 3 elements, and delete the element at index 2, you end up with this patch:

{"op": "remove", "path":"elements/2"}

This describes the change; but when converting "elements/2" to an EMF URI after the patch has been applied, then this object no longer exists, and the conversion to an EMF URI fails (There is no object with index "2" anymore).

This is still a valid patch for the client (because it doesn't know how the model has changed just yet, so it still has an object at index 2), but not for the server (and it's also not valid for the method that rewrites Json Path to EMF URI, because it works with the updated model).

Oh, I see. Right, sorry.

Given that the client receiving the notification can be expected still to have a view of the model in the previous state (that still has the deleted object), it would be able to resolve the URI of the deleted object and, if that's what it chooses to do, apply the patch to delete it. I suppose the best solution for clients would be to track the URIs of objects that have been deleted to use them in this notification. But I'm not sure that we have a general way to do this without analyzing up-front the command or patch that causes an object to be deleted.

The client application that motivated this new subscription option in the first place would have difficult using a JSON Pointer even in this case; it really would need the URI because it tracks objects by ID assuming that they're globally unique. I expect other applications do the same.

But I'm not sure that we have a general way to do this without analyzing up-front the command or patch that causes an object to be deleted.

Yeah. In the general case, it might even be more complicated than that, for large patches that affect multiple objects: the path used by a Json patch Operation N refers to the intermediate state, resulting of the application of operation N-1. So from an empty list, you can specify e.g. [add /0, add /1, add /2, add /3] and that's a valid patch (even though none of these indices exist when creating the patch)

In that case, neither the initial model state nor the final model state if a valid referential for this patch.

We could even imagine weird cases such as:

[add /0, remove /0], which creates and immediately deletes an object, so it won't exist in the initial state nor in the final state (But in that case, the resulting patch obtained by diff'ing the initial and final model would be empty, so we probably don't have to worry about such a case).

Anyway: converting Json Path to EMF URI isn't trivial. We should at least make sure that it doesn't break. When we can't generate a valid EMF-like path, we need a generic fall back (and then the client can be responsible for fetching the entire model if it can't interpret the incremental patch, or something like that)

Yeah. In the general case, it might even be more complicated than that, for large patches that affect multiple objects: the path used by a Json patch Operation N refers to the intermediate state, resulting of the application of operation N-1. So from an empty list, you can specify e.g. [add /0, add /1, add /2, add /3] and that's a valid patch (even though none of these indices exist when creating the patch)

Right. In general, applications cannot assume that operations in a patch can be applied in arbitrary order. They are definitely an ordered sequence!

[add /0, remove /0], which creates and immediately deletes an object, so it won't exist in the initial state nor in the final state (But in that case, the resulting patch obtained by diff'ing the initial and final model would be empty, so we probably don't have to worry about such a case).

I would expect so, yes. This should have no trace in the delta notification.

In thinking about this problem more, I'm confused, again. In the original example:

{ "op": "remove", "path":"/components/4" }

a value is removed from the 4th slot in the components array of the root object (whose path is /). This object still exists in the model after the diff. In fact, every object that owns a property identified by the path of an operation in the patch (diff) still exists in the model. Otherwise, it couldn't appear in the diff. In this case, the EMF-ish URI-based path would be like

{ "op": "remove", "path": "file:/home/tester/models/resource.xmi#_0x7a57e5900d/components/4" }

where the EMF URI part of the path identifies the model root object.

A more deeply nested example makes it more clear. With a standard JSON pointer, this:

{ "op": "remove", "path":"/components/4/things/1/head" }

in which an object is removed from a single-valued feature head of a Thing inside a Component, might transform to:

{ "op": "remove", "path":"file:/home/tester/models/resource.xmi#_0x157a57y/head" }

where the EMF URI part of the path identifies a more deeply nested object than the root. But that object must still exist in the model in the "after" state because otherwise there wouldn't be operations describing changes of its properties; there would only be some operation that implies its own removal. This is only true because the diff-patch is created by a static comparison of the before and after states of the model; it has no knowledge of any intermediate states of the model that may have been described by the original edit-patch.

So, now I'm thinking that we were actually calculating the wrong "URI-ish paths" all along. That we were not properly converting only the "owner" portion of the path pointer to an EMFish URI, but sometimes took the whole path as the "URI", such as might seem to make sense for a remove operation.

In general, I don't think we can expect to see any patch computed by a static diff of the "before" and "after" states of the model manifest any of the following:

operation on an object that was subsequently deleted from the model
operation on an object that was previously added to the model
operation on any object whose position (JSON Pointer path) in the model tree is changed by insertion of some object ahead of it in the tree order

This is because the diff, being a static comparison of JSON documents, doesn't have any notion of object identity. There is no identity-based correspondence of objects in the "before" and "after" images of the model. If there is no object identity, then all JSON Pointers in the diff are in reference to objects in the "before" state and any changes that add/remove objects within the tree cascade to generate a horribly large diff covering the entire subtree following that location in the tree order. For example, the worst case is adding a new object in the first position of a list in the root object: all other objects show up in the diff.

Consider the Coffee.ecore test resource in the modelserver.emf.tests bundle. This contains an EPackage with a bunch of EClasses and EDataTypes in it. If I add a new EClass named "NewClass" with no other details at index zero of the package's eClassifiers list and do a diff of the before and after JSON documents, I get a diff of 534 lines length, attached here:

diff.json.txt

This diff literally indicates that the first EClass in the list had all of its structural features deleted and its name replaced by "NewClass" and that after this all of the other classifiers were reconfigured, replaced, removed, added by — frankly — ridiculously complex changes. As unworkable from a client perspective as this diff is, at least on the server side we can be sure that all of the path JSON pointers in it are in reference to objects that exist in the "before" state of the model, so we can use that to get the correct EMFish object-URI-plus-feature-name-and-index "paths" to solve the problem at hand.

eclipse-emfcloud / emfcloud-modelserver

API V2: rewritePathsAsURIFragments doesn't (always?) work for deleted elements #218