vitruv-tools / Vitruv

View-based Development and Model Consistency Framework
http://vitruv.tools
Eclipse Public License 1.0
14 stars 19 forks source link

Avoid duplicate resource creation in tests #330

Closed HeikoKlare closed 3 years ago

HeikoKlare commented 3 years ago

The current implementation of resource loading in tests leads to duplicate resources, because resourceSet.getResource creates a Resource even if it throws an exception. Calling createResource afterwards leads to two resources with the same URI in the ResourceSet. This is probably hidden by clearing the ResourceSet after each change propagation, such that resources are properly reloaded anyway, but still this is a bug and leads to problems when implementing a fallback propagation mechanism that does not clean the ResourceSet after each change propagation.

jGleitz commented 3 years ago

because resourceSet.getResource creates a Resource even if it throws an exception.

are you kidding me :open_mouth:? Sometimes EMF feels like it breaks sensible expectations on purpose :roll_eyes:.

jGleitz commented 3 years ago

Technically, this code is still not correct, because it does not handle a resource vanishing between the existence check and loading it (you should never check whether some shared resource exists before accessing it, but rather handle errors during access).

I guess the correct way to load an EMF resource would be (very intuitively!):

try {
    loadModelResource(modelPathWithinProject)
} catch (RuntimeException e) {
    if (e.cause instanceof FileNotFoundException) {
        unloadModelResource(modelPathWithinProject)
        createModelResource(modelPathWithinProject)
    } else throw e
}

right?

HeikoKlare commented 3 years ago

Actually, the solution that you propose is comparable to my first attempt of solving the problem. There are, however, two important things to distinguish, which are resource creation and resource loading. Creating a resource is about adding a Resource instance with a specific URI to a ResourceSet. Loading a resource is about deserializing a resource persistence from the file system. Finally, unloading is only about removing the loaded resource contents, not about removing a Resource from a ResourceSet.

When we talk about shared resources, we need to be sure which kind of shared resource we mean. You seem to talk about the files as the persistence of resources to be the shared resource (with resource having a different meaning than the EMF Resource). You would, however, need to lock the file system to avoid current modifications of that kind of resource. Additionally, these files only becomes changed by either the test (in the test ResourceSet), or by the V-SUM during change propagation. Such changes do, however, never occur at the same time. So there should be no risk of concurrent access to such a file. In the future, views should even not be persisted (at least not at the same place where the V-SUM file is placed) and then there is even no problem of concurrency between views and V-SUM. The other part of resource sharing is about concurrent access to the test ResourceSet. We can easily solve that by synchronizing the resourceAt method (or loadModelResource or whatever accesses it in a transaction) on the ResourceSet.

All in all, I do not understand the problem (or at least do not understand how to solve it), because your proposed solution does not seem to solve the problem either, as between loading, unloading and creation, changes in the file system can still be performed. In my opinion, we cannot exclude that concurrent changes at the file system can be made, except for the (implicit) protocol defined by the interaction between V-SUM and views that currently ensures it.

One further implementation option is to simply let EMF try to load the Resource (with resourceSet.getResource(..., true)), catch the exception and unload any inconsistent state in case of a failure, because this creates the resource and only fails during loading if not persistence at the resource URI exists. In my opinion this is, however, worse than the proposed solution, because you need to understand what EMF does during resource loading (and that is basically the same that we do in my proposed correction explicitly) instead of having it presented explicitly. It could look something like this (and you can even remove the first get and the create, but then have to extract the resource after the failing operation):

var resource = resourceSet.getResource(getPlatformModelUri(modelPathWithinProject), false)
if (resource === null) {
    resource = resourceSet.createResource(getPlatformModelUri(modelPathWithinProject))
    try {
        resourceSet.getResource(getPlatformModelUri(modelPathWithinProject), true)
    } catch (RuntimeException e) {
        resource.unload
    }
}

To be honest, I am not fully convinced of any of the discussed solutions yet. So, of course, I am still open for proposals.

HeikoKlare commented 3 years ago

because resourceSet.getResource creates a Resource even if it throws an exception.

are you kidding me 😮? Sometimes EMF feels like it breaks sensible expectations on purpose 🙄.

Well, I do not agree to that. If EMF would not create a resource if you set loadOnDemand to true, then you could not have expected the code you added before to work ;-) But maybe this becomes clearer with my previous comment on what resources actually are.

jGleitz commented 3 years ago

Wow, so much to untangle here. Let’s go.

we need to be sure which kind of shared resource we mean. […] You would, however, need to lock the file system to avoid current modifications of that kind of resource.

Let’s make sure to distinguish the term ‘resource’ (the English word and technical term) and Resource (this weird EMF thing). What I was talking about was a basic programming principle: Never check for the existence of a shared resource, but rather access the resource and then handle the case that it does not exist. Checking beforehand is an antipattern because you’ll need to handle the case that the resource vanishes anyway. So not checking but rather handling the error should always be prefered.

Such changes do, however, never occur at the same time. So there should be no risk of concurrent access to such a file.

That’s true for the current state of affairs. That’s why I said, ‘Technically, this code is still not correct’. It’s not an important issue, but still a good pattern to follow.

One further implementation option is to simply let EMF try to load the Resource (with resourceSet.getResource(..., true)), catch the exception and unload any inconsistent state in case of a failure, because this creates the resource and only fails during loading if not persistence at the resource URI exists.

Exactly! This is exactly what I wanted to propose with my comment. However, I see now that the createModelResource(modelPathWithinProject) in the catch clause is superfluous.

In my opinion this is, however, worse than the proposed solution, because you need to understand what EMF does during resource loading

Well yeah, if you have an unintuitive API, you are going to have unintuitive code.

If EMF would not create a resource if you set loadOnDemand to true, then you could not have expected the code you added before to work

Why? My code called createResource if loading the resource failed! I think it is fair to assume that an operation has no side effect if it fails for a predictable reason.

If I request a resource with the additional condition to load it from the file system if it does not yet exist, and the resource does not exist but can also not be loaded from the file system, I really don’t think creating an empty virtual Resource is a valid response. The thing I asked for is not possible, creating a virtual Resource is nothing I would expect. But hey, what does it matter, EMF is the way it is.


To conclude, I think this is the most sensible code:

val resourceUri = getPlatformModelUri(modelPathWithinProject)
return try {
    resourceSet.getResource(resourceUri, true)
} catch (RuntimeException e) {
    if (e.cause instanceof FileNotFoundException) {
        // EMF has created an empty resource after failing to load it from the file system
        resourceSet.getResource(resourceUri, false)
    }
}

From my understanding, we don’t even need to unload the resource. If that’s wrong, then we should also call unload in the catch clause, of course.

HeikoKlare commented 3 years ago

In short: I understand and agree to your points. Let's make the best out of the strange EMF behavior.

I have proposed a solution, which only slightly derives from our one:

  1. I have synchronized the method on the resourceSet to ensure that the whole method runs as a transactions on the ResourceSet (and because acces to a ResourceSet does not seem to be synchronized at all).
  2. I have replaced the check for the exception cause with a check of resource existence and added the creation of the resource in case if it does not exist. This is for two reasons: First, FileNotFoundException is only thrown for file URI. Using platform URIs a ResourceException is thrown, which is not accessible at all. Second, this seem to be the most safe implementation to me, because independent from the causing error and independent from the strange behavior of getResource (which does not run as a transaction, as you have pointed out), we should be able to deliver a proper Resource with that solution. I did not add an explicit unload because at least the loading seems to run as a transaction within EMF.
HeikoKlare commented 3 years ago

Since we seem to conclude and since I need these changes for properly proceeding with adaptations of the applications, I will merge them. If there are still points of criticism or discussion, feel free to add them and we can still solve them with further pull requests.