jsr107 / jsr107spec

JSR107 Cache Specification
Apache License 2.0

Refine cache resolution when annotations are used #341

Open jerrinot opened 8 years ago

jerrinot commented 8 years ago

Chapter 11.2.1 of the specification describes cache name resolution, but it does not say anything about type configuration.

Should a caching container/framework use a cache with type checking configured? This should be refined in the specification, as otherwise it may result in portability issues when switching between containers.

cruftex commented 8 years ago

It's not totally clear or easily digestible what you are aiming at. Can you please add a code example and the possible ambiguous outcomes?

Please mind the spec on page 53:

Implementations may also perform key and value type checking at runtime for mutative Cache operations.

Type checking is not mandatory and there are no TCK tests for runtime type checks.

jerrinot commented 8 years ago

see this: https://gist.github.com/ivannov/258ce072b01d0c55dcbc9079c50458d5

jerrinot commented 8 years ago

@cruftex: the spec says type checking is not mandatory for mutative operations.

However, when a cache is created with type parameters, it always has to be obtained via getCache(String cacheName, Class<K> keyType, Class<V> valueType) and not via getCache(String cacheName) on CacheManager.

Chapter 11.2.1 specifies how a cache name should be generated. This implies the cache is supposed to also be accessed directly (i.e. it's not just an internal concern of the interceptor/proxy).

However, the spec does not mandate whether the annotation interceptors should use a cache with type checking or a raw cache. Without knowing this you do not know which getCache() method you should call to access it: the one with type parameters or the one without?
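To make the ambiguity concrete, here is a minimal sketch (the cache name "person-cache" and the Long/String types are made up for illustration). A cache created with specific types can only be obtained through the typed getCache() variant; the untyped variant throws:

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.configuration.MutableConfiguration;

public class TypedLookup {
  public static void main(String[] args) {
    CacheManager manager = Caching.getCachingProvider().getCacheManager();

    // Cache created with specific key/value types.
    manager.createCache("person-cache",
        new MutableConfiguration<Long, String>().setTypes(Long.class, String.class));

    // Works: the typed accessor matches the configured types.
    Cache<Long, String> typed = manager.getCache("person-cache", Long.class, String.class);

    // Throws IllegalArgumentException per the CacheManager.getCache(name) JavaDoc,
    // because the cache was configured with types other than Object.class.
    Cache<Object, Object> raw = manager.getCache("person-cache");
  }
}

An interceptor that only knows the cache name has to pick one of these two calls, and the spec does not say which.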

cruftex commented 8 years ago

I think I got it. So it boils down to:

With annotations the usable cache typing of a requested cache is only Object,Object. There is no way to specify more concrete types, and nothing defines that the types are inferred from the method signatures.

Correct?

I still wonder whether it might be a Hazelcast issue or, more generally, a specific implementation issue.

jerrinot commented 8 years ago

@cruftex: that's correct.

I wonder about the same. There is a chance it's a bug in the CDI glue used in Payara. It's not clear to me from the spec, so the spec should be refined.

cruftex commented 8 years ago

Hmm, CacheManager.getCache says:

Implementations must check if key and value types were configured for the requested {@link Cache}. If either the keyType or valueType of the configured {@link Cache} were specified (other than Object.class) an {@link IllegalArgumentException} will be thrown.

I would say this rule only applies if the configuration of the different types is happening by means defined by JSR107, which would yield an ambiguous situation. OTOH, it is always the case that a cache implementation will not be able to store every object type (e.g. non-serializable ones). There are always restrictions that might apply. So I think it is legal that an implementation has a separate type configuration inside the implementation configuration, outside the JSR107-defined scope, that restricts the types and does not follow the rule above.

So, if an implementation wants to leverage more concrete type information that it gets through its own configuration, it can do so. The TCK only checks the typing defined by the mechanisms of JSR107. What exactly should happen with the additional type information inside the implementation isn't in the scope of the Spec anyway.

That said, as I can see from your gist, Hazelcast refuses to return the cache since the configured types do not match (and strictly follows the rule above). Maybe Hazelcast can do something more useful and still be Spec compliant? Did you open an issue at Hazelcast?

OTOH, I think we both agree that the Spec should ensure a consistent and practical behavior. I just doubt that we can do this within a reasonable effort in a maintenance release.

cruftex commented 8 years ago

Specifying the type just with the class is limited anyway, e.g. there is no way to have the complete type information for generic types such as List<String>. We had a discussion about that here: https://github.com/cache2k/cache2k/issues/11. So, at least within the cache2k API, there is a way to capture the whole type information.
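For reference, the usual way to capture complete generic type information in plain Java is the "super type token" pattern. This is only a generic sketch of that pattern, not the cache2k API:

import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Super type token: an anonymous subclass preserves the generic type argument,
// so the full type (e.g. List<String>) can be read back via reflection.
public abstract class TypeReference<T> {

  private final Type type;

  protected TypeReference() {
    ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
    type = superType.getActualTypeArguments()[0];
  }

  public Type getType() {
    return type;
  }
}

// Usage: Type t = new TypeReference<java.util.List<String>>() {}.getType();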

jerrinot commented 8 years ago

@cruftex: I'm not sure if I'm fully following you.

Hazelcast does not provide an implementation of the CDI interceptors for the caching annotations. Hazelcast provides an implementation of CacheManager.

Payara (the app server used in the gist I linked) implements the CDI caching interceptors and uses the public imperative JSR107 API to talk to Hazelcast (CachingProvider, CacheManager, Cache, etc.).

As far as I know the other containers (Spring, TomEE) do the same: they also use the public JSR107 API to implement the behaviour specified by the JSR107 annotations. Thanks to this design the JSR107 provider is independent of the container implementing the annotations and vice versa: you can have implementations for Guice, Spring, or CDI, and they can use Hazelcast, cache2k or any other JSR107-compliant provider.

In other words: I use JCache annotations to store some data in a cache. I also want to have full imperative access to the same cache via CacheManager. How can I tell which getCache() method I should call? The one with types, or the one with raw access?
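A minimal sketch of that situation, with a hypothetical cache name "weather" and a hypothetical WeatherService; whether the injected CacheManager is the same one the interceptor uses is container-specific:

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.annotation.CacheResult;
import javax.inject.Inject;

public class WeatherService {

  @Inject
  CacheManager cacheManager; // container-specific whether this matches the interceptor's manager

  // Annotation side: the interceptor stores results in the cache named "weather".
  @CacheResult(cacheName = "weather")
  public String forecast(String city) {
    return "..."; // stands in for an expensive lookup
  }

  // Imperative side: which getCache() call reaches the same cache instance?
  public void evictAll() {
    Cache<Object, Object> cache = cacheManager.getCache("weather"); // raw access ...
    // ... or getCache("weather", <keyType>, <valueType>)? The spec does not say.
    cache.removeAll();
  }
}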

I also agree the type configuration options provided by the spec are somewhat clumsy, but that's a different topic.

cruftex commented 8 years ago

Okay, we are actually talking about a bunch of different requirements here. We need to structure that a little.

1) Type information on the annotated methods should be transported through to the cache implementation

Although the Spec says that type information is derived from the signature for constructing the cache name, it does not specify that this must also be used to retrieve the cache. There is also no TCK test for this.

The cache implementation can make use of the additional type information, e.g. by doing runtime checks or by selecting serializers more effectively.

2) It should be possible to retrieve the used Cache instance

Requirement 2 is actually addressed by the Spec: if you want to be able to retrieve the cache, you can define the behavior of the container by using a specific CacheResolverFactory (a sketch follows below). Then you can use the resolver factory yourself, too. I doubt that this will work consistently for all caching implementations and annotation implementations, since I cannot find TCK tests for this.
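As an illustration, here is a sketch of a CacheResolverFactory that pins resolution to getCache(name) on one well-known CacheManager, so the application can later retrieve the very same instance with the same call. The class name is made up and this is just one possible implementation, not something the Spec mandates:

import java.lang.annotation.Annotation;

import javax.cache.Cache;
import javax.cache.CacheManager;
import javax.cache.Caching;
import javax.cache.annotation.CacheInvocationContext;
import javax.cache.annotation.CacheMethodDetails;
import javax.cache.annotation.CacheResolver;
import javax.cache.annotation.CacheResolverFactory;
import javax.cache.annotation.CacheResult;
import javax.cache.configuration.MutableConfiguration;

// Resolves every annotated cache via getCache(name) on one well-known CacheManager.
public class ExplicitCacheResolverFactory implements CacheResolverFactory {

  private final CacheManager cacheManager =
      Caching.getCachingProvider().getCacheManager();

  @Override
  public CacheResolver getCacheResolver(CacheMethodDetails<? extends Annotation> details) {
    return resolverFor(details.getCacheName());
  }

  @Override
  public CacheResolver getExceptionCacheResolver(CacheMethodDetails<CacheResult> details) {
    // May be an empty name if no exception cache is configured on the annotation.
    return resolverFor(details.getCacheAnnotation().exceptionCacheName());
  }

  private CacheResolver resolverFor(final String cacheName) {
    if (cacheManager.getCache(cacheName) == null) {
      // Create an untyped cache on demand if it does not exist yet.
      cacheManager.createCache(cacheName, new MutableConfiguration<Object, Object>());
    }
    return new CacheResolver() {
      @Override
      public <K, V> Cache<K, V> resolveCache(CacheInvocationContext<? extends Annotation> context) {
        return cacheManager.getCache(cacheName);
      }
    };
  }
}

Such a factory could then be referenced via @CacheDefaults(cacheResolverFactory = ...) or on the individual annotations, which makes the resolution behavior explicit instead of relying on container defaults.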

3) Annotation and the direct API can be mixed

You try to mix different APIs, which would be requirement 3. What you probably want is requirement 2 (see the XY problem...).

The Spec says, e.g. in the JavaDoc from CacheDefaults:

The default resolver pair will resolve the cache by name from the default {@link CacheManager}

That could be more specific and say whether it is getCache(name) or getCache(name, keyType, valueType).

But how do you get the CacheManager? That you can access the same or the default CacheManager via @Inject CacheManager cacheManager; isn't defined by the Spec either. So it is behavior specific to the container. That could be addressed by the EE integration, which still needs to be done.

That said, updating the Spec and enforcing that the type information is used will solve requirement 1. It will probably solve your problem in your existing environment. But it will not ensure consistent behavior across all containers.

I would suggest to handle it as two different things:

Use this issue to discuss whether the derived type information must be used (not necessarily solving your problem, as discussed above).

Add another issue to discuss whether there should be an easy way to inject/access a cache instance, e.g. by defining a @Cache annotation. Actually the current EE integration prototype tries to do this via CDI, but shouldn't this be in the JSR107 Spec?!

jerrinot commented 8 years ago

You did a great job analysing my not-so-structured thoughts, well done!

ad 1.: you wrote: "Although the Spec says that type information is derived from the signature for constructing the cache name, it is not specifying that this must also be used to retrieve the cache."

However, the JavaDoc of the getCache() version w/o type parameters says:

   * @throws IllegalArgumentException if the {@link Cache} was configured with
   *                                  specific types, this method cannot be used

Am I still missing something?

ad 2: agreed

ad 3: I agree with the 1st part. "Just" amending the spec to mandate which version of getCache() to use to get a cache would be a great improvement!

You also opened a new question, "How to get a reference to a CacheManager in a managed environment?", but I see that as another topic. For example, it's trivial if I use Spring.

EDIT: Nevermind, I misunderstood you in the first point. Now I understand that you are suggesting that the type information of the intercepted method may or may not be used to retrieve the cache. Is this correct?

cruftex commented 8 years ago

To be honest, I feel a little lost here. Since I do not use EE and the cache annotations myself, it always takes me some time to get back into the concepts. Anyhow, I poked around a little bit more and thought about how this could make sense.

The Spec says, in the JavaDoc from CacheDefaults:

The default resolver pair will resolve the cache by name from the default {@link CacheManager}

I think that is the main conceptual statement we need to deal with. It raises two questions: "What types are used?" and "Which CacheManager?"

Used Types for Default Cache Resolution

There are two conceptual mismatches. Let me elaborate on them:

The configured types in the cache for type enforcement and the types the application operates on may be different. In case the cache is configured with Object, Object, you may use it with arbitrary types.

The effective types may be subtypes, too. If you define caching DAOs for each entity, you might want to have one DAO per entity but a single cache shared by all entities. So the cache would have the configured types EntityId, Entity, but the DAO would have a method like update(PersonId id, Person person). I think that is not uncommon, since you do not want to configure 20 caches (with resource limits) for every entity type you have. This means that deriving the correct types from the method signature is not generally useful either.
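To illustrate with hypothetical EntityId/Entity/PersonId/Person classes: the shared cache is typed with the supertypes, while the DAO signature mentions the concrete subtypes, so the signature types are the wrong ones to hand to getCache():

import javax.cache.Cache;

class EntityId { }
class Entity { }
class PersonId extends EntityId { }
class Person extends Entity { }

class PersonDao {

  private final Cache<EntityId, Entity> sharedCache; // configured as EntityId, Entity

  PersonDao(Cache<EntityId, Entity> sharedCache) {
    this.sharedCache = sharedCache;
  }

  // The signature suggests PersonId/Person, but the cache is typed with the
  // supertypes; deriving the cache types from this signature would be wrong.
  public void update(PersonId id, Person person) {
    sharedCache.put(id, person);
  }
}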

While it is desirable to infer the types from the parameters and pass them to the cache, it would do more harm than good.

In a real usage scenario with Spring or an EE container, createCache() with the JSR107 configuration object probably isn't used; instead, the cache-specific configuration is relied on. The type enforcement in the JSR107 universe does not need to be used; a cache implementation must have its own parameters for type checking anyway if it wants to do it. In case an application requests a cache with types Object, Object, an implementation can refuse to operate, or it can return a cache and enforce the type checking configured by its own mechanisms. Only the latter behavior makes sense, since in practice the usable types are always restricted: store-by-value is the only non-optional storage semantic, and a cache could only "store" an arbitrary object by its reference.

=>

For what we have right now, I think we are better off defining Object, Object as the standard when annotations are used.

The only right solution would have been to consistently make the type declarable at annotation level, too. We cannot fix that in 1.x...

Used CacheManager for Default Cache Resolution

Again, the Spec says, in the JavaDoc from CacheDefaults:

The default resolver pair will resolve the cache by name from the default {@link CacheManager}

This could mean the container default, or the default as defined by the imperative API. If the meaning/interpretation is the "imperative API default", this means the annotation processor is tied to Caching.getCachingProvider().getCacheManager(), and the container has no way to select a different provider or configuration for a specific application.

=>

It is more practical to allow a container to have parameters for its default provider, URI, etc. that it uses for a specific application. This also means that a container would have a container-specific way to get the CacheManager of an application. I think that is okay. This could be addressed by an EE integration standard, to ensure consistency in the EE world.

Wrap Up

I'll just remember Mark Reinhold saying something like "We can only make things less painful".

Proposal:

Relevant JavaDoc at CacheDefaults:

The default resolver pair will resolve the cache by calling {@link CacheManager.getCache(name)} from {@link CacheManager} that is defined as default in this context in the annotation processor. This includes the option that an annotation processor can define the {@code CacheManager} returned by {@code Caching.getCachingProvider().getCacheManager()} as its default when this is feasible. Using {@link CacheManager.getCache(name)} means that the cache types at API level are {@code Object, Object} and no specific type information is used.

JavaDoc at CacheManager.getCache(name) has at the moment the relevant sentence:

Implementations must check if key and value types were configured for the requested {@link Cache}. If either the keyType or valueType of the configured {@link Cache} were specified (other than Object.class) an {@link IllegalArgumentException} will be thrown.

Add here:

Cache annotation processors will resolve caches with this method by default, since there is no defined way types can be specified at annotation level. To allow this, cache implementations must be able to work without explicit type definitions by the standard configuration mechanism.

smillidge commented 8 years ago

The RI always does

createCache(cacheName, new MutableConfiguration<Object, Object>());

See https://github.com/jsr107/RI/blob/562f40e599550ba5cd3f822d298df808d9d9a528/cache-annotations-ri/cache-annotations-ri-common/src/main/java/org/jsr107/ri/annotations/DefaultCacheResolverFactory.java#L73 — that's why Payara does it this way.

cruftex commented 8 years ago

Thanks Steve, good hint; I missed citing that code. The complete passage is:

  @Override
  public CacheResolver getCacheResolver(CacheMethodDetails<? extends Annotation> cacheMethodDetails) {
    final String cacheName = cacheMethodDetails.getCacheName();
    Cache<?, ?> cache = this.cacheManager.getCache(cacheName);

    if (cache == null) {
      logger.warning("No Cache named '" + cacheName + "' was found in the CacheManager, a default cache will be created.");
      cacheManager.createCache(cacheName, new MutableConfiguration<Object, Object>());
      cache = cacheManager.getCache(cacheName);
    }

    return new DefaultCacheResolver(cache);
  }

It does CacheManager.getCache(name); the CacheManager.createCache call should actually never be executed in a real-world scenario.

smillidge commented 8 years ago

The actual cache annotations, however, always produce a GeneratedCacheKey as the key type, or something that implements that interface. So I suppose the key type should be GeneratedCacheKey; the value type could be difficult to determine in the general case.

cruftex commented 8 years ago

Hmm, it is hard to find arguments for the usefulness of changing to getCache(name, GeneratedCacheKey.class, Object.class). It is not a really meaningful type constraint.

We now have "any types" or "specific types". Using GeneratedCacheKey would introduce a third thing: "a little less than any types".

One interesting aspect would be, that a cache implementation could assume that every key is serializable.

I am afraid that this will cause additional harm to the users and/or cause other trouble. Users would need to configure that type for every cache, since on the JCache imperative API level the correct type must be used and known.

gregrluck commented 7 years ago

Looking into this with a real life example I have over at https://github.com/gregrluck/microprofile-conference.

cruftex commented 7 years ago

@gregrluck: Just looked at your code example. Good work, it is very important to take real scenarios. In the last commit you added the use of an annotation:

  @CacheRemove(cacheName="schedule")
  public void deleteSchedule(Long scheduleId) {
  }

Is this working?! According to the Spec the key type is always GeneratedCacheKey and the types get added to the name.

gregrluck commented 7 years ago

Haven't got far enough to know if it is working because Payara blew up. Have an answer to that now, so will let you know.


gregrluck commented 7 years ago

Got Payara working.

In the direct API we are using Long as the key. Annotation key types must be GeneratedCacheKey. In the example we have a LongKey which contains the Long but implements GeneratedCacheKey.

Interoperability between the two APIs is broken. You could hack it up by using GeneratedCacheKey implementations yourself as keys, but that is ugly.
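As a hypothetical sketch of that hack (not necessarily the LongKey in the linked example), a GeneratedCacheKey implementation wrapping the Long can be used as the key on the imperative side; for entries to actually be shared, the annotation side would also have to produce the same keys, e.g. via a custom CacheKeyGenerator, which is part of why it is ugly:

import javax.cache.annotation.GeneratedCacheKey;

// Key wrapper so the imperative API can use the same key representation
// as the annotation-driven side.
public class LongKey implements GeneratedCacheKey {

  private final Long value;

  public LongKey(Long value) {
    this.value = value;
  }

  @Override
  public int hashCode() {
    return value.hashCode();
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof LongKey && value.equals(((LongKey) o).value);
  }
}

// Imperative usage would then look like: scheduleCache.remove(new LongKey(scheduleId));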

I have reached out to Eric Dalquist who designed this to see what his idea was here.

To fix this we are going to have to change the Annotations API, which we cannot do in 1.1. So moving this to 2.0.

gregrluck commented 7 years ago

Does anyone know why we would need to keep GeneratedCacheKey? Why not make this Object?