jsr107 / jsr107spec

JSR107 Cache Specification
Apache License 2.0

Operations on a cache need to call ExpiryPolicy methods every time. Why?! #325

Closed cruftex closed 9 years ago

cruftex commented 9 years ago

For example the method ExpiryPolicy.getExpiryForAccess() states in the method comment:

This method is called by a caching implementation after a Cache.Entry is accessed to determine the {@link Duration} before an entry expires.

Exactly this behaviour is checked by TCK. Every call to Cache.get() on a cache must result in a call to ExpiryPolicy.getExpiryForAccess().
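The behaviour described above can be illustrated with a small, self-contained sketch. It does not use the real `javax.cache` classes: `AccessExpiry`, `CountingAccessExpiry`, and `ToyCache` are hypothetical stand-ins that only mirror the no-argument shape of `getExpiryForAccess()`, and `java.time.Duration` replaces the JCache `Duration` type. The assumption that only a `get()` which finds an entry consults the policy is made for this toy, not taken from the TCK source.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the access-related part of an expiry policy.
// It is NOT javax.cache.expiry.ExpiryPolicy; it only mirrors the no-argument shape.
interface AccessExpiry {
    Duration getExpiryForAccess();
}

// Policy that counts how often it is consulted.
final class CountingAccessExpiry implements AccessExpiry {
    final AtomicInteger calls = new AtomicInteger();

    @Override
    public Duration getExpiryForAccess() {
        calls.incrementAndGet();
        return Duration.ofMinutes(5); // arbitrary value for illustration
    }
}

// Toy cache (hypothetical) that consults the policy on every get() that finds an entry.
final class ToyCache<K, V> {
    private final Map<K, V> store = new HashMap<>();
    private final AccessExpiry policy;

    ToyCache(AccessExpiry policy) {
        this.policy = policy;
    }

    void put(K key, V value) {
        store.put(key, value);
    }

    V get(K key) {
        V value = store.get(key);
        if (value != null) {
            policy.getExpiryForAccess(); // a real cache would apply the returned duration
        }
        return value;
    }
}

class CountingDemo {
    public static void main(String[] args) {
        CountingAccessExpiry policy = new CountingAccessExpiry();
        ToyCache<String, String> cache = new ToyCache<>(policy);
        cache.put("k", "v");
        cache.get("k");
        cache.get("k");
        cache.get("missing"); // miss: nothing was accessed, so the policy is not consulted
        System.out.println(policy.calls.get()); // prints 2
    }
}
```

A test in this style can only observe the number of calls; it says nothing about whether the returned duration ever differs between calls, which is the point the question raises.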

In a previous version the key was a parameter, so the policy could return different durations for different keys. Later, the key parameter was removed. Is the current requirement simply a mistake?

Is there any sense in calling ExpiryPolicy.getExpiryForAccess() maybe a million times per second and expecting another duration to be returned?

Is there a defined way to reach an ExpiryPolicy instance by the application, so this can be used to change the policy during an application run?

brianoliver commented 9 years ago

In a previous version the key was a parameter, so the policy could return different durations for different keys. Later, the key parameter was removed. Is the current requirement simply a mistake?

In the very original design the "entry" was provided to the expiry policies. This was then changed to being just the "key". However after feedback from the implementing providers the key was completely removed as in distributed / store-by-value implementations, providing the entry or key could cause every access or mutation on a Cache to unnecessarily deserialize entries/key entries. This is obviously undesirable.

Is there any sense in calling ExpiryPolicy.getExpiryForAccess() maybe a million times per second and expecting another duration to be returned?

Yes. Especially when an expiry is adjusted / changed by a management operation. Imagine an MBean that is used to control the expiry of an entry. How would such a use-case be solved without an ExpiryPolicy being consulted?
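A minimal sketch of that use case, under stated assumptions: `AdjustableAccessExpiry` and `TouchCache` are hypothetical names, not part of JCache, and the setter stands in for whatever a management bean would call. The time is passed in explicitly so the behaviour is deterministic, and the toy reuses the access duration at creation time, whereas the real `ExpiryPolicy` has a separate `getExpiryForCreation()`.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical policy whose access expiry can be changed while the cache is in use,
// for example by a management operation.
final class AdjustableAccessExpiry {
    private final AtomicReference<Duration> current;

    AdjustableAccessExpiry(Duration initial) {
        this.current = new AtomicReference<>(initial);
    }

    Duration getExpiryForAccess() {
        return current.get();
    }

    void setAccessExpiry(Duration updated) {
        current.set(updated);
    }
}

// Toy cache (hypothetical) that re-reads the policy on every successful access.
final class TouchCache<K, V> {
    private static final class Slot<T> {
        final T value;
        long expiresAtMillis;

        Slot(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Slot<V>> store = new HashMap<>();
    private final AdjustableAccessExpiry policy;

    TouchCache(AdjustableAccessExpiry policy) {
        this.policy = policy;
    }

    // Simplification: the access duration is also used as the creation expiry.
    void put(K key, V value, long nowMillis) {
        store.put(key, new Slot<>(value, nowMillis + policy.getExpiryForAccess().toMillis()));
    }

    V get(K key, long nowMillis) {
        Slot<V> slot = store.get(key);
        if (slot == null) {
            return null;
        }
        if (nowMillis >= slot.expiresAtMillis) {
            store.remove(key); // entry has expired
            return null;
        }
        // Consult the policy again, so a changed duration takes effect on this access.
        slot.expiresAtMillis = nowMillis + policy.getExpiryForAccess().toMillis();
        return slot.value;
    }
}

class AdjustableDemo {
    public static void main(String[] args) {
        AdjustableAccessExpiry policy = new AdjustableAccessExpiry(Duration.ofSeconds(10));
        TouchCache<String, String> cache = new TouchCache<>(policy);
        cache.put("k", "v", 0L);                      // expires at t = 10000 ms
        policy.setAccessExpiry(Duration.ofSeconds(1)); // management change at runtime
        System.out.println(cache.get("k", 5_000L));   // hit; expiry is now t = 6000 ms
        System.out.println(cache.get("k", 6_500L));   // past the new expiry
    }
}
```

If the cache had read the duration only once, the entry in this sketch would still be alive at 6500 ms; consulting the policy on each access is what lets the shorter duration apply.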

Is there a defined way to reach an ExpiryPolicy instance by the application, so this can be used to change the policy during an application run?

JCache specification doesn't attempt to define how this happens for any pluggable interface (eg: CacheLoaders/Writers and so on). In most cases it's application and/or implementation dependent.

cruftex commented 9 years ago

I am confused :(

So you say the standard needs to have these semantics to make it possible to adjust expiry values, but, OTOH there is no standard way to do it?

So we actually have two issues:

  1. The standard wants to address the use case that expiry durations can be changed during runtime, but it is unclear how to do it. If the standard wants to address it, I would expect that there is a standard way.
  2. As I read from Brian's comments: The TCK checks a behaviour that can only be exploited in a non-standard / implementation-dependent way. This means there is no check whether the behaviour can be exploited at all. This means that either we missed something, or that the respective checks can be removed without breaking anything.

brianoliver commented 9 years ago

Let's imagine an API that interfaces with a database. Let's call it JDBC.

Should the JDBC API / specification force Database implementors to store, configure, operationally control, represent internal information in a standard way? I think the answer is definitely "no".

The same thinking applies to operational characteristics of JCache. The API provides a mechanism to ask for expiry duration (cf: JDBC asking for a list of tables), but doesn't need to specify how the implementation must implement it or a developer must control it. It can't.

What the specification could do is say that the result of consecutive calls to evaluate Expiry may change, but that's probably about it.

In this case I think you're confusing the notion of "what" vs. "how". The intent of the specification is to say "what" it must do and not "how" it should be achieved. The specification states that we need to determine expiry; it doesn't say how it should be represented.

This approach is used in all specifications, JDBC, JMS, XA etc.

cruftex commented 9 years ago

I agree.

The standard right now does exactly what you argue against: Force specific operational behaviour on an implementation.

Why does the standard force an implementation to do something that is outside the scope of the standard?

brianoliver commented 9 years ago

The standard right now does exactly what you argue against: Force specific operational behaviour on an implementation.

No. The standard provides Java Developers using the JCache specification with the most basic cache semantics, as agreed with the Experts in the Expert Group and with the Java Platform Architects.

Providing a basic level of functionality that vendors have mutually agreed to implement is not "forcing specific operational behavior on an implementation".

Furthermore, the specification doesn't specify how it should be implemented (apart from supporting Java Serialization, which was requested by the Java Platform Architects). In fact it clearly specifies that alternative (vendor specific) implementations are possible. Hazelcast, Infinispan, Coherence et al. all implement the specification and provide additional capabilities on top of it.

They additionally provide store-by-reference, but that's just optional.

cruftex commented 9 years ago

Thanks for the discussion! I think we are basically on the same page. Can we agree on:

  1. The standard should address a use case
  2. The standard should give implementations the greatest possible freedom in how to make the use case happen, but, at the same time, give applications clearly defined (minimum) guarantees to provide them with the intended functionality for the use case
  3. The standard should also say what is outside its scope and what is implementation specific

brianoliver commented 9 years ago

If you/someone wants to create an implementation that uses the JCache API, but doesn't pass the TCK they are more than welcome to do so. They just can't say that it's JCache or JCache Compliant in any way. They can't say that they are "JCache Caches".

If they want to create a new type of Cache Configuration, say one that doesn't support store-by-reference or store-by-value (it does anything it likes), they too are more than welcome to do so. The specification is designed to permit this. In fact, most vendors, correct or otherwise, have provided alternative "non-standard" configurations, but those configurations can't claim to be JCache unless they pass the TCK.

cruftex commented 9 years ago

Yes, if I don't care about a good JCache standard.

I am opening the issue on the issue tracker, since I think the standard can be improved. Maybe something can be done in 1.x, maybe in 2.

That said, do you think that there is no possibility to improve here? Why not keep it open and encourage others to comment on it?

brianoliver commented 9 years ago

The issue tracker is designed to track issues, not have discussions. We have discussion forums for that ;)

If we followed your approach that every idea became an issue with an accompanied long discussion thread, our issue tracker would be filled with issues and not actual issues to resolve.

To be clear, these things create an immense amount of noise in the issue tracker, requiring extra resources for everyone to work through. Surely we can keep discussions about possible ideas in the discussion forums? Otherwise, what's the point of the discussion forums?

brianoliver commented 9 years ago

The standard should address a use case

We believe it does so. But not just "any use-case". It has to be a widely used use-case, that a group of Experts generally agree upon.

We've had a bunch of use-cases proposed, some very common, some esoteric. We had to weed out the common ones and yet at the same time provide some mechanism for the esoteric. We can't get everyone "right", but I'm sure we found a good balance.

The standard should give implementations the greatest possible freedom in how to make the use case happen, but, at the same time, give applications clearly defined (minimum) guarantees to provide them with the intended functionality for the use case.

I believe we've done this very well.

The standard should also say what is outside its scope and what is implementation specific

I believe we've done a good job here as well. We probably invested more time on this than anything else.

cruftex commented 9 years ago

The issue tracker is designed to track issues, not have discussions. We have discussion forums for that ;)

If we followed your approach that every idea became an issue with an accompanied long discussion thread, our issue tracker would be filled with issues and not actual issues to resolve.

This reads as: What is an issue or not an issue is defined by you only.

Still, this is a public issue tracker. If you want to track issues that you need to resolve and you want no discussions (=comments), why have a public issue tracker at all?

I think we are both giving each other a hard time. Why? There is no need for that.

Let's clarify:

I do NOT expect you or the EG at all to resolve or even comment on an issue I put on the JSR107 issue tracker. The purpose is to make it visible so everybody interested in this can comment on it and come up with a resolution idea, or, of course, say it is a non-issue.

If nobody else has an opinion on it or seconds a concrete proposal, this means there is no interest. I am totally fine with it being closed.

I expect that the EG welcomes contributions that are of interest to the caching community.

If nobody comments on an issue of somebody else, this can have various reasons:

I think mostly it is the first two. However, maybe we can improve here. At least I try also to dedicate some time and look at the other issues, clarify them or come up with resolution ideas.