jsr107 / jsr107spec

JSR107 Cache Specification
Apache License 2.0
413 stars 164 forks source link

Cache.iterator() clarifications and corrections #337

Closed cruftex closed 8 years ago

cruftex commented 8 years ago

The current JavaDoc is:

  /**
   * {@inheritDoc}
   * <p>
   * The ordering of iteration over entries is undefined.
   * <p>
   * During iteration, any entries that are a). read will have their appropriate
   * CacheEntryReadListeners notified and b). removed will have their appropriate
   * CacheEntryRemoveListeners notified.
   * <p>
   * {@link java.util.Iterator#next()} may return null if the entry is no
   * longer present, has expired or has been evicted.
   *
   * @throws IllegalStateException if the cache is {@link #isClosed()}
   */
  Iterator<Cache.Entry<K, V>> iterator();

Proposed Change / CacheEntryReadListener

There is no CacheEntryReadListener. Remove.

Proposed Change / next() may yield null

{@link java.util.Iterator#next()} may return null if the entry is no longer present, has expired or has been evicted.

This is against the iterator interface contract. If implementations really do this, all applications would be very error prone, since there are no null checks. Suggest removal of this sentence.

Implementations need to check for expiry before returning true to hasNext() and then might return an expired entry on the call to next(). That is a conceptional problem in the iterator interface. But, since there is always time between the check in the cache and the further processing in the application it does not matter.

For clarification add:

Cache implementation may not return expired entries, that expired just before the call to hasNext()

Proposed Change / Thread Safety

The Spec is silent about thread safety and the iterator. The text above on Iterator.next() actually implies that an iterator is usable while the cache is mutated, but it is not explicit and clear since its a 'may'. I propose to make this very explicit, since a non thread safe iterator on a cache is useless.

Text proposal:

An iterator instance can be used while cache mutation takes place concurrently. All entries present in the cache at the time of the call to iterator() will be iterated, but updates, removals or expiry of entries after the iterator() was called, may or may not affect the ongoing iteration. Every entry is iterated only once.

ljacomet commented 8 years ago

While working on Ehcache 3 we have had to go around the peculiarities of the iterator contract as expressed in the spec. We would certainly be in favour of changing the hasNext() and next() contracts as proposed above.

As for the thread safe precision, yes it is a good idea but I think the proposition should be limited to its first sentence. Especially the second sentence needs to be removed as it seems to indicate that all entries present at the time of iterator() call will be returned, which is incorrect just if taking into account expiry.

cruftex commented 8 years ago

@ljacomet / Louis,

Thanks for the feedback! I made a correction of the text and an alternative proposal.

cruftex commented 8 years ago

Another aspect: Is an entry iterated exactly once?

cruftex commented 8 years ago

Updated text proposal to contain: Every entry is iterated exactly once.

gregrluck commented 8 years ago

Proposed Change / CacheEntryReadListener - There is no CacheEntryReadListener. Remove. Agree

Do not agree with the other strengthening changes. Compare to CHM which says that iteration of a ConcurrentMap's EntrySet is undefined if mutations to the underlying map are occurring.

cruftex commented 8 years ago

I agree to close, too. Making to strong guarantees what is going on while concurrent updates is happening, may be a problem for some implementors, e.g. "What is happening during iteration if the topology change s". This may cause a huge engineering effort but the usefulness is unclear.

Will open another issue for the null.