kucci / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

EnumMap.entrySet() considered harmful: EnumMultiset.entrySet() is afffected #448

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

   static enum Foo { ONE, TWO, THREE }
   ...
   System.err.println("Oh no: " + Lists.newArrayList(
         HashMultiset.create(Arrays.asList(Foo.ONE, Foo.ONE, Foo.TWO, Foo.THREE)).entrySet()));

What is the expected output?
   Oh no: ONE x 2, TWO, THREE

What do you see instead?
   Oh no: THREE, THREE, THREE

What version of the product are you using? On what operating system?

  Guava r7, all.

Please provide any additional information below.

  This is the result of what I would consider a bug in EnumMap.entrySet().

  http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6312706

  Guava's AbstractMapBasedMultiset iterates over the Map.Entry objects from the backing map and returns a Multiset.Entry wrapper around each. However, with EnumMap's entrySet, only one Entry is ever removed. The Multiset creates N wrappers for this single entry...

  I came across this while trying to sort entries by highest count.

Workaround:

  Use a HashMultiset instead.

Original issue reported on code.google.com by ray.j.gr...@gmail.com on 11 Oct 2010 at 11:20

GoogleCodeExporter commented 9 years ago
Ooops, excuse me. The "Steps to reproduce the problem" should be using an 
EnumMultiset:

System.err.println("Oh no: " + Lists.newArrayList(
         EnumMultiset.create(Arrays.asList(Foo.ONE, Foo.ONE, Foo.TWO, Foo.THREE)).entrySet()));

Original comment by ray.j.gr...@gmail.com on 11 Oct 2010 at 11:21

GoogleCodeExporter commented 9 years ago
Gawd, I hate Maps that do this horrible trick.  There used to be a lot of them, 
but most were fixed, and I had been hoping maybe they had all been fixed.  
Sigh.  Have you done a check for any other JDK Maps that do this, by chance (so 
I don't have to do it again)?

We'll look at working around this.

Original comment by kevinb@google.com on 5 Nov 2010 at 6:00

GoogleCodeExporter commented 9 years ago
A quick look reveals that java.util.IdentityHashMap also does this.

It might be best to avoid enumerating the types of Maps that do this. It's 
possible that the backing map is a wrapper around one of these broken types and 
has its own entry Iterator but returns the underlying Entries (or in this case, 
a singleton Entry), so even checking to see if the iterator == the entry is not 
going to work in that case (EnumMap and IdentityHashMap use the entrySet 
Iterator as the Map.Entry).

So there's no good way to know whether a Map does this dirty little trick.

...I was going to suggest that AbstractMapBasedMultiset could do a quick little 
test in an EntrySet.Iterator initializer: pull off two entries and see if 
they're the same object, and if so set a flag to do a copy of the entry. But 
that doesn't work if the backing map has only one element, unless you always 
assume the worst with one entry. (If you create the Multiset.Entry as now and 
the caller saves a reference to it, it would be corrupted later if the backing 
map had more entries and was iterated again.)

The safest thing would be to copy out the key, and re-query the backing map 
every time in getCount(), which is pretty damn non-ideal. Or you could change 
the entries returned to be one-time snapshots, as is allowed by the Multiset 
specification, but that would change the behavior of classes that have already 
left @Beta and so is probably a non-starter.

Ugh.

Original comment by ray.j.gr...@gmail.com on 5 Nov 2010 at 11:10

GoogleCodeExporter commented 9 years ago
Derp. Or, just hang on to the key and value separately, not the Map.Entry:

Index: AbstractMapBasedMultiset.java
===================================================================
--- AbstractMapBasedMultiset.java   (revision 142)
+++ AbstractMapBasedMultiset.java   (working copy)
@@ -100,23 +100,25 @@
       final Iterator<Map.Entry<E, AtomicInteger>> backingEntries
           = backingMap.entrySet().iterator();
       return new Iterator<Multiset.Entry<E>>() {
-        Map.Entry<E, AtomicInteger> toRemove;
+        AtomicInteger lastReturned;

         public boolean hasNext() {
           return backingEntries.hasNext();
         }

         public Multiset.Entry<E> next() {
-          final Map.Entry<E, AtomicInteger> mapEntry = backingEntries.next();
-          toRemove = mapEntry;
+          Map.Entry<E, AtomicInteger> mapEntry = backingEntries.next();
+          final E key = mapEntry.getKey();
+          final AtomicInteger value = mapEntry.getValue();
+          lastReturned = value;
           return new Multisets.AbstractEntry<E>() {
             public E getElement() {
-              return mapEntry.getKey();
+              return key;
             }
             public int getCount() {
-              int count = mapEntry.getValue().get();
+              int count = value.get();
               if (count == 0) {
-                AtomicInteger frequency = backingMap.get(getElement());
+                AtomicInteger frequency = backingMap.get(key);
                 if (frequency != null) {
                   count = frequency.get();
                 }
@@ -127,11 +129,11 @@
         }

         public void remove() {
-          checkState(toRemove != null,
+          checkState(lastReturned != null,
               "no calls to next() since the last call to remove()");
-          size -= toRemove.getValue().getAndSet(0);
+          size -= lastReturned.getAndSet(0);
           backingEntries.remove();
-          toRemove = null;
+          lastReturned = null;
         }
       };
     }

Of course, this might screw you if your backing map has weak keys or 
something...

Original comment by ray.j.gr...@gmail.com on 11 Nov 2010 at 1:18

GoogleCodeExporter commented 9 years ago
Ray, can you verify that this is fixed as of r9?

Original comment by jim.andreou on 4 Apr 2011 at 4:44

GoogleCodeExporter commented 9 years ago
I haven't had a chance to verify, but the key and value are now held 
separately, so it should work.

In fact, it looks like you directly applied my patch, so it must work! (wink. I 
never tested my patch.)

Original comment by ray.j.gr...@gmail.com on 5 Apr 2011 at 9:44

GoogleCodeExporter commented 9 years ago
Huh? Not really. Note that we can't just apply a contributed patch without the 
author knowing about it (paperwork is involved!). In any case, the fix is 
different and hopefully more thorough: Now there is a package-private 
WellBehavedMap class, which wraps our Map instances which contains the buggy 
entrySet(), and merely reimplements the broken method. So this should also fix 
EnumBiMap, EnumHashBiMap along with EnumMultiset. But it would be nice if an 
actual user verified I didn't mess up anything. :)

Original comment by jim.andreou on 6 Apr 2011 at 2:02

GoogleCodeExporter commented 9 years ago
Durr, I was looking in my read-only svn checkout and saw my own patch.

Original comment by ray.j.gr...@gmail.com on 6 Apr 2011 at 4:46

GoogleCodeExporter commented 9 years ago
Ok, WellBehavedMap is a nice simple sure-fire solution, but it makes me a 
little sad that every Entry.getValue() is a get() on the underlying Map. What 
if one didn't need the Entries to be unique objects during iteration? But I 
suppose this is only going to be used around EnumMaps, so it'll all be fine.

Anyway, it works fine for me. Thanks!

Original comment by ray.j.gr...@gmail.com on 6 Apr 2011 at 5:33

GoogleCodeExporter commented 9 years ago
EnumMap is supposed, heck, the whole point of the type is to have a very fast 
get(), so you can stop worrying indeed. :-) If we used internally some 
IdentityHashMap (we don't), I would still be inclined to wrap it if its 
entrySet() was exposed. Thx for verifying!

Original comment by jim.andreou on 6 Apr 2011 at 3:01

GoogleCodeExporter commented 9 years ago
Just noticed a JDK7 change fixing this behavior in EnumMap and IdentityHashMap: 
http://hg.openjdk.java.net/jdk7/tl/jdk/rev/c1e87a18e46a

Original comment by cgdec...@gmail.com on 6 Apr 2011 at 5:17

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 8 Apr 2011 at 2:08

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:15

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:09