google / guava

Google core libraries for Java
Apache License 2.0
50.14k stars 10.89k forks source link

What should happen when serializing a view collection of an *immutable* collection/map/etc.? #1145

Open gissuebot opened 9 years ago

gissuebot commented 9 years ago

Original issue created by wes...@cutterslade.ca on 2012-09-12 at 03:26 PM


I am writing an application which exposes an interface using RMI. One of the methods of this interface returns ImmutableSet<String>. When this method is invoked I receive a NotSerializableException. After some investigation I discovered that the ImmutableSet was originating from a call to ImmutableMap.keySet() invoked on an ImmutableMap whose values are not serializable. Looking at the source for ImmutableMapKeySet, it's clear that when serialized, the entire original map is serialized as well.

This seems like it could have negative performance characteristics as well as the issue I described above. Maps are frequently made up of small keys and large values. If the values are serializable, this could easily result in far more data being transmitted than is expected or required.

Due to the restricted access on many of the classes in guava, and the reluctance to actually copy an ImmutableSet (both decisions which I understand and support), the work around for this problem is more confusing and fragile than one might expect. I have added the following methods and call fixImmutableSet() on any ImmutableSet objects returned:

  // Have to use class names because the classes are not visible.
  // These class names may change in future versions of guava without notice.
  private static final ImmutableSet<String> SAFE_IMMUTABLE_SET_NAMES = ImmutableSet
      .of("com.google.common.collect.RegularImmutableSet",
          "com.google.common.collect.SingletonImmutableSet",
          "com.google.common.collect.EmptyImmutableSet");

  private <T> ImmutableSet<T> fixImmutableSet(final ImmutableSet<T> set) {
    final ImmutableSet<T> safeSet;
    if (!SAFE_IMMUTABLE_SET_NAMES.contains(set.getClass().getName())) {
      safeSet = ImmutableSet.copyOf(set.iterator());
    }
    else {
      safeSet = set;
    }
    return safeSet;
  }
gissuebot commented 9 years ago

Original comment posted by lowasser@google.com on 2012-09-12 at 03:56 PM


From what I can tell, I'm pretty sure that

private <T> ImmutableSet<T> fixImmutableSet(final ImmutableSet<T> set) {   return ImmutableSet.copyOf(set); }

will actually work just fine here. The keySet of an ImmutableMap is marked as a "partial view," and will actually get a full copy, but the "pure" ImmutableSet implementations -- including Empty, Singleton, and Regular, as you've mentioned -- will be passed through without being copied.

gissuebot commented 9 years ago

Original comment posted by kevinb@google.com on 2012-09-12 at 10:06 PM


I have the vague feeling that I'm forgetting something, but this seems wrong to me. For a mutable collection, there's no good answer to whether serializing a view should serialize the whole backing data or not, and that's a large part of the reason why virtually none of our mutable collection views are serializable. But here? I can't think of any advantage to serializing the whole backing map. Serializing the key set ought to serialize only the keys and when you deserialize it you get a RegularImmutableSet out the other end. Am I missing something?


Status: Research Labels: Package-Collect, Type-Defect

cpovirk commented 4 months ago

We tried changing the serialized form for keySet to contain only the keys a few years back, but we ran into trouble because users were passing serialized forms between different versions of Guava. (We say not to do that, but we didn't go to the effort of intentionally breaking it with each release, so now some people do it.) There is probably a way to make everything still work, at least over a long enough time horizon, but it remains unlikely that we'll get back to it unless something changes.