mkodekar / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Immutable collections do not guard against wrong Collection.toArray() method #1855

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
ImmutableList etc. can be changed after construction if a collection is passed 
to copyOf() that keeps a reference to the array returned by toArray(). If the 
resulting array is of correct size and has more than one element, it is 
directly used as the backing array of the immutable collection. I attached a 
file that shows this. The same happens in other places as well, for example in 
Ordering.immutableSortedCopy().

Such a collection violates the contract of Collection.toArray() which 
explicitly mentions that the collection may not retain a reference to the 
array. The only way I could imagine this to happen accidentally might be 
someone who needs to convert a lot of data back and forth between a List and an 
array, and implements something like Arrays.asList() but returning the internal 
array directly in toArray(). Another possibility would be some third-party code 
that maliciously tries to mangle with internal data structures.

The fix for this would be to do a second copy in these cases, which would 
probably be a major performance decrease for many users. In our project we do 
not need this additional protection as we do not have untrusted code, and would 
certainly prefer the speed of the current solution instead. I guess this is 
true for a vast majority of users.

A variant of this fix would be to whitelist collections from the java.* and 
com.google.* packages which are known to be well-behaved in this regard, and do 
a second copy only for other collections. This would be quite ugly and still 
costs performance for all users that pass in their own collections.

Thus I suggest to keep the current behavior, and instead document this.
Currently the wiki states that Guavas immutable collection are "Safe for use by 
untrusted libraries", which I would interpret for example such that it is safe 
to take an immutable collection returned by some untrusted library and use it 
directly.
This is not true, regardless of whether the library directly returns an 
immutable collection, or whether it returns an arbitrary collection and my code 
calls copyOf().
Instead, users retrieving collections from untrusted sources and copying them 
into an immutable collection need to do a second copy themselves with 
ImmutableList.copyOf(collection.toArray(new T[0])). (Using toArray() would lose 
the element type, copying into a List would require yet another copy, and using 
copyOf(copyOf()) of course does not work because the second call does not copy 
again.)

Original issue reported on code.google.com by phwend...@gmail.com on 24 Sep 2014 at 6:41

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by kak@google.com on 24 Sep 2014 at 9:33

GoogleCodeExporter commented 9 years ago
One thing that could be done is

    Object[] array = new Object[elements.size()];
    checkState(array == elements.toArray(array));
    return construct(array);

instead of

    return construct(elements.toArray());

I'm not really sure it's worth it just to handle invalid Collection 
implementations though.

Original comment by tavianator@gmail.com on 24 Sep 2014 at 6:18

GoogleCodeExporter commented 9 years ago
Calling toArray(array) will help with unintentional bugs. It's worth 
consideration, though I wonder if it has performance disadvantages for some 
collections.

The problem of malicious implementations is one that we've intentionally 
decided not to tackle. We do need to fix our docs.

Original comment by cpov...@google.com on 24 Sep 2014 at 6:24

GoogleCodeExporter commented 9 years ago
I believe the sense in which the wiki page meant things was that it was safe to 
*pass* ImmutableCollections to untrusted libraries, though there obviously 
isn't anything we can do about reflection.

Original comment by lowas...@google.com on 24 Sep 2014 at 6:26

GoogleCodeExporter commented 9 years ago
Passing an own array would not help against malicious implementations as they 
could grab the array anyway. Concerning contracts, it's even worse as I can't 
see anything like "no references to it are maintained by this collection" in 
the javadoc of `Collection.toArray(T[])`, though it's somehow obvious.

Original comment by Maaarti...@gmail.com on 25 Sep 2014 at 4:29

GoogleCodeExporter commented 9 years ago
Maaartinus, what do you mean?  The Collection.toArray Javadoc states:
> The returned array will be "safe" in that no references to it are maintained 
by this collection. (In other words, this method must allocate a new array even 
if this collection is backed by an array). The caller is thus free to modify 
the returned array.

Original comment by wasserman.louis on 25 Sep 2014 at 4:33

GoogleCodeExporter commented 9 years ago
I mean the proposal from comment #2 to use *the other* toArray method which 
gives no such guarantee. Doing something like

<T> T[] toArray(T[] a) {
    this.gotcha = super.toArray();
    return this.gotcha;
}

would violate common sense, but no explicitly stated contract.

Original comment by Maaarti...@gmail.com on 25 Sep 2014 at 5:11

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:08

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:17

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07