migrator / guava-libraries

Guava: Google Core Libraries for Java 1.6+
0 stars 0 forks source link

Immutable collections do not guard against wrong Collection.toArray() method #1

Open migrator opened 9 years ago

migrator commented 9 years ago

ImmutableList etc. can be changed after construction if a collection is passed to copyOf() that keeps a reference to the array returned by toArray(). If the resulting array is of correct size and has more than one element, it is directly used as the backing array of the immutable collection. I attached a file that shows this. The same happens in other places as well, for example in Ordering.immutableSortedCopy().

Such a collection violates the contract of Collection.toArray() which explicitly mentions that the collection may not retain a reference to the array. The only way I could imagine this to happen accidentally might be someone who needs to convert a lot of data back and forth between a List and an array, and implements something like Arrays.asList() but returning the internal array directly in toArray(). Another possibility would be some third-party code that maliciously tries to mangle with internal data structures.

The fix for this would be to do a second copy in these cases, which would probably be a major performance decrease for many users. In our project we do not need this additional protection as we do not have untrusted code, and would certainly prefer the speed of the current solution instead. I guess this is true for a vast majority of users.

A variant of this fix would be to whitelist collections from the java.* and com.google.* packages which are known to be well-behaved in this regard, and do a second copy only for other collections. This would be quite ugly and still costs performance for all users that pass in their own collections.

Thus I suggest to keep the current behavior, and instead document this. Currently the wiki states that Guavas immutable collection are "Safe for use by untrusted libraries", which I would interpret for example such that it is safe to take an immutable collection returned by some untrusted library and use it directly. This is not true, regardless of whether the library directly returns an immutable collection, or whether it returns an arbitrary collection and my code calls copyOf(). Instead, users retrieving collections from untrusted sources and copying them into an immutable collection need to do a second copy themselves with ImmutableList.copyOf(collection.toArray(new T[0])). (Using toArray() would lose the element type, copying into a List would require yet another copy, and using copyOf(copyOf()) of course does not work because the second call does not copy again.)

relevance: 3

migrator commented 9 years ago

summary: Not Defined (No comment was entered for this change.)

status Not Defined creator: kak@google.com created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

One thing that could be done is

Object[] array = new Object[elements.size()];
checkState(array == elements.toArray(array));
return construct(array);

instead of

return construct(elements.toArray());

I'm not really sure it's worth it just to handle invalid Collection implementations though.

status Not Defined creator: tavianator@gmail.com created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

Calling toArray(array) will help with unintentional bugs. It's worth consideration, though I wonder if it has performance disadvantages for some collections.

The problem of malicious implementations is one that we've intentionally decided not to tackle. We do need to fix our docs.

status Not Defined creator: cpov...@google.com created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

I believe the sense in which the wiki page meant things was that it was safe to pass ImmutableCollections to untrusted libraries, though there obviously isn't anything we can do about reflection.

status Not Defined creator: lowas...@google.com created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

Passing an own array would not help against malicious implementations as they could grab the array anyway. Concerning contracts, it's even worse as I can't see anything like "no references to it are maintained by this collection" in the javadoc of Collection.toArray(T[]), though it's somehow obvious.

status Not Defined creator: Maaarti...@gmail.com created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

Maaartinus, what do you mean? The Collection.toArray Javadoc states:

The returned array will be "safe" in that no references to it are maintained by this collection. (In other words, this method must allocate a new array even if this collection is backed by an array). The caller is thus free to modify the returned array.

status Not Defined creator: wasserman.louis created at: Sep 24, 2014

migrator commented 9 years ago

summary: Not Defined

I mean the proposal from comment #2 to use the other toArray method which gives no such guarantee. Doing something like

T[] toArray(T[] a) { this.gotcha = super.toArray(); return this.gotcha; } would violate common sense, but no explicitly stated contract. status Not Defined creator: Maaarti...@gmail.com created at: Sep 24, 2014