scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
232 stars 21 forks source link

Various mutable collections do not handle calling addAll, subtractAll, insertAll with themselves as an arg well #12121

Open NthPortal opened 4 years ago

NthPortal commented 4 years ago

reproduction steps

using Scala 2.13

Welcome to Scala 2.13.3 (OpenJDK 64-Bit Server VM, Java 1.8.0_265).
Type in expressions for evaluation. Or try :help.

scala> val b = scala.collection.mutable.ListBuffer(1, 2, 3, 4)
val b: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2, 3, 4)

scala> b ++= b
// hangs

problem

Various mutable collections use iterators over themselves or foreach to append to themselves. With iterators, this is an unsafe use of iterators and tends to produce iterators that never exhaust (because they include the new elements appended). With foreach, the call tends to never end because the collection keeps getting larger.

I have not determined the full scope of the problem; only that at least ListBuffer is included, and that the default implementation in Growable is also included.

Ichoran commented 4 years ago

(Iterator-specificity was addressed, so I removed that part.)

Though it's not obvious at first thought, b ++= b hanging, or throwing OOM, or throwing any exception at all, should be considered incorrect behavior. The reason is that you would expect b ++= b to have identical behavior to b = b ++ b given that for immutable collections the former is just sugar for the latter.

So we probably ought to check for self in every mutable collection in cases where we can't just change the algorithm to do it right. Maybe I can add a test to collections-laws once the collections are fixed. Hangs are annoying to handle in test frameworks, though. I guess I put it into a thread and kill it if it takes too long.

Anything done with indexes, for instance, can just cache the length of the target. (This makes it more vulnerable to error with concurrent modification, but you're already playing with fire there, so I think it's worth the tradeoff.)

NthPortal commented 4 years ago

I don't understand why the problem is iterator-specific.

you're right. I updated the description

sjrd commented 4 years ago

Checking for self is not a solution, because you can always hide that by passing a proxy that forwards all methods to self:

val b = ...
val c = new BufferProxy(b)
b ++= c

The latter should work if b ++= b should work.

Ichoran commented 4 years ago

@sjrd - Yes, I thought of that, but I don't see a bincompat way to get it to work. Safe-by-design approaches like caching the length are best. Otherwise you'd need a new method, e.g. isProxyOf that you'd call c.isProxyOf(b). But that's not binary compatible, and it's also inflexible. (What if you have val c = new BufferListProxy(b0, b1, b2, b3)? What about TakeRight(b, 5)?)

At some point it's up to the user not to muck stuff up.

I guess the question is whether we just define b ++= b as mucking stuff up. Write that, and you're on your own.

NthPortal commented 4 years ago

The latter should work if b ++= b should work.

I agree; however, I don't see a good way to make it work, unfortunately. Personally, I think we should still at least try to get b ++= b to work even if we can't get proxy cases to work

NthPortal commented 4 years ago

similar methods with potentially the same problem: insertAll, prependAll, appendAll (might be an alias to addAll)

NthPortal commented 4 years ago

I'm sorry. titles are hard. if github could cut out the middle ones that didn't last more than like 2 minutes, that would be nice

Edit: and I still left out prependAll from the title

NthPortal commented 4 years ago

@sjrd one possibility for adding elements to get even proxies to work would be to convert to the collection passed in to the internal representation before modifying this collection. for something like ListBuffer, this is easy, as you can link any parts of two ListBuffers together. for ArrayBuffer it's less easy, as it may involve creating additional Arrays. I think it's worth evaluating the performance cost.

it's less obvious what should be done for subtractAll. creating an intermediate collection seems unnecessarily expensive for most cases

NthPortal commented 4 years ago

also patchInPlace