Open astiob opened 1 year ago
@scala/collections
Of particular interest, Java’s collections of the same kind behave the way I expect (and differently from Scala’s):
scala> val set = new java.util.TreeSet(java.util.Arrays.asList(1, 2, 3, 4, 5, 6))
val set: java.util.TreeSet[Int] = [1, 2, 3, 4, 5, 6]
scala> set.subSet(3, 5).clear()
scala> set
val res1: java.util.TreeSet[Int] = [1, 2, 5, 6]
This is a good example of why it's too bad that Scaladoc on private elements is silently ignored. The doc on `TreeSetProjection` says a bit more than `rangeImpl`.
Showing that mutations are not confined to the range:
scala> val ss = SortedSet("any","zed","maybe")
val ss: scala.collection.mutable.SortedSet[String] = TreeSet(any, maybe, zed)
scala> val r = ss.range("m","p")
val r: scala.collection.mutable.SortedSet[String] = TreeSet(maybe)
scala> r.addOne("beta")
val res24: r.type = TreeSet(maybe)
scala> ss
val res25: scala.collection.mutable.SortedSet[String] = TreeSet(any, beta, maybe, zed)
So the range does not "clip" the operations. On that basis, it's no longer surprising to me that `clear` clears the underlying collection. But it was not immediately obvious, and the public Scaladoc doesn't help. I don't understand the doc on `rangeImpl`, but I'm not interested in looking at more source.
I see the other difference is that in Java you can't add to the range outside its start/end.
scala> sub.add(2)
java.lang.IllegalArgumentException: key out of range
at java.base/java.util.TreeMap$NavigableSubMap.put(TreeMap.java:1795)
at java.base/java.util.TreeSet.add(TreeSet.java:255)
... 30 elided
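For anyone wanting to reproduce the Java behaviour, here is a minimal sketch. It assumes `sub` in the session above was the `subSet(3, 5)` view from the earlier example; the object and method names are made up for illustration.

```scala
import scala.util.Try

object JavaSubSetBounds {
  // Builds the TreeSet from the earlier session, takes the [3, 5) view,
  // and tries to add an element that lies below the view's lower bound.
  def addOutsideRange(): Try[Boolean] = {
    val set = new java.util.TreeSet[Int](java.util.Arrays.asList(1, 2, 3, 4, 5, 6))
    val sub = set.subSet(3, 5)
    Try(sub.add(2)) // captured as a Failure instead of propagating
  }
}
```

Wrapping the call in `Try` just makes the exception easy to inspect; the point is that Java rejects the write rather than forwarding it to the backing set.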
Java loves its APIs that are dangerous by design.
Where did my long comment go?? The one wherein I outed `mapValuesInPlace`, `filterInPlace` and possibly `update` and `remove` plus the symbolic equivalents as also problematic?
Bother, I guess the long comment vanished into the ether. Briefly:
(1) `clear` isn't the only problem. Every mutating operation fails to respect boundaries; in some cases this is obviously completely wrong (e.g. `clear`); in others it's merely weird.
(2) I don't think we should have an API that violates intuitive invariants like "if you put something in a map and there is no error, the map contains it". I also don't think we should be throwing exceptions left and right. So I think in the long run we need to do the same kind of thing we did with old-style Views: promise less, but make it work. For instance, omit any individual-element mutating operations and have the bulk ones only act on the range.
(3) There is no way I can see to fix this without breaking binary compatibility, but since people's code arguably ought to break if they are relying on this to be sane, maybe that's okay.
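To make (2) a bit more concrete, here is a rough sketch of what "promise less, but make it work" could look like. `BoundedRange` is a hypothetical wrapper, not anything in the standard library, and it trades performance for simplicity by snapshotting the range before mutating.

```scala
import scala.collection.mutable

// Hypothetical wrapper for illustration only; not a standard library class.
final class BoundedRange[A](underlying: mutable.SortedSet[A], from: A, until: A) {
  private val ord = underlying.ordering

  // Snapshot of the elements currently inside [from, until).
  def toList: List[A] =
    underlying.iteratorFrom(from).takeWhile(a => ord.lt(a, until)).toList

  // Bulk removal confined to the range; elements outside are untouched.
  def clear(): Unit = {
    underlying --= toList
    ()
  }

  // Bulk filter confined to the range: drops in-range elements failing `keep`.
  def filterInPlace(keep: A => Boolean): Unit = {
    underlying --= toList.filterNot(keep)
    ()
  }

  // Deliberately no single-element add/remove, so there is no
  // "added it without error, but the view doesn't contain it" case.
}
```

Since there is no way to add through the view, the question of what an out-of-range insert should do never comes up.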
Sorry to hear about the long-lost comment. I hope it's not another GitHub trend.
I was using `r("beta") = true`, for fun, but I see that the syntax suggests a lookup before an assignment, so "update out of range" is possibly too weird, as you say.
The recent tickets lodged by noresttherein have similar complaints about how input indexes are handled, especially in mutable vs immutable API. In the flavor of that thinking, mutation requires "existing" or legal indexes, but immutable clips.
I also noticed that `result` returns the collection itself. I guess that's true of `mutable.Set`. But I naively expected `r.result` to give me a set that is no longer backed by the underlying collection.
In order to be free of entanglements with `result` (which might just involve returning oneself), one needs a `ReusableBuilder`. An ordinary `Builder` can do anything it feels like as long as it obeys the type signature, because, as with `Iterator`, once you call `result` you're not supposed to touch the original again.
Range projections seem to be an outlier in the API, or are there other similar operations? `filterKeys` and `mapValues` on maps were deprecated for that reason.
The private `TreeSetProjection` class has some documentation.
Mutations are always reflected in the original set
scala> sortedSet.range(2,4).addOne(77)
val res9: scala.collection.mutable.SortedSet[Int] = TreeSet(2, 3)
scala> sortedSet
val res10: scala.collection.mutable.SortedSet[Int] = TreeSet(1, 2, 3, 4, 5, 6, 77)
So forwarding `clear` seems fine.
But other collections don't implement `range` as a projection. For example, `mutable.BitSet` returns a new independent collection, and no mutations are reflected in the original collection. That inconsistency is more problematic.
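A small side-by-side makes the inconsistency easy to check. This is only a sketch based on the behaviour described in this thread; the object and method names are invented.

```scala
import scala.collection.mutable

object RangeConsistency {
  // Adds 3 through range(2, 5) and reports whether the original set saw it.
  def treeSetSeesWrite(): Boolean = {
    val original = mutable.TreeSet(1, 2, 4, 6)
    original.range(2, 5).addOne(3)
    original.contains(3)
  }

  // Same steps on a BitSet, whose range is described above as an independent copy.
  def bitSetSeesWrite(): Boolean = {
    val original = mutable.BitSet(1, 2, 4, 6)
    original.range(2, 5).addOne(3)
    original.contains(3)
  }
}
```

If the two methods disagree, code written against `mutable.SortedSet` cannot know whether a write through `range` reaches the original.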
Wouldn't returning a `collection.SortedMap`/`collection.SortedSet` instead of `mutable.SortedMap`/`mutable.SortedSet` address the issue? Not binary compatible, and not even source compatible, but that's how I'd imagine it. Currently the results are typed as the self type parameter `C` of `SortedMapOps`/`SortedSetOps`, which is problematic in itself, even outside of the mutability issues: a non-strict collection and a strict collection are conceptually quite different things, and forcing both into one type imposes an unnecessary restriction on implementations. Essentially, public `SortedSet`/`SortedMap` subclasses need to be more marker interfaces than implementations, because they might require considerable differences in implementation. Not a huge deal; the only important thing lost is JIT inlining.
The "deck of cards" example in JLS uses `list.subList(i, j).clear()`, which the Javadoc specifies as idiomatic usage.
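For comparison with the Scala projections, the Java idiom can be tried from Scala directly. A minimal sketch, with an invented object name and sample data:

```scala
object SubListClear {
  // Clears the half-open index range [1, 3) through a subList view.
  def removeMiddle(): java.util.List[String] = {
    val list = new java.util.ArrayList[String](java.util.Arrays.asList("a", "b", "c", "d", "e"))
    list.subList(1, 3).clear() // removes "b" and "c" from the backing list
    list
  }
}
```

Here the view's `clear` is confined to the view's bounds, which is the behaviour the original report expected from `range`.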
The definition of structural modification is so suitably vague, that is, usefully:
perturb it in such a fashion that iterations in progress may yield incorrect results
that we ought to have a `$perturb` macro with this wording. I am greatly perturbed that I'm unable to follow the beginner examples. (The child just began "Intro to Java", as if that weren't already a ruinous outcome.) Anyway, I remembered this ticket to report the perturbation in the force.
Reproduction steps
Scala version: 2.13.11
Problem
`range` (and all its similarly-named siblings) returns a projection that reflects/contains only a bounded slice of the set/map. It is natural to assume that clearing this projection should clear all elements within these bounds and keep any other elements unaffected. A trusting programmer could use this (as I did) in an attempt to efficiently delete a whole (sub)range of values, and only testing would show that it doesn’t work.
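Until the projection behaves that way, one possible workaround is to snapshot the range and subtract it from the original set. This is a sketch rather than an efficient range delete: it removes elements one by one, and `removeRange` is a hypothetical helper name.

```scala
import scala.collection.mutable

object RangeDelete {
  // Removes every element in [from, until) from `set`, leaving the rest alone.
  def removeRange[A](set: mutable.SortedSet[A], from: A, until: A): Unit = {
    // Copy the range out first so we never iterate the projection
    // while mutating the collection that backs it.
    val doomed = set.range(from, until).toList
    set --= doomed
    ()
  }
}
```

Usage would be along the lines of `RangeDelete.removeRange(set, 3, 5)` on a `mutable.TreeSet`.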
Looking at the standard library’s source code, it seems that `RedBlackTree` doesn’t implement a range delete operation at the moment. I hear it is possible to implement range deletion in asymptotically logarithmic time, but I admit I don’t know how complicated the code would be or how big the constant factor would be.