Open noresttherein opened 1 year ago
@scala/collections
That is actually consistent with the precise or precious Scaladoc.
the first index >= from such that the elements of this sequence starting at this index match the elements of sequence that, or -1 if no such subsequence exists.
The PR trivially adjusts the malicious from
. It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind. In lieu of that, I augmented the unit test to exercise both paths (sizes known and unknown).
The PR trivially adjusts the malicious
from
. It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind. In lieu of that, I augmented the unit test to exercise both paths (sizes known and unknown).
Will it work for Seq(1).indexOfSlice(Nil, 1)
, which currently returns -1
(but, correctly, 1
for Vector
)?
@noresttherein see the PR comment, I went the other way, but I'm not religious about it.
It would be nice if we had some sort of tool to verify that the collections satisfy a set of "laws" of some kind
we do, scala-collection-laws: https://github.com/scala/scala-collection-laws
though the fact that they aren't in-repo leads to neglect
Thanks, Seth, for not calling out my sarcasm. I have a fork and will try again to make headway with it. The sarcasm was really my choice between sarcasm and self-deprecation, much like choosing whether to allow indexOfSlice past the last index, there is a kind of arbitrary equivalence that merely shifts the burden of awareness.
I think it is still wrong:
Seq(1).indexOfSlice(Nil, 1)
should probably return 1
. It does for List
.
if (from0 > l) -1 //non strict
else if (tl < 1) from0
List is the default Seq. Did you mean Vector
currently returns 1
? My comment was that I changed that to work the same as List
, but I'm open to whatever the community consensus is. I spent a few minutes with collections-laws
, which I now refer to as claws
; whatever is decided should be encoded/enshrined there.
scala> Seq(1).indexOfSlice(Nil, 1)
val res0: Int = -1
scala> List(1).indexOfSlice(Nil, 1)
val res1: Int = -1
Edit: an argument in favor of returning the last index plus one is that it is consistent with patch
, which is permissive. Also lastIndexOfSlice
and startsWith
.
I thought indexOfSlice
should return legal indices (like indexOf
), something I can xs(i)
, but that's only true for existing elements. So probably I'll flip it to your way of thinking and see how it flies.
Welcome to Scala 2.13.10 (OpenJDK 64-Bit Server VM, Java 19).
Type in expressions for evaluation. Or try :help.
scala> val xs = (1 to 10).toList
val xs: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> xs.indexOfSlice(Nil, 10)
val res0: Int = -1
scala> xs.patch(10, List(42), 100)
val res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 42)
scala> xs.indices
val res2: scala.collection.immutable.Range = Range 0 until 10
scala> xs.lastIndexOfSlice(Nil, 10)
val res3: Int = 10
scala> xs.startsWith(Nil, 10)
val res4: Boolean = true
I expect xs.indexOfSlice(Nil, i)
to return i
for 0 <= i <= xs.size
. In particular, definitely i
and not -1
when i == xs.size
.
More generally, I guess a good "law" would be that xs.indexOfSlice(ys, from)
returns some idx >= 0
iff idx
is the smallest value such that idx >= from
and xs.slice(idx, idx + ys.size) == ys
(and -1
if no such idx
exists).
I find it confusing for an indexOfSomething
method to return an invalid positive index, but there can be arguments either way. Ultimately it's a matter of agreeing on a specification. Related to https://github.com/scala/bug/issues/12795#issuecomment-1567878466.
But for slice
, xs.size
is not an invalid index. xs.slice(xs.size, xs.size)
is perfectly valid. (and making it invalid for the sake of it is actively harmful in practice).
Another way to look at it is that indices never point to elements. Only ranges of indices point to elements. When we say apply(i)
, we are truly asking for the one element in the range [i, i+1)
. And that is invalid because i+1
is invalid when i == xs.size
. But the range [i, i)
is a valid empty range as long as 0 <= i <= xs.size
.
One could argue it's one thing for an operation to accept indices outside the defined range, if it can produce a valid result, like xs.slice(xs.size, xs.size)
(or xs.slice(xs.size + N, xs.size +N)
for that matter - is there anything special about i == xs.size
?), but it's another thing for an operation to return an index that's outside the defined range.
But I see your point, Nil.indexOfSlice(Nil)
is an example that can probably convince me, we'd have to make that reuturn -1
.
It's common practice in other languages to accept ranges that stop right at the length of the collection, but not go further. Java and C++ have very consistent APIs like that, for example.
I am persuaded by sjrd's argument about open ranges.
Because I am weak-minded, I am also persuaded by lrytz's question about what makes size + 1
different from size
.
My answer to that is:
scala> val xs = 1.to(10).toList
val xs: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> xs.indexOfSlice(Nil, 10)
val res0: Int = 10
scala> xs.indexOfSlice(Nil, 11)
val res1: Int = -1
scala> xs.lastIndexOfSlice(Nil)
val res2: Int = 10
scala> xs.slice(11, 11)
val res3: List[Int] = List()
The permissive inputs at res3 do not imply that res1 should be 11
, because res1 must be consistent with res2.
Another permissive input, xs.patch(11, List(42), 100)
, is likewise not relevant.
As a footnote, there is not a method for xs.slice(5, 6).headOption
but there is xs.lift(5)
.
I think that it is one thing is to be lax about the arguments you accept, and quite another to be lax about the values you return. At least it seemed to me like something universally accepted as the best practice (doesn't mean it's 'right', just that it seems most people would agree). On the other hand, size
as indexOfSlice(Nil)
feels ok, because it still 'borders the element': in Scala, and all C-derived languages, the 'end' or 'until' index is always exclusive: no matter how restrictive we would want to be, we have to allow it as an end of a range. And so, [size,size)
is a valid subrange of [0, size)
; [size + 1, size +1)
isn't. Anyway, indices, pointers and iterators don't really 'point' to elements, they point between them - at least that way of thinking makes sense when you do pointer arithmetic (including iterator.drop
).
The problem is, Scala is very inconsistent with it. Especially mutable vs immutable collections: all mutable methods (at least as I recollect) are not permissive, for example Buffer.remove
(and you could expect that the same logic applies to it as with slice
/drop
). Array.copyOf(a, -1)
throws a NegativeArraySizeException
. copyToArray
cannot even decide one way or the other. On the other hand, if patch
is permissive, why not updated
? Index falls out of range, so you ignore it; a valid result can be produced. It is not, because probably it felt like a bad idea (and I agree). Now, as I stated before, I am not super confident that permissive indices are a good idea in the first place (most languages I had contact with were strict, especially statically typed), or that the way they are implemented cough patch
cough is the best, but it's not a surprising decision. However, I do not report these issues simply because I disagree, but because I was truly surprised, despite knowing Scala quite well.
And I haven't even brought up yet how disturbing to me is the complete disregard towards consistency in naming of the arguments ;)
@noresttherein I think the PR follows what you say.
I can't find my comment, but somewhere I distinguished the mutating API as requiring an "existing element", i.e., a good index. That explains why a mutator treats indices differently (from sane immutable API, of course).
I also conjecture that patch
should be called patched
and patchInPlace
should be patch
, but the other question is whether they may behave differently (with respect to inputs) based on that distinction.
"I changed patch to patchInPlace and now it throws."
Reproduction steps
Scala version: 2.13.10
Expected: 0, same as in
Vector
.What's worse,
Vector(1).indexOfSlice(List(1), -2) == -2
, too (substitute anySeq
forVector
).