scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
230 stars 21 forks source link

Nil.startsWith(Nil, -1) && Nil.startsWith(Nil, 2) #12781

Open noresttherein opened 1 year ago

noresttherein commented 1 year ago

Reproduction steps

Scala version: 2.13.10

Problem

startsWith should always return false for from < 0 or from > this.length.

som-snytt commented 1 year ago

I just left a comment on the other ticket that accepts that indexes out of range are part of the permissive collections culture, except of course for xs(i) where there must be an existing element.

I don't have a considered opinion yet, but xs.drop(from).startsWith(ys) might be the intuition (or implementation).

I might have a pre-opinion that if the from index is validated, then it should throw on out of bounds. Otherwise, startsWith(Nil) represents the empty element and every sequence starts with it trivially.

noresttherein commented 1 year ago

Of course, I agree that an empty sequence is a trivial subsequence of any sequence, but still within the span of the sequence, not outside it. I understand your reasoning, but if we agree that indexOfSlice should not return an index outside [0..length] - and I thought we did - then we have

Seq(1, 2).indexOfSlice(Nil, 3) >= 0 //false
Seq(1, 2).startsWith(Nil, 3) //true 

I think this inconsistency is much more jarring than the one with drop(from).startsWith(ys).

I am aware of the policy with the slicing methods, but there it is relatively intuitive - 'take as much as possible, but not more than this amount'. They never throw an exception. Here we are asking about a specific index, and it is hard to imagine a subsequence of a sequence lying outside that sequence. What does it for me, and why I think this is a wrong choice at least - if not a bug - is that it works only for an empty pattern sequence, as an exception to the general rule.

On the second thought, the same goes for copyToArray: it will throw an IndexOutOfBoundsException if the starting index is < 0, but only if there are any elements to write. I also found that annoying, because I feel that it either should ignore -x first elements to write if x is negative, and never throw an exception, in line with other permissive methods, or always reject negative offset. If you try to write a specification - that is, unit tests - which covers also illegal arguments, then the exact semantics are unnecessarily complex and inconsistent in my opinion. I don't know if there is any chance of that changing, though.

Which behaviour in these three cases is more intuitive is, to a degree, up for debate, but the fact remains, that the way it is now, it takes longer to explain - in terms of docs or tests - than what I expected, because it makes exceptions to a general rule only if 'stars (i.e., the combination of arguments) align'. If NOTABUG, I would love to see it change at least with the next minor version bump, or next binary incompatible changes to the collection framework.

som-snytt commented 1 year ago

Thanks for the text, I think the PR delivers what you're reasonably asking for. I haven't looked at copyToArray yet.

SethTisue commented 10 months ago

@scala/collections

Ichoran commented 10 months ago

The collections operations are generally robust to out-of-elements conditions. Additionally, when possible, I tend to use the principle that the behavior of collections operation m should match the simplest way to achieve the same thing with simpler collections operations.

Therefore, I think xs.startsWith(ys, n) should be identical to xs.drop(n).startsWith(ys). Any other behavior I would view as a bug. In particular, I don't think n >= 0 && n <= xs.length && xs.drop(n).startsWith(ys) follows the simplest-implementation principle.

The principle of least surprise can override the simplest-implementation principle if there is a strong expectation, but I at least don't have one here. Everything starts with an empty sequence, and I don't have a strong intuition that I am also asking "can I start here and if I can then does it match that?"

If the Scala library were strongly focused around index-based operations, then I'd agree, because xs.startsWith(ys, n) would be shorthand for Try( forall(0)(_ < ys.length)(_ + 1){ i => xs(i+n) == ys(i) } ).getOrElse(false). This is using a forall method that looks like the C++ for statement, which plausibly could exist in Scala.

(Note: in Scala3,

inline def forall[A](zero: A)(inline more: A => Boolean)(inline inc: A => A)(inline test: A => Boolean): Boolean =
  var i = zero
  while more(i) do
    if !test(i) then return false
    i = inc(i)
  true

and we get it. Untested, incidentally.)

But because Scala library operations are mostly not index-based, I think in this case the drop-based intuition is more consistent with the overall ethos.