scala / bug

Scala 2 bug reports only. Please, no questions — proper bug reports only.
https://scala-lang.org
232 stars 21 forks source link

SeqMap.keys != VectorMap.keys (with identical keys) for sizes 0 to 4 #12991

Open jackkoenig opened 7 months ago

jackkoenig commented 7 months ago

Reproduction steps

Shows up in Scala 2.13.13 and Scala 3.3.3 (which makes sense as this is the behavior of the standard library)

import scala.collection.immutable._

val pairs = List.tabulate(6)(i => ('a' + i).toString -> i)
for (elts <- pairs.tails.toList.reverse) {
  val vm = elts.to(VectorMap)
  val sm = elts.to(SeqMap)
  println(s"For size = ${elts.size}:  "+ (vm.keys == sm.keys))
}

This prints:

For size = 0:  false
For size = 1:  false
For size = 2:  false
For size = 3:  false
For size = 4:  false
For size = 5:  true
For size = 6:  true

(Also in Scastie form: https://scastie.scala-lang.org/5eMlHcERQsSNQ4mXent1XQ)

Problem

I would expect the .keys to be equal for SeqMap and VectorMap. Note that we do get the expected behavior for sizes > 4, presumably due to specialization of SeqMap0-4.

Ichoran commented 7 months ago

Unfortunately, this is permissible behavior by design because keys is not a Seq for SeqMap. It's just an Iterable. Iterables of different types don't have to be compatible with each other--the empty set is not equal to an empty vector, for instance. The proper fix would be to have .keys be typed as a Seq.

However, this can still be fixed anyway. It turns out that SeqMap returns Sets for .keys for the specialized small sizes. I think those could be switched to Seqs with nobody being the wiser for it.

SethTisue commented 7 months ago

@scala/collections

som-snytt commented 7 months ago

The Scaladoc was clarified in https://github.com/scala/scala/pull/10544/files

The language of the doc is designed to dissuade from assumptions about keys or keySet.

Edit: it looks like keySet compares equal in this example.

jackkoenig commented 7 months ago

Edit: it looks like keySet compares equal in this example.

If SeqMap were to follow this deprecatedOverriding suggestion on keys ("This method should be an alias for keySet")* then the behavior would be as I would expect since as you say, keySet compares equal.

Regardless, this is permissible behavior. What is the right way to say "do these two SeqMaps have the same keys in the same order?" seqMap1.keysIterator == seqMap2.keysIterator?

*https://github.com/scala/scala/blob/a5270194b6eec9678a79e2f88a83d83aa6b92de1/src/library/scala/collection/Map.scala#L219

som-snytt commented 7 months ago

@jackkoenig I see SeqMap doc says order is not relevant for equals.

For keys,

m1.keysIterator.sameElements(m2.keysIterator)

I submitted a PR to tweak keys to return a Seq just for SeqMap. Other than that, the API remains ambiguous (as to efficiency or what it returns). The keySets of sorted things are sorted, so presumably keySet.toSeq is also fine.

SethTisue commented 3 weeks ago

https://github.com/scala/scala/pull/10766 ended inconclusively. I guess we'll see if anyone else cares enough to pick it up

som-snytt commented 2 days ago

The previous comment to use sameElements is enshrined in the doc contributed in the PR.