Closed fanf closed 9 months ago
@fanf it looks that DefaultKeySet
was implemented like a view (not a separated collection of keys)...
I think it makes sense that keySet
references the source map because it is usually the fastest and most straight-forward implementation.. But you wouldn't expect so from documentation such as
Collects all keys of this map in an iterable collection.
Yes, it would be very appreciated to make obvious the fact that it's a view and not a new object, AND what is the best (most efficient) method to get a fresh, independant Set
.
For reference, the naive Set()++map.keySet()
we tested first does not work, most likelly because ++
is smart and see that one of the set is empty and just returns the second one.
(I knew about values
which is never what you want it to be, keySet
is a new one :)
best (most efficient) method to get a fresh, independant Set.
.keysIterator.toSet
?
@SethTisue most likely from my user point of view, but I don't know if there's better way :)
Both keySet.toSeq.toSet
and keysIterator.toSet
methods for getting a separated set are vulnerable for hash collision based DoS attacks even if the original map (like scala.collection.mutable.CollisionProofHashMap
or java.collection.LinkedHashMap
) is safe: https://github.com/scala/bug/issues/11203
@scala/collections is 2.13 also affected? (if it's 2.12-only we should close the ticket)
2.13 is similarly a view onto the enclosing Map
:
/** A generic trait that is reused by keyset implementations */
protected trait GenKeySet { this: Set[K] =>
def iterator: Iterator[K] = MapOps.this.keysIterator
def contains(key: K): Boolean = MapOps.this.contains(key)
override def size: Int = MapOps.this.size
override def knownSize: Int = MapOps.this.knownSize
override def isEmpty: Boolean = MapOps.this.isEmpty
}
The 2.13 nomenclature dictates that keySet
should be a view and keySetted
should be a copy. Good joke alert!
Agree that a tweak to the Scaladoc is deemed advisable.
Today's complaint in chat was that keys
is an Iterable
of no known properties.
Generally, keysIterator
uses the map's iterator, keySet
is a special set that uses keysIterator
, and keys
is just your keySet
.
ListMap
has a test expecting keys.map(f)
to preserve order (insertion order in this case). However, I would expect to require keys.iterator.map(f)
. (It looks like sorted maps try to preserve the sort order of their keys.)
heh, I hit this problem this week. The side effect was that I thought I'd discovered a new way to add 4 bit numbers together that used a lot less diodes and transistors than the usual encoding. Suffice to say, my excitement has been suitably squashed upon discovery that I was accidentally deduplicating before applying my cost function... le sigh.
To quote Jesus from "The Chosen" tv show, "In this world, bones will break, and hearts will break." He goes on to say that in Scala 4, matters are much improved.
Hello, we discovered with prod pain and crashes that
Map.keySet
leaks a reference to the Map object so that it is not GC'ed and leaks memory - huge memory if objects in the map are big and you wanted to only return light identifiers to them.It can be seen for ex here: https://gist.github.com/fanf/ff39065f7f9738467e40929ad7490da1
Is this known/wanted and I'm just 13 years late in that knowledge, or is it a bug (at least a bad documentation)?
The problem is likely the references to self in
DefaultKeySet
(contains
,size
...) https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/MapLike.scala#L176