Open lrhn opened 1 year ago
Hi @lrhn,
I'm curious about your discovery that the map.keys.contains
method is much less efficient than the containsKey
method. I understand that the map.keys
method returns an Iterable
, and that the containsKey method takes advantage of the fact that the Map
knows how to quickly lookup keys.
My question is, would it be a good idea to change the implementation of the Iterable
contains method to forward to the containsKey
method? Because in any case, it is just an Iterable
, and according to what I think, the implementation for the contains method will be changed for this case if somehow overridden, would that be a good option?
I'm curious to know your thoughts on this. Thanks!
Yes, making thekeys
iterable's contains
forward to the map's containsKey
is the most likely fix here.
I understand, but wouldn't that hinder the functionality of the Iterable itself?
Again just curious
I mean as a user, when I am accessing the iterable keys, I am for sure aware of it being an iterable and not map and consider using containsKey instead if required?
A better iterator for entries
would definitely be beneficial. This should show up in our benchmarks.
I would like us to be heading towards a place where something like for (final (k, v) in map.pairs)
has no per-entry allocations.
An intermediate step might be for for (final MapEntry(:key, :value) in map.entries)
to allocate a MapEntry
in get:current
, so that could be inlined into the destructuring. That would require playing loose with identity of MapEntry
objects and making out-of-iteration calls to get:current
throw instead of returning null
in unsound mode.
map.keys.first
can be slow in some rare situations for the reasons set.first
can be slow (https://github.com/dart-lang/sdk/issues/31470), but there is no reason map.
{keys
,values
,entries
}.
{first
,last
} should be worse than set.
{first
,last
}.
Just be advised that it will be harder to fix #31470 if the tombstone scanning loop is cloned by hand into specialized iterators and first
/last
methods.
I think remove
now does cause compaction at 50% tombstones, but you can probably still keep first
being O(n/2). We could optimize and cache the first possible non-tombstone, a "usedStart" like we have "usedSize" to remember the last place we need to look in the backing array. Whenever we iterate, and the "usedStart" is a tombstone, we remember the new first element.
Heck, we can make the data array be used as a cyclic buffer, so if you keep doing set.add(set.removeFirst())
, you can stay in the same array, without needing to grow it.
@lrhn are you still considering fixing it? Does it need to be 'area-vm' issue or 'area-library' and is P2 the correct priority for it.
Let's assume I won't be the one fixing it, since I haven't done so yet
Not only is the keys.contains
inefficient, it is incorrect.
It must use Map.containsKey
, which can use a different equality (e.g, identity and custom maps).
The
Map
protocol specifies thatmap.keys.contains
should be as efficient asmap.containsKey
.The current VM implementation performs a linear lookup on
keys.contains
, using the defaultIterable
implementation ofcontains
.See: https://dartpad.dev/?id=180ce0b005f5345f9285147c222e5032 for a benchmark comparing
Map.containsKey
toMap.keys.contains
. Run in DartPad, dart2js shows results like:(with a variance bigger than the difference, so pretty much the same performance.)
The VM shows results like:
Just a factor 6000 off. For a ~10000 element linear lookup, that's in the expected ballpark.
This is fairly easy to fix, it just needs to forward to
containsKey
. I'll likely do it myself.I'll check if there are other missing efficient implementations.
last
can be wastly more efficient, but it isn't promised to be efficient to begin with. Still, some users may assume that it is efficient.entries
iterable uses the default implementation, which iterates keys, then does lookup on the key to get the value. That can be more efficient when the key/value is known to be adjacent in the data list.