Open lrhn opened 1 year ago
Thanks Lasse!
The use case I had in mind when I wrote this was for built_value
models. Although built_value
classes have the option to cache hashCode, they don't by default. So I thought, if there is some model e.g.
abstract class Model {
BuiltList<Account> get accounts;
}
abstract class Account {
BuiltList<Card> cards;
}
then I thought it might be a win if there is some caching of the list hash codes. It's not that the lists are expected to be large--rather that they may contain complex objects that don't cache their hash codes.
But, this was just a guess, I didn't (and still don't) have data to back it up.
I guess a primary use case for built_collection
is in UIs, where change detection between very similar data models is probably important. But then it's actually "quickly return true" that's important, not "quickly return false". So I don't think it's particularly convincing.
I thought about calculating a more robust hash to allow quickly returning "almost certainly true" but did not actually explore that option :) in the end we would need real data to know if it's worth it.
Did you have any particular app/codebase in mind for this PR? If you have any real data that can be quite compelling against my lack of data ;)
Thanks.
No specific use, I just ended up in the method when doing a code review and wondered about the design.
It seemed like overkill to check hash codes first, and then do equals afterwards anyway, precisely because the true
path is usually the longest already.
Having cached hash codes makes it an efficient quick-check, but if the hash code is not cached, computing it is likely as expensive as doing the equals check. That can still be valuable if it's expected that the hash code will be used again anyway, but for something like adding to a hash set or map, the hash code will be computed before the ==
check, so checking whether there is a cached hash seemed like the best compromise.
The implementation of
BuiltList.operator==
usedthis.hashCode != other.hashCode
as a short-cut in operator==
. However, computinghashCode
can be as expensive as doing equality, since it has to visit all the same objects (otherwise the hash code won't be consistent with equality) and do a computation on each.This change makes the
hashCode
only be used if it's already computed by boththis
andother
, in which case it should be free.It's a change in performance-behavior. A single
==
won't do double work, but repeated==
s on the same objects might have been more efficient if the hash code was computed eagerly. For elements in a set or map, the hash code will be computed anyway, so it should not affect those. Having a list ofBuiltList
s and doing repeatedindexOf
with the same values on the list might suffer.I believe the trade-off is worth it, because most lists are not compared for equality, and those that are, are either one-off comparisons, or used as map keys or set elements. (But be free to disagree.)