Closed belyaev-mikhail closed 3 years ago
Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.
Here we can follow stdlib definition of equals
for maps.
Can we add more equals
tests, especially for maps/sets of different types?
hashCode
tests also needed.
Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.
Here we can follow stdlib definition of
equals
for maps.
Well, we follow it by the spirit in all variants, but the question is whether it makes sense, for example, to optimize specifically for comparison between, say, ordered hashmap builder and unordered persistent hashmap. The chances this exact combination is ever needed by anyone are minimal, but it can be optimized in theory 'cos they follow the same tree structure.
Seems like hashSet to hashSet and hashSet to orderedSet comparisions are not optimized. Maybe we should create an issue to track that, though hashSet and orderedSet are not based on the same tree structure.
Seems like hashSet to hashSet and hashSet to orderedSet comparisions are not optimized. Maybe we should create an issue to track that, though hashSet and orderedSet are not based on the same tree structure.
This specific case can be optimized as well in theory because the principal structure of the hash-based tree is still the same, but it needs a completely separate implementation specific for this particular case only, and a higher-order function like equalsWith
would not help because of different underlying node classes. Not sure if it's worth it.
Well, we follow it by the spirit in all variants, but the question is whether it makes sense
Some combinations are really rare. We can easily omit the optimization for them if the implementation will be simplified cosiderably. If I got it right, the only simplification will be removing type checks in when
block, which, I think, is not a big deal.
hashSet to orderedSet comparison probably is not worth the effort for now, but we can record the issue as a known one.
It turns out hashSet to hashSet comparison is already optimized in AbstractSet.equals
, which uses containsAll
.
It turns out hashSet to hashSet comparison is already optimized in
AbstractSet.equals
, which usescontainsAll
Yes, I had that in mind when optimizing containsAll
which makes it the most important of the bulk operations in #90. Unfortunately, maps are not that lucky.
Rebased onto master
Now that the equals
function is overridden in those classes, IDEA inspections warn that hashCode
is not overridden.
Maybe we can @Suppress("EqualsOrHashCode")
the warning with comments? smth like "The equals
function is overriden for optimization purposes and its behavior is not changed, so the inherided hashCode
implementation is still valid."
Maybe we can
@Suppress("EqualsOrHashCode")
the warning with comments? smth like "The equals function is overriden for optimization purposes and its behavior is not changed, so the inherided hashCode implementation is still valid."
I see two alternatives here: either suppress and write a comment, or just add a hashCode
implementation that calls super<AbstractMap>.hashCode
and add a comment right there.
As a personal preference I like the second approach better because in the first one the suppression and the comment will be at the top of the class, while in the second option it will be more local, but I'm not sure.
Other than the mentioned minor issues, the pull request LGTM.
Merged manually. Thank you!
Same as with other bulk operations, equals() can be streamlined to exploit HAMT/CHAMP internal structure for all set/map implementations. Disclaimer: this optimization only works for two sets/maps of equal size using the same implementation and works best when equals will return false, which is already kind of a stretch, but we have cases for this exact setup.
Benchmarks show good results even for equal sets/maps: https://docs.google.com/spreadsheets/d/1Y5N_-4S7yvEADMVtn82yP1bIP2wQ3mxXV3X1EL16Oes/edit?usp=sharing
Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.