Kotlin / kotlinx.collections.immutable

Immutable persistent collections for Kotlin
Apache License 2.0
1.12k stars 56 forks source link

Optimized equals() for all maps and sets #104

Closed belyaev-mikhail closed 3 years ago

belyaev-mikhail commented 3 years ago

Same as with other bulk operations, equals() can be streamlined to exploit HAMT/CHAMP internal structure for all set/map implementations. Disclaimer: this optimization only works for two sets/maps of equal size using the same implementation and works best when equals will return false, which is already kind of a stretch, but we have cases for this exact setup.

Benchmarks show good results even for equal sets/maps: https://docs.google.com/spreadsheets/d/1Y5N_-4S7yvEADMVtn82yP1bIP2wQ3mxXV3X1EL16Oes/edit?usp=sharing

Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.

qurbonzoda commented 3 years ago

Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.

Here we can follow stdlib definition of equals for maps.

qurbonzoda commented 3 years ago

Can we add more equals tests, especially for maps/sets of different types?

qurbonzoda commented 3 years ago

hashCode tests also needed.

belyaev-mikhail commented 3 years ago

Another questionable thing in this PR is support for comparing different implementations of maps (say, comparing an ordered map builder with an immutable hash map), which is mostly added because it's possible and not because it's realistic.

Here we can follow stdlib definition of equals for maps.

Well, we follow it by the spirit in all variants, but the question is whether it makes sense, for example, to optimize specifically for comparison between, say, ordered hashmap builder and unordered persistent hashmap. The chances this exact combination is ever needed by anyone are minimal, but it can be optimized in theory 'cos they follow the same tree structure.

qurbonzoda commented 3 years ago

Seems like hashSet to hashSet and hashSet to orderedSet comparisions are not optimized. Maybe we should create an issue to track that, though hashSet and orderedSet are not based on the same tree structure.

belyaev-mikhail commented 3 years ago

Seems like hashSet to hashSet and hashSet to orderedSet comparisions are not optimized. Maybe we should create an issue to track that, though hashSet and orderedSet are not based on the same tree structure.

This specific case can be optimized as well in theory because the principal structure of the hash-based tree is still the same, but it needs a completely separate implementation specific for this particular case only, and a higher-order function like equalsWith would not help because of different underlying node classes. Not sure if it's worth it.

qurbonzoda commented 3 years ago

Well, we follow it by the spirit in all variants, but the question is whether it makes sense

Some combinations are really rare. We can easily omit the optimization for them if the implementation will be simplified cosiderably. If I got it right, the only simplification will be removing type checks in when block, which, I think, is not a big deal.

qurbonzoda commented 3 years ago

hashSet to orderedSet comparison probably is not worth the effort for now, but we can record the issue as a known one.

It turns out hashSet to hashSet comparison is already optimized in AbstractSet.equals, which uses containsAll.

belyaev-mikhail commented 3 years ago

It turns out hashSet to hashSet comparison is already optimized in AbstractSet.equals, which uses containsAll

Yes, I had that in mind when optimizing containsAll which makes it the most important of the bulk operations in #90. Unfortunately, maps are not that lucky.

belyaev-mikhail commented 3 years ago

Rebased onto master

qurbonzoda commented 3 years ago

Now that the equals function is overridden in those classes, IDEA inspections warn that hashCode is not overridden. Maybe we can @Suppress("EqualsOrHashCode") the warning with comments? smth like "The equals function is overriden for optimization purposes and its behavior is not changed, so the inherided hashCode implementation is still valid."

belyaev-mikhail commented 3 years ago

Maybe we can @Suppress("EqualsOrHashCode") the warning with comments? smth like "The equals function is overriden for optimization purposes and its behavior is not changed, so the inherided hashCode implementation is still valid."

I see two alternatives here: either suppress and write a comment, or just add a hashCode implementation that calls super<AbstractMap>.hashCode and add a comment right there. As a personal preference I like the second approach better because in the first one the suppression and the comment will be at the top of the class, while in the second option it will be more local, but I'm not sure.

qurbonzoda commented 3 years ago

Other than the mentioned minor issues, the pull request LGTM.

qurbonzoda commented 3 years ago

Merged manually. Thank you!