Closed donraab closed 2 months ago
@mohrezaei @motlin I'm going to make a bunch of changes and submit this as allSatisfyKeyValue
and get rid of the short circuit method. I will rerun the benchmarks for the previous version and the map equals variety and post the results here so it is clearer that this change is worthwhile.
I am holding off on any and none since I loathe having large PRs without specific needs, other than symmetry. These can be added later if someone wants them.
@mohrezaei @motlin These are the benchmarks for comparison.
Original Code
Benchmark Mode Cnt Score Error Units
BagEqualsTest.bagEquals thrpt 20 19770.401 ± 139.757 ops/ms
BagEqualsTest.mapEquals thrpt 20 23120.905 ± 246.497 ops/ms
BagEqualsTest.multisetEquals thrpt 20 12867.562 ± 1087.486 ops/ms
Using AbstractHashBag
instanceof
and direct items map equals
Benchmark Mode Cnt Score Error Units
BagEqualsTest.bagEquals thrpt 20 19328.068 ± 158.372 ops/ms
BagEqualsTest.mapEquals thrpt 20 22618.070 ± 362.793 ops/ms
BagEqualsTest.multisetEquals thrpt 20 13990.621 ± 29.081 ops/ms
Using allSatisfyKeyValue
Benchmark Mode Cnt Score Error Units
BagEqualsTest.bagEquals thrpt 20 24719.542 ± 17.943 ops/ms
BagEqualsTest.mapEquals thrpt 20 23109.328 ± 235.381 ops/ms
BagEqualsTest.multisetEquals thrpt 20 14542.701 ± 66.522 ops/ms
I'll be submitting an update to the PR in the next hour with the more simplified changes.
Thank you for all the feedback and comments so far!
@mohrezaei @motlin If these changes look good and you both approve, I will test out the same changes to primitive Bags and primitive/primitive Maps. The performance there looks worse than Object Bags, but hard to tell if the benefit will be comparable or better until I try it.
Currently on IntBag:
Benchmark Mode Cnt Score Error Units
IntBagEqualsTest.bagEquals thrpt 20 16872.887 ± 46.903 ops/ms
IntBagEqualsTest.mapEquals thrpt 20 15838.773 ± 103.729 ops/ms
The Map performance is probably degraded due to boxing/unboxing effects.
Thanks @mohrezaei ! @motlin I am going to merge this since Moh has approved, and if you want to propose any changes, I can make them in the next PR for the primitive/primitive map allSatisfyKeyValue addition.
I finished the testing/benchmarking the change for primitive Bags and primitive/primitive Maps and the results are surprising.
Before changes:
Benchmark Mode Cnt Score Error Units
IntBagEqualsTest.bagEquals thrpt 20 16872.887 ± 46.903 ops/ms
IntBagEqualsTest.mapEquals thrpt 20 15838.773 ± 103.729 ops/ms
After allSatisfyKeyValue:
Benchmark Mode Cnt Score Error Units
IntBagEqualsTest.bagEquals thrpt 20 51903.988 ± 237.046 ops/ms
IntBagEqualsTest.mapEquals thrpt 20 15964.766 ± 172.302 ops/ms
I wrote a JMH Benchmark to compare the performance of the equals methods in HashBag, HashMultiset (Guava) and HashMap (Java). I was surprised to find that HashBag equals was slower than HashMap. I added a new method named shortCircuitKeyValue to Object/Primitive Maps and based the equals method in AbstractHashBag on using this method as a manual way of implementing allSatisfyKeyValue. I had originally added any/all/noneSatisfyKeyValue as well, but backed out those additions as a potential later change.
The following are the results of the JMH tests before and after changes.
Original Code
Using
AbstractHashBag
instanceof
and direct items mapequals
Using
allSatisfyKeyValue
Additionally, I refactored the value based any/all/noneSatisfy methods on the Object/Primitive Maps to use shortCircuitKeyValue. I tried using Object/Primitive Map equals to implement the HashBag equals, but found that was even slower. I updated the equals method slightly to remove some extra index lookups and multiple conversions. This sped the code up by around 10% when I measured it, but it was still slower than the shortCircuitKeyValue solution.
Update: I rolled back the
shortCircuitKeyValue
implementation and renamed toallSatisfyKeyValue
.