eclipse / eclipse-collections

Eclipse Collections is a collections framework for Java with optimized data structures and a rich, functional and fluent API.
http://www.eclipse.org/collections
2.39k stars 596 forks source link

Add allSatisfyKeyValue method to Object/Primitive Maps to optimize HashBag equals method. #1575

Closed donraab closed 2 months ago

donraab commented 2 months ago

I wrote a JMH Benchmark to compare the performance of the equals methods in HashBag, HashMultiset (Guava) and HashMap (Java). I was surprised to find that HashBag equals was slower than HashMap. I added a new method named shortCircuitKeyValue to Object/Primitive Maps and based the equals method in AbstractHashBag on using this method as a manual way of implementing allSatisfyKeyValue. I had originally added any/all/noneSatisfyKeyValue as well, but backed out those additions as a potential later change.

The following are the results of the JMH tests before and after changes.

Original Code

Benchmark                      Mode  Cnt      Score      Error   Units
BagEqualsTest.bagEquals       thrpt   20  19770.401 ±  139.757  ops/ms
BagEqualsTest.mapEquals       thrpt   20  23120.905 ±  246.497  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  12867.562 ± 1087.486  ops/ms

Using AbstractHashBag instanceof and direct items map equals

Benchmark                      Mode  Cnt      Score     Error   Units
BagEqualsTest.bagEquals       thrpt   20  19328.068 ± 158.372  ops/ms
BagEqualsTest.mapEquals       thrpt   20  22618.070 ± 362.793  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  13990.621 ±  29.081  ops/ms

Using allSatisfyKeyValue

Benchmark                      Mode  Cnt      Score     Error   Units
BagEqualsTest.bagEquals       thrpt   20  24719.542 ±  17.943  ops/ms
BagEqualsTest.mapEquals       thrpt   20  23109.328 ± 235.381  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  14542.701 ±  66.522  ops/ms

Additionally, I refactored the value based any/all/noneSatisfy methods on the Object/Primitive Maps to use shortCircuitKeyValue. I tried using Object/Primitive Map equals to implement the HashBag equals, but found that was even slower. I updated the equals method slightly to remove some extra index lookups and multiple conversions. This sped the code up by around 10% when I measured it, but it was still slower than the shortCircuitKeyValue solution.

Update: I rolled back the shortCircuitKeyValue implementation and renamed to allSatisfyKeyValue.

donraab commented 2 months ago

@mohrezaei @motlin I'm going to make a bunch of changes and submit this as allSatisfyKeyValue and get rid of the short circuit method. I will rerun the benchmarks for the previous version and the map equals variety and post the results here so it is clearer that this change is worthwhile.

I am holding off on any and none since I loathe having large PRs without specific needs, other than symmetry. These can be added later if someone wants them.

donraab commented 2 months ago

@mohrezaei @motlin These are the benchmarks for comparison.

Original Code

Benchmark                      Mode  Cnt      Score      Error   Units
BagEqualsTest.bagEquals       thrpt   20  19770.401 ±  139.757  ops/ms
BagEqualsTest.mapEquals       thrpt   20  23120.905 ±  246.497  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  12867.562 ± 1087.486  ops/ms

Using AbstractHashBag instanceof and direct items map equals

Benchmark                      Mode  Cnt      Score     Error   Units
BagEqualsTest.bagEquals       thrpt   20  19328.068 ± 158.372  ops/ms
BagEqualsTest.mapEquals       thrpt   20  22618.070 ± 362.793  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  13990.621 ±  29.081  ops/ms

Using allSatisfyKeyValue

Benchmark                      Mode  Cnt      Score     Error   Units
BagEqualsTest.bagEquals       thrpt   20  24719.542 ±  17.943  ops/ms
BagEqualsTest.mapEquals       thrpt   20  23109.328 ± 235.381  ops/ms
BagEqualsTest.multisetEquals  thrpt   20  14542.701 ±  66.522  ops/ms

I'll be submitting an update to the PR in the next hour with the more simplified changes.

Thank you for all the feedback and comments so far!

donraab commented 2 months ago

@mohrezaei @motlin If these changes look good and you both approve, I will test out the same changes to primitive Bags and primitive/primitive Maps. The performance there looks worse than Object Bags, but hard to tell if the benefit will be comparable or better until I try it.

Currently on IntBag:

Benchmark                    Mode  Cnt      Score     Error   Units
IntBagEqualsTest.bagEquals  thrpt   20  16872.887 ±  46.903  ops/ms
IntBagEqualsTest.mapEquals  thrpt   20  15838.773 ± 103.729  ops/ms

The Map performance is probably degraded due to boxing/unboxing effects.

donraab commented 2 months ago

Thanks @mohrezaei ! @motlin I am going to merge this since Moh has approved, and if you want to propose any changes, I can make them in the next PR for the primitive/primitive map allSatisfyKeyValue addition.

I finished the testing/benchmarking the change for primitive Bags and primitive/primitive Maps and the results are surprising.

Before changes:

Benchmark                    Mode  Cnt      Score     Error   Units
IntBagEqualsTest.bagEquals  thrpt   20  16872.887 ±  46.903  ops/ms
IntBagEqualsTest.mapEquals  thrpt   20  15838.773 ± 103.729  ops/ms

After allSatisfyKeyValue:

Benchmark                    Mode  Cnt      Score     Error   Units
IntBagEqualsTest.bagEquals  thrpt   20  51903.988 ± 237.046  ops/ms
IntBagEqualsTest.mapEquals  thrpt   20  15964.766 ± 172.302  ops/ms