Feat/performance improvement

Maksimka101 commented 9 months ago

Status

READY

Breaking Changes

NO

Description

I've noticed that the performance of comparing objects with collections can be significantly enhanced. These optimizations in my application led to a considerable increase in FPS, approximately by 3 times. Additionally, I've conducted tests to compare the performance between the old and new approaches

Todos

[x] Tests
[x] Documentation
[x] Examples

Impact to Remaining Code Base

This PR will affect:

comparison logic

felangel commented 9 months ago

Thanks for the PR! Are you able to share the benchmarks you used? I would love to run them locally and validate the results.

Maksimka101 commented 9 months ago

Thanks for the PR! Are you able to share the benchmarks you used? I would love to run them locally and validate the results.

Sure, here is the link: https://github.com/Maksimka101/equatable_benchmark

Maksimka101 commented 9 months ago

Hello

Any news?🙂

Do I need to fix something or provide more information?

felangel commented 9 months ago

Hello

Any news?🙂

Do I need to fix something or provide more information?

Apologies for the delay! I was slow to respond due to the holidays but I should have time to review this in the next day or two. Thanks again!

Maksimka101 commented 9 months ago

Thanks. No need to hurry. I forgot that normal people relax on holidays🙂. Happy new year

Maksimka101 commented 6 months ago

Hello

Can we merge this?

escamoteur commented 4 months ago

Just saw this. besides that I think it would definitely great to get this merged, why not borrow the optimization of fast_equatable and cache the hashcode if that really improves performance that much :-)

felangel commented 4 months ago

Just saw this. besides that I think it would definitely great to get this merged, why not borrow the optimization of fast_equatable and cache the hashcode if that really improves performance that much :-)

Will review this later today apologies that it fell through the cracks. Looks like ci was failing but just re-triggered ci to see the logs.

felangel commented 4 months ago

Looks like some tests are missing. I’ll take a look at the benchmarks and will see if I can get this merged later today. Apologies for the delay!

felangel commented 4 months ago

@Maksimka101 I've added benchmarks and run them against the implementation on master and am seeing the following results:

branch: master

EmptyEquatable
          total runs:  2 076 295   
          total time:     2.0000  s
         average run:          0 μs
         runs/second:   Infinity
               units:        100   
        units/second:   Infinity
       time per unit:     0.0000 μs

PrimitiveEquatable
          total runs:    810 588   
          total time:     2.0000  s
         average run:          2 μs
         runs/second:    500 000   
               units:        100   
        units/second: 50 000 000   
       time per unit:     0.0200 μs

CollectionEquatable (small)
          total runs:    443 978   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

CollectionEquatable (medium)
          total runs:    442 368   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

CollectionEquatable (large)
          total runs:    450 915   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

branch: feat/performance-improvement

EmptyEquatable
          total runs:  2 069 828   
          total time:     2.0000  s
         average run:          0 μs
         runs/second:   Infinity
               units:        100   
        units/second:   Infinity
       time per unit:     0.0000 μs

PrimitiveEquatable
          total runs:    823 014   
          total time:     2.0000  s
         average run:          2 μs
         runs/second:    500 000   
               units:        100   
        units/second: 50 000 000   
       time per unit:     0.0200 μs

CollectionEquatable (small)
          total runs:    490 253   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

CollectionEquatable (medium)
          total runs:    494 469   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

CollectionEquatable (large)
          total runs:    494 548   
          total time:     2.0000  s
         average run:          4 μs
         runs/second:    250 000   
               units:        100   
        units/second: 25 000 000   
       time per unit:     0.0400 μs

MacBook Pro (M1 Pro, 16GB RAM) Dart SDK version: 3.3.4 (stable) (Tue Apr 16 19:56:12 2024 +0000) on "macos_arm64"

I'm not able to reproduce the significant performance increase you describe. Maybe DeepCollectionEquality has been optimized since you last tested?

Let me know if you're still able to reproduce the ~20% performance improvement, thanks!

felangel commented 4 months ago

Just saw this. besides that I think it would definitely great to get this merged, why not borrow the optimization of fast_equatable and cache the hashcode if that really improves performance that much :-)

Because the classes should be immutable (e.g. all fields should be final and have a const constructor). Classes that use package:fast_equatable are not immutable and cannot have a const constructor.

Maksimka101 commented 4 months ago

The thing is, our benchmarks differ. It's hard to notice, but in mine, all the collections are identical, while in yours, almost all of them differ. That's why mine gives such small values – it goes through all the fields, all the elements of the collection. Whereas in your benchmark, differences are found almost immediately, and the method completes quickly

Try to add this benchmark:

_runBenchmark(
  'CollectionEquatable (large) (all equal)',
  (index) => CollectionEquatable(
    list: List.generate(100, (i) => 1024),
    map: Map.fromEntries(
      // ignore: prefer_const_constructors
      List.generate(100, (i) => MapEntry('${1024}', 1024)),
    ),
    set: Set.from(List.generate(100, (i) => 1024)),
  ),
);

It'll give you the following results:

CollectionEquatable (large) (all equal)
          total runs:     20 718   
          total time:     2.0000  s
         average run:         96 μs
         runs/second:     10 417   
               units:        100   
        units/second:  1 041 667   
       time per unit:     0.9600 μs

In my app, one field in a huge store object was changing, and this happened very frequently. My case is not the most common, but it's also not rare. The comparison operation should be fast not only for different values but also for identical ones :)

felangel commented 4 months ago

The thing is, our benchmarks differ. It's hard to notice, but in mine, all the collections are identical, while in yours, almost all of them differ. That's why mine gives such small values – it goes through all the fields, all the elements of the collection. Whereas in your benchmark, differences are found almost immediately, and the method completes quickly

Try to add this benchmark:
_runBenchmark(
  'CollectionEquatable (large) (all equal)',
  (index) => CollectionEquatable(
    list: List.generate(100, (i) => 1024),
    map: Map.fromEntries(
      // ignore: prefer_const_constructors
      List.generate(100, (i) => MapEntry('${1024}', 1024)),
    ),
    set: Set.from(List.generate(100, (i) => 1024)),
  ),
);
It'll give you the following results:
CollectionEquatable (large) (all equal)
          total runs:     20 718   
          total time:     2.0000  s
         average run:         96 μs
         runs/second:     10 417   
               units:        100   
        units/second:  1 041 667   
       time per unit:     0.9600 μs
In my app, one field in a huge store object was changing, and this happened very frequently. My case is not the most common, but it's also not rare. The comparison operation should be fast not only for different values but also for identical ones :)

Thanks for the reply! I updated the benchmarks and am able to reproduce the slowdown in larger static datasets. Will look at it a bit more closely in the next few days. I want to take a closer look at why DeepCollectionEquality is suboptimal and ideally open a PR to improve the performance in package collection so that more packages benefit from it.

Maksimka101 commented 4 months ago

I've researched it a bit and I think the DeepCollectionEqality is slow by design due to is flexible API. It creates a map to compare sets and probably maps. Also, it doesn't use the == operator directly so it has some overhead

But hope I'm wrong :)

felangel / equatable