rrousselGit / freezed

Code generation for immutable classes that has a simple syntax/API without compromising on the features.
https://pub.dev/packages/freezed
1.93k stars 237 forks source link

Optimize collection ==/hashCode for when they do not need a "DeepCollectionEquality" #626

Open fzyzcjy opened 2 years ago

fzyzcjy commented 2 years ago

Hi thanks for the lib! However, I am using it for large arrays(List<int>, or more specifically, Uint8List), and the following:

  const factory SomeClass.bytes(List<int> bytes) = SomeClassModeBytes;

generates:

  @override
  bool operator ==(dynamic other) {
    return identical(this, other) ||
        (other.runtimeType == runtimeType &&
            other is SomeClassModeBytes &&
            const DeepCollectionEquality().equals(other.bytes, bytes));
  }

After digging into DeepCollectionEquality, it delegates to ListEquality, and the core code for the latter is:

    for (var i = 0; i < length; i++) {
      if (!_elementEquality.equals(list1[i], list2[i])) return false;
    }

As we all know, this is terribly slow. There are tons of method calls there, and the types are dynamic so it cannot know it is a int in advance so has to do tons of work instead of a simple int equality check.

Can we customize the generated operator==? Or, can freezed be improved, such that it uses listEquals https://api.flutter.dev/flutter/foundation/listEquals.html instead? (That method is indeed about 10 lines of code, so it is even easy to copy-and-paste into freezed's library)

rrousselGit commented 2 years ago

It should be possible to optimize the generated code based on the list type, yes.

We still need to use DeepCollectionEquality for List<Object> & other cases like List<Map>, but List<int> could definitely use a tweaked implementation

In particular, ListEquality probably should be enough right?
listEquals won't do since it's flutter specific, when Freezed has no dependency on it.

fzyzcjy commented 2 years ago

In particular, ListEquality probably should be enough right?

I am afraid no. Looking at its implementation, it calls _elementEquality.equals(list1[i],list2[i]) which accept dynamic instead of int. I am not sure whether dart is strong enough to optimize this (given elementEquality is a variable)

listEquals won't do since it's flutter specific, when Freezed has no dependency on it.

Maybe just copy-and-paste its implementation (~10 lines of code) to package freezed

fzyzcjy commented 2 years ago

Here is the compiler explorer:

Example 1: only list ints (not realistic - since other code may use list of other types) https://godbolt.org/z/Tv94rGjq3

Example 2: both list ints and list dynamics https://godbolt.org/z/5MqxbsETd

(If you want to show them side by side: https://godbolt.org/z/vnMjGxaPh)

rrousselGit commented 2 years ago

I am afraid no. Looking at its implementation, it calls _elementEquality.equals(list1[i],list2[i]) which accept dynamic instead of int. I am not sure whether dart is strong enough to optimize this (given elementEquality is a variable)

But why would listEquals be any different? They have basically the same source code.

edit: don't mind me, I got confused because apparently the code the docs shows is different from the one on package:collection's master branch (https://api.flutter.dev/flutter/package-collection_collection/ListEquality/equals.html vs https://github.com/dart-lang/collection/blob/2bbb27bff8c4e48c27160160c3a2fdb9070ae303/lib/src/equality.dart#L171)

fzyzcjy commented 2 years ago

Basically, I am worried whether Dart compiler is smart enough.

  1. _elementEquality is a field of ListEquality, so I guess Dart will not inline _elementEquality.equals(a,b) into the call site. Then, inside the loop, a function call will happen which is expensive.
  2. The (default) implementation of _elementEquality is DefaultEquality, and its source code is bool equals(Object? e1, Object? e2) => e1 == e2;. Notice the type is Object? instead of the real generic type (int indeed). As we know, comparing two objects must be slower than comparing two integers - the latter should only take one CPU instruction imho.

Anyway, if the Dart compiler is very smart there may be overcomeable, but I am not sure and may not assume that without evidence.


P.S. The related code

``` bool listEquals(List? a, List? b) { if (a == null) return b == null; if (b == null || a.length != b.length) return false; if (identical(a, b)) return true; for (int index = 0; index < a.length; index += 1) { if (a[index] != b[index]) return false; } return true; } ``` and ``` abstract class Equality { const factory Equality() = DefaultEquality; bool equals(E e1, E e2); } class DefaultEquality implements Equality { const DefaultEquality(); @override bool equals(Object? e1, Object? e2) => e1 == e2; } class ListEquality implements Equality> { final Equality _elementEquality; const ListEquality( [Equality elementEquality = const DefaultEquality()]) : _elementEquality = elementEquality; @override bool equals(List? list1, List? list2) { if (identical(list1, list2)) return true; if (list1 == null || list2 == null) return false; var length = list1.length; if (length != list2.length) return false; for (var i = 0; i < length; i++) { if (!_elementEquality.equals(list1[i], list2[i])) return false; } return true; } } ```
rrousselGit commented 2 years ago

Do you maybe want to raise a PR for this? I have other things to do, but this would be a valuable change. I'd be glad to review a PR

fzyzcjy commented 2 years ago

I will do it when having time. Could you please provide some hints on where to change?

rrousselGit commented 2 years ago

It's within concrete_template.dart -> _operatorEqualMethod

knaeckeKami commented 1 year ago

To add some more context to this, it might really be a worthwhile optimization to try to avoid DeepCollectionEquality when possible.

It is ~5x slower than ListEquality() in simple cases, and has quadratic complexity when the collection is actually nested ( O(n^2) runtime, where n is the level of nesting in the collection).

So, a JSON Map<String, dynamic> with nesting level of 3 in a freezed object already takes ~60x longer to compare using == than an optimized version.

See this tracking issue: https://github.com/dart-lang/core/issues/645

and benchmark that I wrote: https://github.com/knaeckeKami/json_equals

And, as Jake Wharton puts it,

A code generator is only written once but the code it generates occurs many times. Thus, any investment into making the generator emit more efficient code will pay for itself very quickly. This generally means output less code and allocate fewer objects wherever possible. I’d like to expand on that with two specific, real-world examples which I’ve run into.

in https://jakewharton.com/the-economics-of-generated-code/