serverpod / serverpod

Serverpod is a next-generation app and web server, explicitly built for the Flutter and Dart ecosystem.
BSD 3-Clause "New" or "Revised" License
2.53k stars 237 forks source link

`deserialize` needs to be broken out into separate methods for speed #1789

Open lukehutch opened 9 months ago

lukehutch commented 9 months ago

Describe the bug

serialization.dart currently has the following code:

  /// Deserialize the provided json [data] to an object of type [t] or [T].
  T deserialize<T>(dynamic data, [Type? t]) {
    t ??= T;

    //TODO: all the "dart native" types should be listed here
    if (_isNullableType<int>(t)) {
      return data;
    } else if (_isNullableType<double>(t)) {
      return (data as num?)?.toDouble() as T;
    } else if (_isNullableType<String>(t)) {
      return data;
    } else if (_isNullableType<bool>(t)) {
      return data;
    } else if (_isNullableType<DateTime>(t)) {
      if (data is DateTime) return data as T;
      if (data == null) return null as T;
      return DateTime.tryParse(data)?.toUtc() as T;
    } else if (_isNullableType<ByteData>(t)) {
      if (data is Uint8List) {
        var byteData = ByteData.view(
          data.buffer,
          data.offsetInBytes,
          data.lengthInBytes,
        );
        return byteData as T;
      } else {
        return (data as String?)?.base64DecodedByteData() as T;
      }
    } else if (_isNullableType<Duration>(t)) {
      return data == null ? data : Duration(milliseconds: (data as int)) as T;
    } else if (_isNullableType<UuidValue>(t)) {
      return (data == null ? null : UuidValue.fromString(data as String)) as T;
    }
    throw FormatException('No deserialization found for type $t');
  }

All these type comparisons are slow, and doing it once for every field of every JSON object is a performance-killer.

The caller knows the type it expects in most cases. For example, in a compiled model class, you have code like:

  factory Profile.fromJson(
    Map<String, dynamic> jsonSerialization,
    _i1.SerializationManager serializationManager,
  ) {
    return Profile(
      id: serializationManager.deserialize<int?>(jsonSerialization['id']),
      userId:
          serializationManager.deserialize<int?>(jsonSerialization['userId']),
      name:
          serializationManager.deserialize<String?>(jsonSerialization['name']),
      gender: serializationManager
          .deserialize<_i2.Gender?>(jsonSerialization['gender']),
      dob:
          serializationManager.deserialize<DateTime?>(jsonSerialization['dob']),
      accountCreatedDate: serializationManager
          .deserialize<DateTime?>(jsonSerialization['accountCreatedDate']),
      accountStatus: serializationManager
          .deserialize<_i2.AccountStatus?>(jsonSerialization['accountStatus']),
      accountStatusChangedDate: serializationManager.deserialize<DateTime?>(
          jsonSerialization['accountStatusChangedDate']),
      datingIsEnabled: serializationManager
      // ...

Therefore, deserialize should be broken out into separate methods for each supported type (with nullable and non-nullable variants), and the caller can call the method for the type they expect: instead of

      id: serializationManager.deserialize<int?>(jsonSerialization['id'])

the fromJson method should call

      id: serializationManager.deserializeIntNullable(jsonSerialization['id'])
vlidholt commented 8 months ago

Do we know this is actually slow, or is it just a guess?

lukehutch commented 8 months ago

Typechecking is the "slowest part of Dart", yes, especially when the inheritance hierarchy has to be traversed. I haven't benchmarked this, but I would want to benchmark it both ways for a large JSON dataset, and that would require making the changes anyway.

lukehutch commented 8 months ago

Also, the large block of if-statements just screams "there has to be a better way than this" :-D ... and there is, since the caller always knows the type they are expecting.

vlidholt commented 8 months ago

We are definitely building out a some benchmarks, so we can track performance. Profiling is really the way to go though, to pinpoint what actually makes sense to spend time on. :)

lukehutch commented 8 months ago

Totally agree, although my experience with profiling in Dart (or at least in Flutter) has been terrible so far...

lukehutch commented 5 months ago

Another thing that is slow is exception handling, e.g. throwing a FormatException at the end of this method in SerializationManager:

  T deserialize<T>(dynamic data, [Type? t]) {
    t ??= T;

    //TODO: all the "dart native" types should be listed here
    if (_isNullableType<int>(t)) {
      return data;
    } else if (_isNullableType<double>(t)) {
      return (data as num?)?.toDouble() as T;
    } else if (_isNullableType<String>(t)) {
      return data;
    } else if (_isNullableType<bool>(t)) {
      return data;
    } else if (_isNullableType<DateTime>(t)) {
      if (data == null) return null as T;
      return DateTimeJsonExtension.fromJson(data) as T;
    } else if (_isNullableType<ByteData>(t)) {
      if (data == null) return null as T;
      return ByteDataJsonExtension.fromJson(data) as T;
    } else if (_isNullableType<Duration>(t)) {
      if (data == null) return null as T;
      return DurationJsonExtension.fromJson(data) as T;
    } else if (_isNullableType<UuidValue>(t)) {
      if (data == null) return null as T;
      return UuidValueJsonExtension.fromJson(data) as T;
    }
    throw FormatException('No deserialization found for type $t');
  }

Here is a benchmark:

https://gist.github.com/jposert/0cbf824ac625a6563c2f62085eda64e8

throwing an exception is 55x slower than returning a newly instantiated result object.

The need for throwing this exception also goes away with my proposal, of calling a specific deserialization method for each JSON field type.

lukehutch commented 5 months ago

What's also slow is this massive block of typechecks in protocol.dart in the client project. For my project, there are 119 of these, and that number increases daily:

    if (t == _i11.EventFilter) {
      return _i11.EventFilter.fromJson(data) as T;
    }
    if (t == _i12.EventSortOrder) {
      return _i12.EventSortOrder.fromJson(data) as T;
    }
    if (t == _i13.EventAcceptance) {
      return _i13.EventAcceptance.fromJson(data, this) as T;
    }
    if (t == _i14.EventAttendanceChange) {
      return _i14.EventAttendanceChange.fromJson(data, this) as T;
    }
    if (t == _i15.EventCostType) {
      return _i15.EventCostType.fromJson(data) as T;
    }
    if (t == _i16.EventDetails) {
      return _i16.EventDetails.fromJson(data, this) as T;
    }
    if (t == _i17.EventInfo) {
      return _i17.EventInfo.fromJson(data, this) as T;
    }
    if (t == _i18.EventQuery) {
      return _i18.EventQuery.fromJson(data, this) as T;
    }
    if (t == _i19.EventSignupType) {
      return _i19.EventSignupType.fromJson(data) as T;
    }
    if (t == _i20.EventThumbInfo) {
      return _i20.EventThumbInfo.fromJson(data, this) as T;
    }
    if (t == _i21.EventType) {
      return _i21.EventType.fromJson(data) as T;
    }
// ...

since t has to be passed in as the correct type parameter, the caller already knows what type they are expecting to deserialize, so again, there should be one method per deserializable type, for speed.