google / protobuf.dart

Runtime library for Dart protobufs
https://pub.dev/packages/protobuf
BSD 3-Clause "New" or "Revised" License
533 stars 185 forks source link

Merging maps do not merge map values #596

Open osa1 opened 2 years ago

osa1 commented 2 years ago

When merging an encoded message, for a repeated field, we extend the existing list with the new elements in the new message: https://github.com/google/protobuf.dart/blob/9da84aef9d8126ac3da665f3c163e3cb4af31b6d/protobuf/lib/src/protobuf/proto3_json.dart#L382-L388

However, when merging a map field, if both old and new messages have the same key, we overwrite the old value with the new value, instead of merging old and new values of the key: https://github.com/google/protobuf.dart/blob/9da84aef9d8126ac3da665f3c163e3cb4af31b6d/protobuf/lib/src/protobuf/proto3_json.dart#L369-L371

So if I have a: [1] in the old message and a: [2] in the new, after merging with _mergeFromProto3Json, the final message field will be a: [2] instead of a: [1, 2].

As far as I understand "merging" is not part of the proto3 spec, but from a user's point of view this is inconsistent.

Note: same bug exists in all encodings (custom JSON, proto3 JSON, binary).

osa1 commented 2 years ago

It turns out map values cannot be repeated, but I managed to reproduce the issue using message values. Proto file:

syntax = "proto3";

message A {
  map<int32, Ints> a = 1;
}

message Ints {
    repeated int32 ints = 1;
}

Test:

import '../test.pb.dart';

void main() {
  // Merging `Ints` messages appends "repeated" fields:
  {
    var ints1 = Ints.create();
    ints1.ints.add(1);
    ints1.ints.add(2);

    var ints2 = Ints.create();
    ints2.ints.add(3);
    ints2.ints.add(4);

    ints1.mergeFromProto3Json(ints2.toProto3Json());
    print(ints1); // prints 1, 2, 3, 4
  }

  // When `Ints` used as a map value, merging overwrites the old `Ints` with
  // the new one, instead of merging `Ints` messages:
  var msg1 = A.create();

  {
    var ints_msg = Ints.create();
    ints_msg.ints.add(1);
    ints_msg.ints.add(2);
    msg1.a[1] = ints_msg;
  }

  var msg2 = A.create();

  {
    var ints_msg = Ints.create();
    ints_msg.ints.add(3);
    ints_msg.ints.add(4);
    msg2.a[1] = ints_msg;
  }

  msg1.mergeFromProto3Json(msg2.toProto3Json());
  print(msg1); // prints 3, 4
}
osa1 commented 2 years ago

I think this may be working as intended.

In the older versions of this library, map fields were implemented as a repeated fields of some entry type. For example, if I had:

message Msg {
    map<int32, int32> my_map = 1;
}

The field would be represented as List<MsgMyMapEntry> where MsgMyMapEntry is message MsgMyMapEntry { int32 key = 1; int32 value = 2; }.

In that implementation, merging two map fields would be list append (because that's how repeated fields are merged), and when the resulting field is used as a map this merge implementation effectively overrides the old value of a key with the new value of the key in the second map, or the other way around depending on how lookup is implemented.

The lookup function would have to be implemented by users, so users would have to decide whether to override the old value (i.e. start scanning from the end) or ignore the new value (i.e. start scanning from the beginning).

The current version uses a map, and by overriding the old value when merging it effectively implements the behavior in the old code when the user implements lookup as scan from the end.

protobuf spec does not describe merging, so we have to decide how to implement this. As a user, I would probably expect the values to be merged. So we have three options:

  1. Ignore duplicate keys when merging (do not change existing key values)
  2. Override old values when merging (current implementation)
  3. Merge values (what this issue originally suggested)

As far as I understand all of these are valid in terms of spec compliance.