protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev
Other
65.11k stars 15.43k forks source link

FieldMask MergeMessage incorrect for oneof fields in Python #3789

Closed samiur closed 6 years ago

samiur commented 6 years ago

When using the FieldMask WKT in python, it doesn't properly merge when subfields are specified for OneOf field members. Example proto:

message A {
    uint32 id = 1;
}

message B {
    uint32 id = 1;
}

message C {
    oneof obj {
        A a = 1;
        B b = 2;
    }
}

If we using the following FieldMask:

fm = FieldMask(paths=['a.id', 'b.id'])
msg = C(a=A(id=1))

new_msg = C()

fm.MergeMessage(msg, new_msg)

instead of new_msg being {'a': {'id': 1}}, it is {'b': {}}

samiur commented 6 years ago

Looking through the code, the problem seems to be in _MergeMessage. Since it iterates through all the fields of a source Message, it ends up "setting" default values for B.id after setting A.id. This in turn makes the oneof field think that b is the correct oneof field as it is set later. A potential fix could be just adding a HasField check into _MergeMessage like the following:

def _MergeMessage(
    node, source, destination, replace_message, replace_repeated):
  """Merge all fields specified by a sub-tree from source to destination."""
  source_descriptor = source.DESCRIPTOR
  for name in node:
    child = node[name]
    field = source_descriptor.fields_by_name[name]
    if field is None:
      raise ValueError('Error: Can\'t find field {0} in message {1}.'.format(
          name, source_descriptor.full_name))
    if child:
      # Sub-paths are only allowed for singular message fields.
      if (field.label == FieldDescriptor.LABEL_REPEATED or
          field.cpp_type != FieldDescriptor.CPPTYPE_MESSAGE):
        raise ValueError('Error: Field {0} in message {1} is not a singular '
                         'message field and cannot have sub-fields.'.format(
                             name, source_descriptor.full_name))
      if source.HasField(name):  # THIS IS THE CHECK I'M PROPOSING
        _MergeMessage(
          child, getattr(source, name), getattr(destination, name),
          replace_message, replace_repeated)
      continue
    if field.label == FieldDescriptor.LABEL_REPEATED:
      if replace_repeated:
        destination.ClearField(_StrConvert(name))
      repeated_source = getattr(source, name)
      repeated_destination = getattr(destination, name)
      if field.cpp_type == FieldDescriptor.CPPTYPE_MESSAGE:
        for item in repeated_source:
          repeated_destination.add().MergeFrom(item)
      else:
        repeated_destination.extend(repeated_source)
    else:
      if field.cpp_type == FieldDescriptor.CPPTYPE_MESSAGE:
        if replace_message:
          destination.ClearField(_StrConvert(name))
        if source.HasField(name):
          getattr(destination, name).MergeFrom(getattr(source, name))
      else:
        setattr(destination, name, getattr(source, name))

I can create a PR for this if you need.

anandolee commented 6 years ago

Yes, you can go ahead create a PR and assign to me. Thanks

anandolee commented 6 years ago

ping @samiur , are you going to provide a PR?

anandolee commented 6 years ago

Fixed in internal code. Expect to see the fix in next release