aip-dev / google.aip.dev

API Improvement Proposals. https://aip.dev/
Other
1.07k stars 487 forks source link

Clarify Expectations for UpdateRequests with Optional Field Masks #1013

Open Jseph opened 1 year ago

Jseph commented 1 year ago

There are 2 issues here:

  1. The current guidance leaves some room for misunderstanding how to generate implied field masks
  2. Inconsistency for presence checks between API implementations makes this hard to standardize

Ambiguity in current guidance

There's multiple ways the statement If optional, the service *must* treat an omitted field mask as an implied field mask equivalent to all fields that are populated can be interpreted. For example with the message

message Resource {
  message NestedMessage {
    int a = 1;
    int b = 2;
  }
  NestedMessage nested_message = 1;
}

Let's say we are storing:

Resource {
  nested_message = {
    a = 10;
    b = 10;
  }
}

If someone sent

UpdateResourceRequest {
  nested_message = {
    a = 1337;
  }
}

It would be reasonable to interpret the guidance to mean the field mask should contain either nested_message (because this field is set) or nested_message.a. For backwards compatibility it's important that the latter is used. It would be good to clarify to remove the possibility of confusion here.

Problems with implementing and using APIs with optional field masks

The concept of set to default vs unset fields is nebulous. It depends on proto2 vs proto3 as well as the request serialization libraries. This means it's hard to tell if something like

UpdateResourceRequest {
  nested_message = {
    a = 0;
  }
}

should change the value or be a no-op. For this reason I think we should make a recommendation to make the field mask required rather than remaining neutral on the topic.

seaneganx commented 1 year ago

Searched for any relevant issues and found #673.

It would be reasonable to interpret the guidance to mean the field mask should contain either nested_message (because this field is set) or nested_message.a. For backwards compatibility it's important that the latter is used. It would be good to clarify to remove the possibility of confusion here.

Proposal: If optional, the service *must* treat an omitted field mask as an implied field mask equivalent to all leaf fields that are populated (or instead of introducing the term "leaf" we can just add a sentence explaining that).

For this reason I think we should make a recommendation to make the field mask required rather than remaining neutral on the topic.

I don't think it needs to be required, but I also agree that the AIPs shouldn't be as neutral about it as they are. I see two good options available in between neutral and strict guidance.

I don't see any significant downside to the warning. It seems like valuable information that's often overlooked by API designers. Whether to say update masks should be required is a more significant change and isn't necessarily needed.


What about the field mask proto?

It suggests that sub-messages should be "merged" even if the field mask directly targets the sub-message. This "merge" seems equivalent to what the AIPs refer to as an "implied field mask" - only updating the fields that aren't set to the default. IMO this is good.

// If a sub-message is specified in the last position of the field mask for an
// update operation, then new value will be merged into the existing sub-message
// in the target resource.
//
// For example, given the target message:
//
//     f {
//       b {
//         d: 1
//         x: 2
//       }
//       c: [1]
//     }
//
// And an update message:
//
//     f {
//       b {
//         d: 10
//       }
//       c: [2]
//     }
//
// then if the field mask is:
//
//  paths: ["f.b", "f.c"]
//
// then the result will be:
//
//     f {
//       b {
//         d: 10
//         x: 2
//       }
//       c: [1, 2]
//     }

It also suggests that fields can't be reset to their default value without directly specifying them in the field mask or performing a full replacement. IMO this is good.

// In order to reset a field's value to the default, the field must
// be in the mask and set to the default value in the provided resource.
// Hence, in order to reset all fields of a resource, provide a default
// instance of the resource and set all fields in the mask, or do
// not provide a mask as described below.

Finally it contradicts AIP-134's guidance about the behavior of an unspecified update mask by saying that will cause a full replacement, and notes the consequences for backward compatibility. IMO this is bad: defaulting to the "merge" behavior would avoid the problem.

// If a field mask is not present on update, the operation applies to
// all fields (as if a field mask of all fields has been specified).
// Note that in the presence of schema evolution, this may mean that
// fields the client does not know and has therefore not filled into
// the request will be reset to their default. If this is unwanted
// behavior, a specific service may require a client to always specify
// a field mask, producing an error if not.

Considering all the information we've gathered in this issue so far, my stance is that "implied field masks" or "merging" is perfectly fine for APIs to support, and should be allowed by the AIPs.

I think it would be valuable for the AIPs to have a note about the consequences of full-replacement, including full replacement of sub-messages.

I think it's especially unintuitive and worth noting that sub-messages should be merged, not fully replaced, when the field mask includes the sub-message path.

I think the field mask proto should be updated to align with the AIP guidance, if possible.