golang / protobuf

Go support for Google's protocol buffers
BSD 3-Clause "New" or "Revised" License
9.77k stars 1.58k forks source link

add proto.IsWellFormed helper function to validate that a message value is well-formed #1634

Open jhump opened 3 months ago

jhump commented 3 months ago

By "well-formed", I mean the message contains none of the following "malformed" constructs. I use the them "malformed" because they are not valid Protobuf constructs and can't be encoded in the binary format: serializing and deserializing the message results in a new message where proto.Equal returns false. All of the "malformed" constructs have to do with typed nil messages, so they are also nil-dereference-panic risks in user code.

  1. A slice (i.e. repeated field) that contains a typed nil message.
  2. A map that contains a typed nil message value.
  3. A oneof wrapper struct that contains a typed nil message value.
  4. A oneof wrapper struct that is itself a typed nil pointer. (This is related to #1415.)

(The lattermost one does not actually impact round-trip equality: a typed nil oneof wrapper struct is treated by reflection and serialization the same as a nil interface value. However, it still is a nil-dereference-panic risk in user code.)

Although all of these cases are handled by the runtime library and (I think) their expected behavior is documented, the behavior can be surprising since it isn't necessarily intuitive (usually for users new to Protobuf that don't understand how the wire format often prevents "intuitive" behavior). I suspect that such cases in practice are usually inadvertent and a symptom of an underlying defecct in the code that constructed the message. To that end, it would be useful if a validation layer could easily detect such malformed messages.

This could potentially be provided by a third party package *, but it would be much less discoverable and thus less likely to be used. Having it in the proto package would make it more useful in that regard. Note that the fourth construct above cannot actually be detected by a 3rd party library, at least not cleanly, because reflection provides no way to distinguish between a struct field that corresponds to a oneof being set to a nil interface value vs. a typed nil pointer (also pointed out in this comment).

neild commented 3 months ago

A repeated field, map value, or oneof field containing a nil message is equal to one containing a non-nil, empty one, so the behavior for cases 1, 2, and 3 is not as described.

A nil message pointer is, in general, exactly equivalent to a non-nil pointer to an empty message, except for being read-only. proto.Equal(a, b) reports false if a==nil != b==nil, for historical reasons, but that only applies to the top-level values, not anything contained within them.