golang / protobuf

Go support for Google's protocol buffers
BSD 3-Clause "New" or "Revised" License
9.79k stars 1.58k forks source link

Why does deserialize []byte("xxxx") into a proto structure work? #1641

Closed hedon954 closed 3 months ago

hedon954 commented 3 months ago

What version of protobuf and what language are you using?

What did you do?

I've implemented a function to unmarshal byte slices into protobuf structures as shown below:

func FromProto[T any](bs []byte) (*T, error) {
    if len(bs) == 0 {
        return nil, errors.New("protobuf data is empty")
    }
    var t T
    msg, ok := any(&t).(proto.Message)
    if !ok {
        return nil, fmt.Errorf("type %T does not implement proto.Message", t)
    }
    err := proto.Unmarshal(bs, msg)
    if err != nil {
        return nil, err
    }
    return &t, nil
}

To verify the behavior of this function, I wrote some unit tests. Here are two cases, one of which behaves unexpectedly:

func TestFromProto(t *testing.T) {
        // This test behaves as expected, leading to an error
    t.Run("invalid proto should fail", func(t *testing.T) {
        data, err := FromProto[fixtures.Request]([]byte("hello"))
        assert.NotNil(t, err)
        assert.Nil(t, data)
    })

        // This test does not behave as expected; no error is returned
    t.Run("invalid proto should fail", func(t *testing.T) {
        data, err := FromProto[fixtures.Request]([]byte("xxxx"))
        assert.Nil(t, err)
        assert.NotNil(t, data)
    })
}

Here's the definition of fixtures.Request used in the tests:

syntax = "proto3";

option go_package = "fixtures";

message Request {
  string name = 1;
  int32 age = 2;
  string address = 3;
  bool married = 4;
}

What did you expect to see?

I expected that unmarshaling []byte("xxxx") into a protobuf structure would fail in a similar manner to unmarshaling []byte("hello"), given that both inputs are not valid protobuf serialized forms of Request.

What did you see instead?

Unexpectedly, the unmarshaling of []byte("xxxx") did not result in an error. Instead, the function returned a structure with unknownFields. This behavior is inconsistent and can lead to confusion among developers.

puellanivis commented 3 months ago

The string "xxxx" is not invalid protobuf, since x indicates field 15 | type varint and x as a value indicates 120.

One can get surprisingly long specifically crafted strings that are surprisingly valid Protobuf one I crafted before was: "*myFooIsStrangeAndWeirdButNotQuiteBroken...".

PS: if this wasn’t clear, this is “working as intended”, since it is technically validly encoded protobuf.

hedon954 commented 3 months ago

The string "xxxx" is not invalid protobuf, since x indicates field 15 | type varint and x as a value indicates 120.

One can get surprisingly long specifically crafted strings that are surprisingly valid Protobuf one I crafted before was: "*myFooIsStrangeAndWeirdButNotQuiteBroken...".

PS: if this wasn’t clear, this is “working as intended”, since it is technically validly encoded protobuf.

Interesting! Thanks for your reply! 😄