protocolbuffers / protobuf

Protocol Buffers - Google's data interchange format
http://protobuf.dev
Other
65.21k stars 15.44k forks source link

proto3 JSON spec breaks backwards compatibility expectations #5432

Closed texuf closed 5 years ago

texuf commented 5 years ago

I'm sorry if this was already discussed. I'm filing as a bug and not a feature request because I believe this behavior breaks sane expectations.

My issue is with the json options specification here: https://developers.google.com/protocol-buffers/docs/proto3#json_options

Ignore unknown fields: Proto3 JSON parser should reject unknown fields by default but may provide an option to ignore unknown fields in parsing.

With this specification, updating any protobuf message will break old clients if they use the default json implementation.

What version of protobuf and what language are you using? Version: master/v3.6.0/v3.5.0 etc. Language: Swift

What operating system (Linux, Windows, ...) and version? iOS

What runtime / compiler are you using (e.g., python version or gcc version) N/A

What did you do? Steps to reproduce the behavior:

  1. Create any "Message" protobuf

    message CartItem {
    string id = 0;
    string name = 1;
    }
  2. Write code parse the message from json using default options. Ship this code to millions of clients

       guard let protobuf = try? CartItem(jsonString: json) else {
            return nil
        }
  3. Update the message

    message CartItem {
    string id = 0;
    string name = 1;
    string description = 2;
    string analytics_id = 3;
    }
  4. Millions of clients fail to parse cart items with the error

    ▿ JSONDecodingError
    - unknownField : "description"
  5. Patch clients to use

        var options = JSONDecodingOptions()
        options.ignoreUnknownFields = true
        guard let protobuf = try? CartItem(jsonString: json, options: options) else {
            return nil
        }
  6. New programmer writes new code that uses the default implementation, and the cycle repeats itself forever.

What did you expect to see I expected that the default implementation wouldn't fail when I updated my Message, and backwards compatibility would be preserved.

What did you see instead? Not what I expected.

thomasvl commented 5 years ago

It should be noted that dropping unknown fields means if the message received is sent back out, it has fundamentally changed because those unknown fields are gone. So enabling the option is dangerous.

The binary format does not have these sorta issues because the binary data can be easily preserved and reflected out.

I believe the intent with JSON was to better deal with these sorta issues, FieldMasks should be used to require the specific data clients want, and also during mutations, include a field mask to say what fields you are specifically changing. That way the fact that a client/server doesn't know about a field is less of an issue, because it won't be include in the masks it requests/sends, letting the other end know that it is not explicitly removing the field.

texuf commented 5 years ago

That’s a really good point, and a use case that I wasn’t thinking about.

We mostly use binary representation, so mostly this isn’t an issue, but we do use Json for things like embedding data into apns notifications.

I still think that in my use case, where the server is generating protobufs, and many clients of many different versions are parsing them, this feels very broken.

TeBoring commented 5 years ago

FieldMask seems the right answer to that. Supporting keeping unknown in json is impossible for us.

texuf commented 5 years ago

I wasn't suggesting keeping unknown in json, just not throwing an exception if an unknown field exists. Thanks for listening, I will file this under "things you just have to know" when working with protobufs.

TeBoring commented 5 years ago

In several languages, we do allow giving "ignore_unknown" option to json decoding method.

TeBoring commented 5 years ago

https://developers.google.com/protocol-buffers/docs/proto3#json Search "Ignore unknown fields"

texuf commented 5 years ago

Yup! I've added that to every place where we're not using the binary representation, and will hopefully remember to do so in the future. Thank you.

On Mon, Dec 17, 2018 at 4:26 PM Paul Yang notifications@github.com wrote:

In several languages, we do allow giving "ignore_unknown" option to json decoding method.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/protocolbuffers/protobuf/issues/5432#issuecomment-448050331, or mute the thread https://github.com/notifications/unsubscribe-auth/AA6B2UugB4Ad9MO0hLY1WqWXkd9cZ-dxks5u6DZNgaJpZM4ZHOC8 .