golang / protobuf

Go support for Google's protocol buffers
BSD 3-Clause "New" or "Revised" License

encoding/protojson: handling google.protobuf.Empty #1620

Open holycheater opened 3 weeks ago

holycheater commented 3 weeks ago

Version: v1.34.1

Hi, I've encountered interop problems with protojson encoding between different languages

The following code produces an error in Go:

anyResult := &anypb.Any{}
data := []byte(`{"@type":"type.googleapis.com/google.protobuf.Empty"}`)
err := protojson.Unmarshal(data, anyResult)
fmt.Println(err)

Output:

proto: (line 1:53): missing "value" field

Yet, this snippet in Java:

@Test
void test() throws InvalidProtocolBufferException {
    final var typeRegistry = TypeRegistry.newBuilder()
        .add(Empty.getDescriptor())
        .build();

    final var protoMessageAny = Any.pack(Empty.getDefaultInstance());

    final var jsonString = JsonFormat.printer()
        .preservingProtoFieldNames()
        .includingDefaultValueFields()
        .omittingInsignificantWhitespace()
        .usingTypeRegistry(typeRegistry)
        .print(protoMessageAny);

    System.out.println(jsonString);
}

produces output:

{"@type":"type.googleapis.com/google.protobuf.Empty"}

Python, same thing:

from google.protobuf.any_pb2 import Any
from google.protobuf.empty_pb2 import Empty
from google.protobuf.json_format import MessageToJson

a = Any()
a.Pack(Empty())
print(MessageToJson(a))

Output:

{
  "@type": "type.googleapis.com/google.protobuf.Empty"
}

It doesn't seem that the Go implementation should require a value field.

holycheater commented 3 weeks ago

I found this issue: https://github.com/golang/protobuf/issues/759#issuecomment-593793968. It says this was fixed, but the same behaviour reproduces on google.golang.org/protobuf@v1.20.0.

puellanivis commented 3 weeks ago

Verified, v1.20.0 still reports missing "value" field as does the current v1.34.1.

cybrcodr commented 3 weeks ago

This is the same as https://github.com/golang/protobuf/issues/759. Someone responded with https://github.com/protocolbuffers/protobuf/issues/5390#issuecomment-476424326. I'm uncertain if that is affirmative as there has been no further action/decision and the issue was simply closed as inactive. May want to reopen that issue.

neild commented 3 weeks ago

I think this is the other side of #759: I don't know if we should marshal an Empty with a value of {} or no value, but we definitely should parse what other languages are producing.

dsnet commented 3 weeks ago

The documentation for google.protobuf.Any.value says:

[The value field] must be a valid serialized protocol buffer of the above specified type.

Elsewhere, the documentation for google.protobuf.Any says:

If the embedded message type is well-known and has a custom JSON representation, that representation will be embedded adding a field value which holds the custom JSON in addition to the @type field.

Then, the documentation for google.protobuf.Empty says:

The JSON representation for Empty is empty JSON object {}.

Thus, it seems that the current behavior is correct.

That said, the protobuf ecosystem is full of inconsistencies where the implementations do not follow what is documented. Thus, I agree with what @neild says. Pragmatically we should just do whatever the other languages do.

puellanivis commented 3 weeks ago

I’m on the same page as dsnet. Technically, we’re following the spec as written… but that doesn’t much matter if other implementations are accepting this and we’re not.

lfolger commented 3 weeks ago

I filed https://github.com/protocolbuffers/protobuf/issues/17099 as a first step to get clarity on this. I'm not even sure if we can change the Go behavior if we wanted to because it would be a backwards incompatible change. However, accepting this might be better than the inconsistency with C++ and Java and the inability to parse their output.

holycheater commented 3 weeks ago

Changing the behaviour of Go's protojson.Unmarshal can be a viable compromise; otherwise compatibility would break.

neild commented 3 weeks ago

I don't see an urgent reason to change Marshal's behavior. However, Unmarshal must be able to parse the output of other languages, even if that output is technically wrong. (I have no opinion on whether it is technically wrong or not.)
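
Until Unmarshal accepts that form, a caller could normalize the input before handing it to protojson.Unmarshal. The following is only a minimal sketch of such a workaround, using nothing but encoding/json. The helper name addEmptyValue is hypothetical and not part of protojson, and it only looks at the top-level object, so an Any nested inside another message would not be handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emptyTypeURL is the type URL other implementations emit for a packed Empty.
const emptyTypeURL = "type.googleapis.com/google.protobuf.Empty"

// addEmptyValue is a hypothetical helper, not part of protojson. If data is a
// JSON object whose "@type" is emptyTypeURL and which has no "value" member,
// it returns the object with "value":{} added. Any other JSON object is
// returned unchanged; input that is not a JSON object yields an error.
func addEmptyValue(data []byte) ([]byte, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	var typeURL string
	if raw, ok := obj["@type"]; !ok || json.Unmarshal(raw, &typeURL) != nil || typeURL != emptyTypeURL {
		return data, nil
	}
	if _, ok := obj["value"]; ok {
		return data, nil
	}
	obj["value"] = json.RawMessage(`{}`)
	return json.Marshal(obj)
}

func main() {
	in := []byte(`{"@type":"type.googleapis.com/google.protobuf.Empty"}`)
	out, err := addEmptyValue(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The normalized bytes could then be passed to protojson.Unmarshal.
	fmt.Println(string(out))
}
```

This keeps Marshal's output untouched and only widens what the caller feeds into Unmarshal, which matches the direction suggested above.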

puellanivis commented 3 weeks ago

I definitely wasn’t advocating for changing the marshalling output. Only for accepting.

lfolger commented 3 weeks ago

We still end up with the problem that the output generated by Go cannot be parsed by the other languages. But I agree that extending the unmarshalling only makes it more permissive and thus might be fine (unless someone depends on an error being reported).

dsnet commented 3 weeks ago

output generated by Go cannot be parsed by the other languages

Perhaps I missed it, but the original post only mentioned protojson.Unmarshal. Is there a report of what's generated by protojson.Marshal not being accepted in another language implementation?

lfolger commented 3 weeks ago

I only tested it with C++, but C++ doesn't accept: {"@type":"type.googleapis.com/google.protobuf.Empty","value":{}}

It fails with

invalid JSON in google.protobuf.Any @ <any>: message google.protobuf.Empty, near 1:62 (offset 61): no such field: 'value'

dsnet commented 3 weeks ago

Ah, thanks for looking into that.

Given the amount of inconsistency in the ecosystem, I suspect every implementation will need to accept both the case where the value field is present with {} and the case where it is missing. That said, I think it's up to the protobuf team to declare which form is "more correct" and we should probably output that.
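
Until that happens, a producer that has to feed Go's output to a parser rejecting the value field (as reported for C++ above) could strip it on the way out. This is a rough sketch under that assumption, using only encoding/json. The helper name dropEmptyValue is hypothetical, it only inspects the top-level object, and it only removes a value that is exactly an empty JSON object.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// emptyTypeURL is the type URL used for a packed google.protobuf.Empty.
const emptyTypeURL = "type.googleapis.com/google.protobuf.Empty"

// dropEmptyValue is a hypothetical helper, not part of protojson. If data is a
// JSON object whose "@type" is emptyTypeURL and whose "value" member is an
// empty object, it returns the object without "value". Any other JSON object
// is returned unchanged; input that is not a JSON object yields an error.
func dropEmptyValue(data []byte) ([]byte, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	var typeURL string
	if raw, ok := obj["@type"]; !ok || json.Unmarshal(raw, &typeURL) != nil || typeURL != emptyTypeURL {
		return data, nil
	}
	raw, ok := obj["value"]
	if !ok {
		return data, nil
	}
	// Only drop the member when it decodes to an object with no members.
	var v map[string]json.RawMessage
	if json.Unmarshal(raw, &v) != nil || v == nil || len(v) != 0 {
		return data, nil
	}
	delete(obj, "value")
	return json.Marshal(obj)
}

func main() {
	in := []byte(`{"@type":"type.googleapis.com/google.protobuf.Empty","value":{}}`)
	out, err := dropEmptyValue(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```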

cybrcodr commented 3 weeks ago

https://github.com/golang/protobuf/issues/759 is the report for V1's marshaling issue. With the response in https://github.com/protocolbuffers/protobuf/issues/5390#issuecomment-476424326, I think we decided to keep the behavior for V2.

I was mistaken in my comment above that this is similar to that marshaling issue. I don't mind having the unmarshaling logic accept what the other languages output.