golang / protobuf

Go support for Google's protocol buffers
BSD 3-Clause "New" or "Revised" License
9.79k stars 1.58k forks source link

encoding/protojson: handling google.protobuf.Empty #1620

Open holycheater opened 5 months ago

holycheater commented 5 months ago

Version: v1.34.1

Hi, I've encountered interop problems with protojson encoding between different languages

The following code produces error in golang:

anyResult := &anypb.Any{}
data := []byte(`{"@type":"type.googleapis.com/google.protobuf.Empty"}`)
err = protojson.Unmarshal(data, anyResult)
fmt.Println(err)

Output:

proto: (line 1:53): missing "value" field

Yet, this snippet in Java:

@Test
void test() throws InvalidProtocolBufferException {
    final var typeRegistry = TypeRegistry.newBuilder()
        .add(Empty.getDescriptor())
        .build();

    final var protoMessageAny = Any.pack(Empty.getDefaultInstance());

    final var jsonString = JsonFormat.printer()
        .preservingProtoFieldNames()
        .includingDefaultValueFields()
        .omittingInsignificantWhitespace()
        .usingTypeRegistry(typeRegistry)
        .print(protoMessageAny);

    System.out.println(jsonString);
}

produces output:

{"@type":"type.googleapis.com/google.protobuf.Empty"}

python, same thing:

from google.protobuf.any_pb2 import Any
from google.protobuf.empty_pb2 import Empty
from google.protobuf.json_format import MessageToJson

a = Any()
a.Pack(Empty())
print(MessageToJson(a))

Output:

{
  "@type": "type.googleapis.com/google.protobuf.Empty"
}

It doesn't seem golang implementation should expect value field

holycheater commented 5 months ago

https://github.com/golang/protobuf/issues/759#issuecomment-593793968 Found this issue, says it was fixed, but the same behaviour reproduces on google.golang.org/protobuf@v1.20.0

puellanivis commented 5 months ago

Verified, v1.20.0 still reports missing "value" field as does the current v1.34.1.

cybrcodr commented 5 months ago

This is the same as https://github.com/golang/protobuf/issues/759. Someone responded with https://github.com/protocolbuffers/protobuf/issues/5390#issuecomment-476424326. I'm uncertain if that is affirmative as there has been no further action/decision and the issue was simply closed as inactive. May want to reopen that issue.

neild commented 5 months ago

I think this is the other side of #759: I don't know if we should marshal an Empty with a value of {} or no value, but we definitely should parse what other languages are producing.

dsnet commented 5 months ago

The documentation for google.protobuf.Any.value says:

[The value field] must be a valid serialized protocol buffer of the above specified type.

Elsewhere, the documentation for google.protobuf.Any says:

If the embedded message type is well-known and has a custom JSON representation, that representation will be embedded adding a field value which holds the custom JSON in addition to the @type field.

Then, the documentation for google.protobuf.Empty says:

The JSON representation for Empty is empty JSON object {}.

Thus, it seems that the current behavior is correct.

That said, the protobuf ecosystem is full of inconsistencies where the implementations do not follow what is documented. Thus, I agree with what @neild says. Pragmatically we should just do whatever the other languages do.

puellanivis commented 5 months ago

I’m on the same page as dsnet. Technically, we’re following the spec as written… but that doesn’t much matter if other implementations are accepting this and we’re not.

lfolger commented 5 months ago

I filed https://github.com/protocolbuffers/protobuf/issues/17099 as a first step to get clarity on this. I'm not even sure if we can change the Go behavior if we wanted to because it would be a backwards incompatible change. However, accepting this might be better than the incosistency with C++ and Java and the inability to parse their output.

holycheater commented 5 months ago

Changing behaviour of golang protojson.Unmarshal can be a viable compromise, otherwise compatibility would break.

neild commented 5 months ago

I don't see an urgent reason to change Marshal's behavior. However, Unmarshal must be able to parse the output of other languages, even if that output is technically wrong. (I have no opinion on whether it is technically wrong or not.)

puellanivis commented 5 months ago

I definitely wasn’t advocating for changing the marshalling output. Only for accepting.

lfolger commented 5 months ago

We still end up with the problem that the output generated by Go cannot be parsed by the other languages. But I agree that extending the unmarshalling makes it more permissive and thus might not be fine (unless someone depends on an error being reported).

dsnet commented 5 months ago

output generated by Go cannot be parsed by the other languages

Perhaps I missed it, but the original post only mentioned protojson.Unmarshal. Is there a report of whats generated by protojson.Marshal not being accepted in another language implementation?

lfolger commented 5 months ago

I only tested it with C++, but C++ doesn't accept: {"@type":"type.googleapis.com/google.protobuf.Empty","value":{}}

It fails with

invalid JSON in   google.protobuf.Any  @ <any>: message google.protobuf.Empty, near 1:62  (offset 61):   no   such  field:   'value'
dsnet commented 5 months ago

Ah, thanks for looking into that.

Given the amount of inconsistency in the ecosytem, I suspect every implementation will need to accept both the case where it is preset with {} and also missing. That said, I think it's up to the protobuf team to declare which form is "more correct" and we should probably output that.

cybrcodr commented 5 months ago

https://github.com/golang/protobuf/issues/759 is the report for V1's marshaling issue. With the response in https://github.com/protocolbuffers/protobuf/issues/5390#issuecomment-476424326, I think we decided to keep the behavior for V2.

I was mistaken in my comment above that this is similar to that marshaling issue. I don't mind having the unmarshaling logic accept what the other languages output.

justinsb commented 1 month ago

I also encountered this issue when using the GCP go SDK (I wonder if this is the most common scenario?). Google LROs use anypb.Any for the result field, and there are multiple APIs service which do not send the value field when the result is emptypb.Empty, triggering this problem. One mitigating factor with the GCP go SDK is that it only applies to protojson, and the GCP go SDK seems to default to true proto encoding now for most services.

However for protojson, the GCP Go SDK does pass both AllowPartial and DiscardUnknown. I would think that AllowPartial=true should mean that a missing value should not be an error. I think what has happened is that we create a new UnmarshalOptions in protojson and do not pass down AllowPartial. I proposed a CL to pass down AllowPartial and not return an error if value is not set. For me, this seems to fix the problem with GCP Go SDK, and (I would argue) is a bug fix rather than a behavioral change - so perhaps easier to justify.

I don't know if we should also pass down DiscardUnknown. I would think yes, from the perspective of obeying the user's intent (and then maybe we should pass down all the options?). But also because I think DiscardUnknown is important - we presumably do not want to fail when the service adds new fields. Because this doesn't happen all the time, I'm guessing that this is only relevant for a new field in anypb.Any, so maybe this isn't very likely, but it still seems like good future-proofing.

The CL is https://go-review.git.corp.google.com/c/protobuf/+/617516

puellanivis commented 1 month ago

My primary concern here would be that if the AllowPartial is permitted in an anypb.Any, wouldn’t this then permit missing values for something more serious like a structpb.Value?

The problem here is wholly only with wrapping the emptypb.Empty type, and we should probably be allowing this missing value field for that type regardless of what AllowPartial is.

justinsb commented 1 month ago

My primary concern here would be that if the AllowPartial is permitted in an anypb.Any, wouldn’t this then permit missing values for something more serious like a structpb.Value?

Yes - AllowPartial might require clients to do extra validation. But this change is adding/fixing support for AllowPartial, i.e. if the user passes AllowPartial, then we honor that. There is also an issue today (I would argue) that we don't honor the user's AllowPartial instruction for this case.

(Aside: is AllowPartial actually a bad idea in the context of proto now recommending all fields be optional? e.g. I don't think structpb.Value has any required fields, though I am by no means the proto expert!)

The problem here is wholly only with wrapping the emptypb.Empty type, and we should probably be allowing this missing value field for that type regardless of what AllowPartial is.

Now that you mention it, I do agree as the value field is not marked as required AFAICT: https://github.com/protocolbuffers/protobuf/blob/64e14b42c4255c5e2759f795ee4bcf1d17b00ae5/src/google/protobuf/any.proto#L161

If we think this change is safe, happy to submit a CL for this or test a CL if that is useful.

(My personal interest here is in being able to handle these "missing value" LROs in the Google Cloud go SDK, beyond that I'm amenable to whatever approach you think is best)

puellanivis commented 1 month ago

we don't honor the user's AllowPartial instruction for this case.

We don’t honor it for good reason, it’s a well-known type, and thus has a canonical representation in JSON:

If the Any contains a value that has a special JSON mapping, it will be converted as follows: {"@type": xxx, "value": yyy}. Otherwise, the value will be converted into a JSON object, and the "@type" field will be inserted to indicate the actual data type.

The problem is that arguably not emitting the {} for the Empty WKT, in the Any is a violation of the standards, and shouldn’t be produced by any of the official protobuf libraries. But it seems others have read the same standard and come to a different decision on this specific situation. That’s why we raised the issue in the upstream mainline protobuf project. Unfortunately, it seems it hasn’t received any attention. 😢

But since other of the Big4 generators are marshalling and unmarshalling it, even when AllowPartial is disabled, and this is the behavior that would need to be replicated. It’s the whole reason we separated off the json.UnmarshalOptions and didn’t copy over the AllowPartial in this case.

PS:

Now that you mention it, I do agree as the value field is not marked as required

This has nothing to do with anything, because it’s covered under the WKTs explicitly on the JSON mapping standard linked to above. It’s required, because the standard for Any requires it.

neild commented 1 month ago

I don't think AllowPartial is the right solution here. AllowPartial disables required field checks. The problem here is not a missing required field, it's a disagreement on what a valid representation of an Any containing an Empty is. If we start making AllowPartial affect the parsing of Any, we're asking people to set an option that universally affects required fields to work around a bug. That's confusing, and will have bad knock-on effects for many years to come.

We should change the Go JSON parser to accept {"@type":"type.googleapis.com/google.protobuf.Empty"} as a valid Any containing an Empty.

Every parser that doesn't accept {"@type":"type.googleapis.com/google.protobuf.Empty","value":{}} as a valid Any containing an Empty should be changed to do so.

The protobuf project maintainers should decide which form is the correct one, and we should eventually change all encoders to produce it. Although that's going to be tricky to do without breaking anything, so we'd probably want to wait some period of time after aligning parsers to handle either form.

We should not change AllowPartial to affect the parsing of Any.

justinsb commented 1 month ago

Fair enough :-)

We should change the Go JSON parser to accept {"@type":"type.googleapis.com/google.protobuf.Empty"} as a valid Any containing an Empty.

I'll send a version that does this.

Edit: https://go-review.googlesource.com/c/protobuf/+/618395

neild commented 1 month ago

https://go.dev/cl/618395 changes the Go JSON unmarshaler to accept the output produced by C++/Java, but C++ (and I think Java) still don't accept the output produced by Go. Fully fixing this issue requires addressing that.

Just changing the Go marshaler isn't going to work, since we'd start producing output that can't be parsed by earlier versions of protojson. Addressing the issue will require coordinating with the C++/Java maintainers (at least). Perhaps that should be a new issue, but I'm going to reopen this one for now so we don't lose track of it.

dsnet commented 1 month ago

This edge case should also be added the conformance test suite to ensure that future implementations agree with each other.

orloffv commented 1 month ago

@justinsb thank you so much for the fix. Can you make a release with this version?

chressie commented 3 weeks ago

@orloffv sorry for the slow reply. We plan to do a release in 1-2 weeks. If you need the fix sooner, i think it's fine to just update directly to fb995f1 for the time being (e.g. go get google.golang.org/protobuf@fb995f1). Hope that helps.

orloffv commented 1 week ago

@chressie Can you make a release with this fix?

chressie commented 1 week ago

@orloffv we just cut a new release https://github.com/protocolbuffers/protobuf-go/releases/tag/v1.35.2, which includes the change.